nanog mailing list archives

Re: modeling residential subscriber bandwidth demand


From: James Bensley <jwbensley () gmail com>
Date: Thu, 4 Apr 2019 09:40:41 +0100

On Tue, 2 Apr 2019 at 17:57, Tom Ammon <thomasammon () gmail com> wrote:

How do people model and try to project residential subscriber bandwidth demands into the future? Do you base it 
primarily on historical data? Are there more sophisticated approaches that you use to figure out how much backbone 
bandwidth you need to build to keep your eyeballs happy?

Netflow for historical data is great, but I guess what I am really asking is - how do you anticipate the load that 
your eyeballs are going to bring to your network, especially in the face of transport tweaks such as QUIC and TCP BBR?

Tom

Hi Tom,

Historical data is definitely the way to predict a trend; you can't
call something a trend if it only started today, IMO. Something (e.g.
bandwidth profiling) needs to have been recorded for a while before
you can say that you are trying to predict a trend. Without
historical data you're just making predictions without any direction,
which I don't think you want :)

Assuming you have a good mixture of subs (adults and children, male
and female, different regions, etc.) and 100% of your subs aren't a
single demographic (like a university campus, for example), then I
don't think you need to worry about specifics like the adoption of
QUIC or BBR. You will never see a permanent AND massive increase in
your total aggregate network utilisation from one day to the next.

If, for example, a large CDN makes a change that increases per-user
bandwidth requirements, it's unlikely they are going to deploy it
globally in one single big-bang change. That CDN would also be just
one of your major bandwidth sources/destinations, of which you'll
likely have several big hitters that make up the bulk of your
traffic. If you have planned well so far, have plenty of spare
capacity (as others have mentioned, peak utilisation in the 50-70%
range), and your backhaul/peering/transit links are of a reasonable
size ratio to your subs (e.g. subs get 10-20Mbps services and your
links are 1Gbps), there should be no persistent risk to your network
capacity as long as you keep following the same upgrade trajectory.
Major social events, like the Super Bowl where you are (or, here in
England, sunshine), will cause exceptional traffic increases, but
only for brief periods.
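
As a rough illustration of that headroom check (all figures are the
illustrative ones above, not a recommendation), a few lines of Python:

    # Headroom sanity check: peak utilisation inside the 50-70% band,
    # and link capacity a sensible multiple of per-sub access speed.
    link_capacity_mbps = 1000.0   # the 1Gbps link from the example
    peak_mbps = 620.0             # hypothetical measured peak
    sub_speed_mbps = 20.0         # top of the 10-20Mbps service range

    utilisation = peak_mbps / link_capacity_mbps
    print(f"Peak utilisation: {utilisation:.0%}")   # 62%, inside the band
    print(f"Link-to-sub ratio: {link_capacity_mbps / sub_speed_mbps:.0f}:1")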

You haven't mentioned exactly what you're currently doing to model
capacity demand (assuming you wanted feedback on it)?

Assuming all the above is true for you, to give us a reasonable
foundation to build on:
In my experience the standard method is to record your ingress
traffic rate at all your PEs or P&T nodes and essentially divide this
by the number of subs you have (egress is important too, it's just
usually negligible in comparison). For example, if your ASN has a
total average ingress traffic rate of 1Gbps during peak hours and you
have 10,000 subs, you can model on, say, 0.1Mbps per sub. That's
actually a crazily low figure these days, but it's just a fictional
example to demonstrate the calculation.
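
As a trivial sketch of that calculation in Python, using the
fictional figures above:

    # Peak-hour average ingress for the whole ASN, divided by sub count.
    # Figures are the fictional example above: 1Gbps across 10,000 subs.
    peak_ingress_mbps = 1000.0   # total ASN ingress at peak, in Mbps
    sub_count = 10_000

    per_sub_mbps = peak_ingress_mbps / sub_count
    print(f"Per-sub peak average: {per_sub_mbps:.2f} Mbps")  # 0.10 Mbps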

The ideal scenario is that you have this info going back as far as
possible. Also, the more subs you have, the better it all averages
out. For business ISPs, bringing on one new customer can make a major
difference: if it's a 100Gbps end-site and your backbone is a single
100Gbps link, you could be in trouble. For residential services, subs
almost always have slower links than your backbone/P&T/PE nodes.

If you have different types of subs it's also worth breaking down the
stats by sub type. For example, we have ADSL subs and VDSL subs. We
record the egress traffic rate on the BNGs towards each type of sub
separately and then aggregate across all BNGs. Say today's peak
inbound for our ASN was X; of that X, Y went to ADSL subs and Z went
to VDSL subs. Y / $number_of_adsl_subs == peak average for an ADSL
line, and Z / $number_of_vdsl_subs == peak average for a VDSL line.
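
A minimal sketch of that per-type split, assuming a hypothetical
400/600 division of the fictional 1Gbps peak across 6,000 ADSL and
4,000 VDSL subs:

    # Per-type peak averages; the split and sub counts are hypothetical,
    # continuing the fictional 1Gbps / 10,000-sub example.
    peak_to_adsl_mbps = 400.0   # Y: peak egress towards ADSL subs, all BNGs
    peak_to_vdsl_mbps = 600.0   # Z: peak egress towards VDSL subs, all BNGs
    adsl_subs = 6_000
    vdsl_subs = 4_000

    adsl_peak_avg = peak_to_adsl_mbps / adsl_subs   # ~0.07 Mbps per line
    vdsl_peak_avg = peak_to_vdsl_mbps / vdsl_subs   # 0.15 Mbps per line
    print(f"ADSL {adsl_peak_avg:.2f} Mbps/sub, VDSL {vdsl_peak_avg:.2f} Mbps/sub")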

It's good to know this difference because a sub migrating from ADSL
to VDSL is not the same as gaining a new sub in terms of additional
traffic growth. We have a lot of users upgrading to VDSL, which makes
a difference at scale; e.g. 10K upgrades bring less additional
traffic than 10K new subs. Rinse and repeat for your other customer
types (FTTP/H, wireless, etc.).


On Tue, Apr 2, 2019 at 2:20 PM Josh Luthman <josh () imaginenetworksllc com> wrote:

We have GB/mo figures for our customers for every month for the last ~10 years.  Is there some simple figure you're 
looking for?  I can tell you off hand that I remember we had accounts doing ~15 GB/mo and now we've got 1500 GB/mo 
at similar rates per month.


I'm mostly just wondering what others do for this kind of planning - trying to look outside of my own experience, so 
I don't miss something obvious. That growth in total transfer that you mention is interesting.

You need to be careful with volume-based usage figures. As links
continuously increase in speed over the years, users can transfer the
same amount of data in less bit-time. The problem with polling at any
interval (be it 1 second or 15 minutes) is that you miss bursts
between the polls. Volume-based accounting also misses link
utilisation, which is how congestion is identified. You must measure
utilisation and divide that by $number_of_subs. Your links can be
congested, and if you only measure data volume transferred you'll see,
month by month, that subs transferred the same amount of data overall;
but day by day, hour by hour, it took longer because a link somewhere
is congested, and everyone is pissed off. So with faster end-user
speeds, you may see shorter but higher bursts of core link
utilisation.
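
To make that point concrete, here's a toy Python illustration (the
traffic pattern is invented): two links carry the same volume over
5 minutes, but only fine-grained utilisation sampling exposes the one
running at line rate:

    # Two links transfer the same volume in 5 minutes, but one does it
    # in a short line-rate burst. Volumetric accounting (GB/month)
    # cannot tell them apart; utilisation sampling can.
    steady = [100.0] * 300                 # Mbps, sampled every second
    bursty = [1000.0] * 30 + [0.0] * 270   # same volume, sent in a 30s burst

    for name, series in (("steady", steady), ("bursty", bursty)):
        volume_mb = sum(series) / 8       # Mbit per second, summed, -> MB
        avg = sum(series) / len(series)   # what a 5-minute poll would show
        peak = max(series)                # what 1s sampling would show
        print(f"{name}: {volume_mb:.0f} MB, avg {avg:.0f} Mbps, "
              f"peak {peak:.0f} Mbps")

Both links report 3750 MB and a 100Mbps average; only the per-second
peak shows the bursty one hitting line rate.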

I always wonder what the value of trying to predict utilization is anyway, especially since bandwidth is so cheap. 
But I figure it can't hurt to ask a group of people where I am highly likely to find somebody smarter than I am :-)

The main requirement, in my opinion, is upgrades. You need to know how
long a link upgrade takes for your operational teams, or a device
upgrade, etc. If it takes 2 months to deliver a new backhaul link to a
regional PoP, call it 3 months to allow for wayleaves, sick engineers,
DC access failures, etc. Then make sure you trigger a backhaul upgrade
when your growth model says you're 3-4 months away from 70%
utilisation (or whatever figure suits your operations and customers).
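
A sketch of that trigger logic, assuming compound monthly growth
taken from your historical model (the growth rate and utilisation
figures are hypothetical):

    # Fire the upgrade when the growth model says the threshold will be
    # crossed within the delivery lead time (plus slack).
    import math

    def months_until(current_util, threshold, monthly_growth):
        """Months until utilisation crosses the threshold at compound growth."""
        if current_util >= threshold:
            return 0
        return math.ceil(math.log(threshold / current_util)
                         / math.log(1 + monthly_growth))

    lead_time_months = 3   # 2 months delivery plus slack, as above
    remaining = months_until(current_util=0.65, threshold=0.70,
                             monthly_growth=0.04)
    if remaining <= lead_time_months + 1:   # the "3-4 months away" window
        print(f"Trigger backhaul upgrade: ~{remaining} months to 70%")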

Cheers,
James.

P.S. Sorry for the epistle.

