nanog mailing list archives

Re: Level 3's side of the story


From: Leo Bicknell <bicknell () ufp org>
Date: Sat, 8 Oct 2005 12:41:21 -0400


Given that at least part of JC Dill's comments were directly lifted from
an e-mail I sent him, I feel compelled to put them side by side.  JC's
comments:

In a message written on Sat, Oct 08, 2005 at 07:24:06AM -0700, JC Dill wrote:
Consider a simple hypothetical closest-exited network setup (hot potato 
routing) between 2 peers:


ISP Eyeballs:  Router-E1----2,000 Mile Link----Router-E2----Customer
                     |                                 |
                     |                                 |
                   peering                           peering
                      |                                 |
                      |                                 |
ISP Content: Server--Router-C1---2,000 Mile Link-----Router-C2

When the customer on ISP E (Eyeballs) requests content (web page, music 
file, etc.) from the server on ISP C (Content), packets travel like this:

Customer->Router-E2->Router-C2->Router-C1->Server

When the server returns traffic to the customer, traffic goes like this:

Server->Router-C1->Router-E1->Router-E2->Customer

The problem is the customer->server direction would typically be a 500 
byte request and 64 byte ACK packets, whereas the server->customer data 
includes many 1500 byte data packets.  So, ISP Eyeballs may carry 2Mbps 
of data over its 2,000 mile link, whereas ISP Content will only carry 
128Kbps over its 2,000 mile link.

Even though both companies met in the middle, ISP Content shifted some 
of its costs to ISP Eyeballs.

Back when most ISPs had the same types of traffic (even mixes of content 
and eyeballs), they had even ratios which equalized this effect, as it 
was happening the same amount in both directions.  But as some ISPs 
started specializing in one type of content or the other, uneven flows 
were produced.  Some bean-counter felt that these uneven flows meant 
that the network that was sending more traffic should now pay for 
transit, even though this traffic was traffic that their own customers 
were requesting and paying them to transmit!

There are ways to deal with it though, like cold potato routing.

My message to JC:

In a message written on Thu, 6 Oct 2005 15:07:05 -0400, Leo Bicknell wrote:
That's not how it works.  Consider a simple, closest exited network
setup:


ISP A:           Router1-------2,000 Mile Link-----Router2----Customer
                    |                                 |
                    |                                 |
                 peering                           peering
                    |                                 |
                    |                                 |
ISP B: Server----RouterA-------2,000 Mile Link-----RouterB

When the "customer" requests a web page from the "server", packets
travel like this:

Customer->Router2->RouterB->RouterA->Server

When the server returns traffic to the customer, traffic goes like this:

Server->RouterA->Router1->Router2->Customer

The problem is the customer->server direction is a 500 byte request,
and 64 byte ACK packets, whereas the server->customer data includes
lots of 1500 byte data packets.  So, ISP A may carry 2Mbps of data
over its 2,000 mile link, whereas ISP B will only carry 128kbps over
its 2,000 mile link.

Even though both companies met in the middle, ISP B shifted its costs
to ISP A.

The theory is that having an even ratio equalizes this effect, as it's
happening the same amount in both directions.  There are other ways to
deal with it though, like cold potato routing and other tricks.

In practice it becomes a much more complicated issue, but in many cases
due to how routing works, geographies involved, and others' routing
policies (e.g., customers not advertising their routes to all providers)
there are very real, very expensive inequalities.  Large ISPs work
together to equalize them to the extent possible.

It's not so much that I mind quoting without attribution, it's that
I mind interleaving original bits with new bits, as that can lead
to confusion later lest anyone put A and B together.

On the issue at hand, suffice it to say costs are a complicated issue.
This is a rather simplified view of the world, and does not take into
consideration many of the things that are going on.
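
To put numbers on the simplified example, here is a minimal Python
sketch of the arithmetic.  It assumes "128k" means 128 kbps, that each
ISP has exactly one 2,000 mile link, and that the only two policies are
hot potato (hand off at the nearest peering point) and cold potato
(carry the traffic to the peering point nearest the destination).  The
function name long_haul_load and the dictionary keys are made up for
illustration; this is not a model of any real network.

```python
# Illustrative model of the two-ISP diagram: ISP A has the customer,
# ISP B has the server, and each ISP owns one 2,000 mile link.

def long_haul_load(to_server_bps, to_customer_bps, policy="hot"):
    """Return the bits/sec each ISP carries over its own long-haul link.

    hot:  the sending network hands traffic off at the nearest peering
          point, so the RECEIVING network hauls it across the country.
    cold: the sending network hauls traffic to the peering point nearest
          the destination, so the SENDING network pays for the distance.
    """
    if policy == "hot":
        # Server->customer traffic crosses ISP A's link; requests cross B's.
        return {"A": to_customer_bps, "B": to_server_bps}
    if policy == "cold":
        # Server->customer traffic crosses ISP B's link; requests cross A's.
        return {"A": to_server_bps, "B": to_customer_bps}
    raise ValueError(f"unknown policy: {policy!r}")


# Figures from the example: 128 kbps of requests/ACKs, 2 Mbps of data.
TO_SERVER_BPS = 128_000
TO_CUSTOMER_BPS = 2_000_000

hot = long_haul_load(TO_SERVER_BPS, TO_CUSTOMER_BPS, "hot")
cold = long_haul_load(TO_SERVER_BPS, TO_CUSTOMER_BPS, "cold")

for name, load in (("hot", hot), ("cold", cold)):
    ratio = load["A"] / load["B"]
    print(f"{name} potato: A={load['A']} bps, B={load['B']} bps, A/B={ratio:.3f}")
```

With hot potato routing ISP A hauls about 15.6 times as much traffic
over its long link as ISP B does; with cold potato the burden simply
flips to ISP B.  Neither number says anything about what each link
actually costs, which is the point of the rest of this message.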

For instance, many content providers buy cheap Cogent bandwidth and
dump traffic on Cogent, but DO NOT advertise ANY prefixes to Cogent,
or at least, not the ones generating traffic.  So in my example if
we assume "customer" is Level 3, and "Server" is someone buying
from Cogent, the path from customer to server may not transit
Cogent's network at all.  Level 3 may send that to, say, AT&T, who's
also a service provider for the person with the server.
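
A rough Python sketch of that asymmetry, under loud assumptions: the
content site has exactly two providers, it always sends outbound
traffic through the cheapest one, and traffic toward it can only enter
through a provider that hears its prefix.  The prices, the dictionary
layout, and the function names are all hypothetical, and real BGP path
selection involves far more than this.

```python
# Hypothetical content site with two upstream providers.
# "price" is an invented relative cost; "announced" says whether the
# site advertises its traffic-generating prefix to that provider.
PROVIDERS = {
    "Cogent": {"price": 1, "announced": False},
    "AT&T": {"price": 3, "announced": True},
}


def outbound_provider(providers):
    """Pick the cheapest provider for traffic leaving the site."""
    return min(providers, key=lambda name: providers[name]["price"])


def inbound_providers(providers):
    """List providers that can deliver traffic TO the site.

    Only providers that receive the prefix announcement have a route
    back to the site, so the others never see the return traffic.
    """
    return [name for name, p in providers.items() if p["announced"]]


print("server -> customer leaves via:", outbound_provider(PROVIDERS))
print("customer -> server can enter via:", inbound_providers(PROVIDERS))
```

In this toy setup all of the heavy server->customer traffic leaves
through Cogent, while none of the return traffic comes back that way,
so Cogent's traffic ratio with its peers skews even further outbound.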

There are also a lot of other considerations that come down to various
people's choices.  How many people bought ATM cards and had them
be the only ATM cards in their entire network because you needed
them to connect to AADS, Mae-East, and UUNet?  That's extra cost
because of those other entities' technology choices.  People with OSRs
at peering points love GigE interconnects, people with GSRs generally
don't like them.  People with GSRs may like OC-3 interconnects, those
with M160s probably hate them.

If all layer 1, layer 2, and layer 3 technology had the same
cost/megabit(/mile) then it would all be the same, but the fact is
it doesn't, and so based on a provider's other business assets they
will see very different costs.  In some cases both sides are "right".
Believe me I've been in many discussions that went like "I can only
do GigE right now cuz it's all I can afford", well "I can only do
OC-12 right now because it's all I can afford".

Level 3 could be pulling out of particular peering points, which
happen to be where Cogent is located, and Cogent is not where they
are going to be in the future.  Cogent could have been using localpref
to artificially raise or lower Level 3 traffic.  Level 3 may have
wanted to upgrade full circuits that were dropping packets every
day and Cogent refused, causing them to terminate the agreement for
failing to work with them on the problem.

I could come up with a hundred other reasons this happened.  For
better or for worse it is all under NDA I'm sure, and frankly if
it were our NDA I'm not sure what Cogent has said so far would be
acceptable.

I find it rather sad that "engineers" would be trying to solve a
problem when they don't in fact know the true cause of the problem.
All we've seen is a single symptom, down peering.  We have no idea
what has or has not been said between them for the last few months,
what the graphs look like, what the netflow statistics show, what
the costs are to both parties, or how much they make on the topic.

It's a wonder the internet works as well as it does, not because
of Level 3 and Cogent partitioning, but because of the lack of clue
on the part of the people who are (supposedly) running it.

-- 
       Leo Bicknell - bicknell () ufp org - CCIE 3440
        PGP keys at http://www.ufp.org/~bicknell/
Read TMBG List - tmbg-list-request () tmbg org, www.tmbg.org
