nanog mailing list archives

Re: Lossy cogent p2p experiences?


From: David Hubbard <dhubbard () dino hostasaurus com>
Date: Mon, 11 Sep 2023 16:14:16 +0000

Some interesting new developments on this, independent of the divergent network equipment discussion. 😊

Cogent had a field engineer at the east coast location where my local loop (10gig wave) meets their equipment, i.e. (me 
– patch cable to loop provider’s wave equipment – wave – patch cable to Cogent equipment).  On the other end, the 
geographically distant west coast direction, it’s Cogent equipment to my equipment in the same facility with just patch 
cable.  They connected some model of EXFO’s NetBlazer FTBx 8880-series testing device to a port on their east coast 
network device, not disconnecting my circuit.  Originally, they were planning to have someone physically loop at their 
equipment at the other end, but I volunteered that my Arista gear supports a provider-facing loop at the transceiver 
level if they wanted to try that, so my loop, cabling, and transceiver could be part of the testing.

One direction at a time, they interrupted the point to point config to create a point to point between one direction of 
my gear, set to loopback mode, and the NetBlazer device.  The device was set to use five parallel streams.  In the 
close direction, where the third-party wave is involved, they ran at full 5 x 2 Gbps for thirty minutes, had zero 
packets lost, no issues.  My monitoring confirmed this rate of port input was occurring, although oddly not output, but 
perhaps Arista doesn’t “see”/count the retransmitted packets in phy loopback mode.

In the distant direction across their backbone, their equipment at the remote end, and the fiber patch cable to me, 
they tested at 9.5 Gbit for thirty minutes through my device in loopback mode.  The result was, of 2.6B packets sent, 
only 334 packets lost.  They configured for 9.5 Gbps rate of testing, so five 1.9 Gbps streams.  Across the five 
streams, the report has a “frame loss” and out of sequence section.  Zero out of sequence, but among the five streams, 
loss seconds / count were 3 / 26, 3 / 48, 1 / 5, 13 / 221, 1 / 34.  I’m not familiar with this testing device, but to 
me that suggests it’s stating how many of the total seconds experienced loss, and the counted packet loss.  So really 
the only one that stands out is the one with thirteen seconds where loss occurred, but the packet counts we’re talking 
about are minuscule.  Again, my monitoring at the interface level showed this 9.5 Gbps of testing occurring for the 
thirty minutes the report says.
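
As a quick check of the arithmetic above, here is a minimal Python sketch that totals the per-stream figures quoted from the report. It assumes the pairs really are (seconds with loss, frames lost), which is my reading of the report rather than anything confirmed by EXFO documentation, and it uses the rounded "2.6B packets" figure, so the ratio is approximate. The variable names are made up for illustration.

```python
# Per-stream (loss_seconds, lost_frames) pairs as quoted from the report.
# Interpretation of the two columns is an assumption, per the message text.
streams = [(3, 26), (3, 48), (1, 5), (13, 221), (1, 34)]

frames_sent = 2.6e9      # "2.6B packets sent" (rounded figure)
test_seconds = 30 * 60   # thirty-minute test

total_lost = sum(lost for _, lost in streams)
# Loss seconds on different streams may overlap in time, so this sum is
# only an upper bound on the number of distinct seconds that saw loss.
total_loss_seconds = sum(sec for sec, _ in streams)

loss_ratio = total_lost / frames_sent

print(f"frames lost across streams: {total_lost}")
print(f"overall loss ratio:         {loss_ratio:.2e}")
print(f"loss seconds (upper bound): {total_loss_seconds} of {test_seconds}")
```

The per-stream counts do add up to the 334 lost packets in the summary, and the overall loss works out to roughly one frame in eight million.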

So, now I’m just completely confused.  How is this device, traversing the same equipment, ports, cables, able to 
achieve far greater average throughput, and almost no loss, across a very long duration?  There are times I’ll be able 
to achieve nearly the same, but never for a test longer than ten seconds as it just falls off from there.  For example, 
I did a five parallel stream TCP test with iperf just now and did achieve a net throughput of 8.16 Gbps with about 1200 
retransmits.  Same five stream test run for half hour like theirs, I got no better than 2.64 Gbps and 183,000 
retransmits.

iperf and UDP allow me to see loss at any rate of transmit exceeding ~140 Mbps, in just seconds, not a half hour.  To 
rule out my gear, I’m also able to perform the same tests from the same systems (both VM and physical) using public 
addresses and traversing the internet, as these are publicly connected systems.  I get far lower loss and much greater 
throughput on the internet path.  For example, simple ten second test of a single stream at 400 Mbit UDP; 5 packets 
lost across internet, 491 across P2P.  Single stream TCP across the internet for ten seconds; 3.47 Gbps, 162 
retransmits.  Across the P2P, this time at least, 637 Mbps, 3633 retransmits.
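
To put the UDP numbers above on the same footing, here is a rough Python sketch that converts lost-datagram counts into loss ratios. The datagram payload size is not stated in the message; the 1470 bytes used here is purely an assumption for illustration, and the helper name `udp_loss_ratio` is hypothetical. The relative comparison between the two paths does not depend on that assumption, since both tests ran at the same rate and duration.

```python
def udp_loss_ratio(lost, rate_bps, seconds, payload_bytes=1470):
    """Approximate fraction of datagrams lost in a fixed-rate UDP test.

    payload_bytes is an assumed datagram size, not a value from the message.
    """
    datagrams_sent = rate_bps * seconds / (payload_bytes * 8)
    return lost / datagrams_sent

# Ten-second single-stream tests at 400 Mbit/s, as described above.
internet = udp_loss_ratio(lost=5, rate_bps=400e6, seconds=10)
p2p = udp_loss_ratio(lost=491, rate_bps=400e6, seconds=10)

print(f"internet path loss: {internet:.4%}")
print(f"P2P path loss:      {p2p:.4%}")
print(f"P2P / internet:     {p2p / internet:.1f}x")
```

Under that assumption the P2P path loses on the order of 0.1% of datagrams, and regardless of datagram size it loses about 98 times as many as the internet path in the same test.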

David



From: David Hubbard <dhubbard () dino hostasaurus com>
Date: Friday, September 1, 2023 at 10:19 AM
To: Nanog () nanog org <nanog () nanog org>
Subject: Re: Lossy cogent p2p experiences?
The initial and recurring packet loss occurs on any flow of more than ~140 Mbit.  The fact that it’s loss-free under 
that rate is what furthers my opinion it’s config-based somewhere, even though they say it isn’t.

From: NANOG <nanog-bounces+dhubbard=dino.hostasaurus.com () nanog org> on behalf of Mark Tinka <mark@tinka.africa>
Date: Friday, September 1, 2023 at 10:13 AM
To: Mike Hammett <nanog () ics-il net>, Saku Ytti <saku () ytti fi>
Cc: nanog () nanog org <nanog () nanog org>
Subject: Re: Lossy cogent p2p experiences?

On 9/1/23 15:44, Mike Hammett wrote:
and I would say the OP wasn't even about elephant flows, just about a network that can't deliver anything acceptable.

Unless Cogent are not trying to accept (and by extension, may not be able to guarantee) large Ethernet flows because 
they can't balance them across their various core links, end-to-end...

Pure conjecture...

Mark.
