nanog mailing list archives
Re: interesting troubleshooting
From: Matthew Petach <mpetach () netflight com>
Date: Fri, 20 Mar 2020 15:23:19 -0700
On Fri, Mar 20, 2020 at 3:09 PM Saku Ytti <saku () ytti fi> wrote:
> Hey Nimrod,
>
>> I was contacted by my NOC to investigate a LAG that was not
>> distributing traffic evenly among the members to the point where one
>> member was congested while the utilization on the LAG was reasonably
>> low. Looking at my netflow data, I was able to confirm that this was
>> caused by a single large flow of ESP traffic. Fortunately, I was able
>> to shift this flow to another path that had enough headroom available
>> so that the flow could be accommodated on a single member link.
>>
>> With the increase in remote workers and VPN traffic that won't hash
>> across multiple paths, I thought this anecdote might help someone
>> else track down a problem that might not be so obvious.
>
> This problem is called elephant flow. Some vendors have solution for
> this, by dynamically monitoring utilisation and remapping the
> hashResult => egressInt table to create bias to offset the elephant
> flow.
>
> One particular example:
> https://www.juniper.net/documentation/en_US/junos/topics/reference/configuration-statement/adaptive-edit-interfaces-aex-aggregated-ether-options-load-balance.html
>
> Ideally VPN providers would be defensive and would use SPORT for
> entropy, like MPLSoUDP does.
>
> --
> ++ytti
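For readers who have not run into this before, the quoted behaviour can be sketched in a few lines of Python. This is only a toy model, not any vendor's hash: pick_member(), the CRC32 hash, the four-member LAG and the documentation addresses are all assumptions made for illustration. It shows why a single ESP tunnel (IP protocol 50, no L4 ports) always lands on one member, while a UDP encapsulation that varies the source port per inner flow can spread.

```python
import zlib

N_MEMBERS = 4  # assumed LAG size, for illustration only


def pick_member(fields, n_members=N_MEMBERS):
    """Toy hash: map a tuple of header fields to a LAG member index."""
    key = "|".join(str(f) for f in fields).encode()
    return zlib.crc32(key) % n_members


# ESP carries no ports, so every packet of the tunnel presents the same
# (src, dst, proto) fields and hashes to the same member link.
esp_fields = ("192.0.2.1", "198.51.100.1", 50)
esp_members = {pick_member(esp_fields) for _ in range(1000)}

# UDP encapsulation (protocol 17) with the source port varied per inner
# flow gives the hash something to work with. 6635 is the MPLS-over-UDP
# destination port; the source port range here is arbitrary.
udp_members = {
    pick_member(("192.0.2.1", "198.51.100.1", 17, sport, 6635))
    for sport in range(49152, 49152 + 1000)
}

print("ESP tunnel uses members:", sorted(esp_members))
print("UDP-encapsulated flows use members:", sorted(udp_members))
```

However large the ESP flow grows, the first set stays at one member, which matches the congested-member symptom described in the quoted report.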
There are *several* caveats to doing dynamic monitoring and remapping of
flows; one of the biggest challenges is that it puts extra demands on the
line cards tracking the flows, especially as the number of flows rises to
large values. I recommend reading

https://www.juniper.net/documentation/en_US/junos/topics/topic-map/load-balancing-aggregated-ethernet-interfaces.html#id-understanding-aggregated-ethernet-load-balancing

before configuring it.

"Although the feature performance is high, it consumes significant amount
of line card memory. Approximately, 4000 logical interfaces or 16
aggregated Ethernet logical interfaces can have this feature enabled on
supported MPCs. However, when the Packet Forwarding Engine hardware memory
is low, depending upon the available memory, it falls back to the default
load balancing mechanism."

What is that old saying? Oh, right--There Ain't No Such Thing As A Free
Lunch. ^_^;;

Matt
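The remapping idea discussed above, and the state it requires, can be sketched as follows. This is a hypothetical greedy rebalancer, not Juniper's algorithm: rebalance(), the eight hash buckets, the two-member LAG and the byte counts are all invented for illustration. The point is that the device has to keep a counter per hash bucket and a bucket-to-member table, which is the kind of per-interface state the quoted documentation warns about.

```python
def rebalance(bucket_bytes, n_members):
    """Greedy sketch: place the heaviest hash buckets first, each on the
    member that currently carries the least traffic."""
    load = [0] * n_members
    new_map = {}
    for bucket in sorted(bucket_bytes, key=bucket_bytes.get, reverse=True):
        member = min(range(n_members), key=load.__getitem__)
        new_map[bucket] = member
        load[member] += bucket_bytes[bucket]
    return new_map, load


# Assumed measurements: bucket 0 holds the elephant flow.
bucket_bytes = {0: 900, 1: 100, 2: 100, 3: 100,
                4: 100, 5: 100, 6: 100, 7: 100}
N_MEMBERS = 2

# Static mapping for comparison: bucket index modulo member count.
static_load = [0] * N_MEMBERS
for bucket, nbytes in bucket_bytes.items():
    static_load[bucket % N_MEMBERS] += nbytes

new_map, adaptive_load = rebalance(bucket_bytes, N_MEMBERS)
print("static load per member:  ", static_load)    # [1200, 400]
print("adaptive load per member:", adaptive_load)  # [900, 700]
```

Note that the elephant bucket itself is never split; the remap only moves the other buckets away from it. A real implementation would presumably also limit how often buckets move, since remapping a live flow can reorder its packets.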