nanog mailing list archives
Re: Soliciting your opinions on Internet routing: A survey on BGP convergence
From: Jared Mauch <jared () puck nether net>
Date: Tue, 10 Jan 2017 15:32:33 -0500
On Jan 10, 2017, at 3:14 PM, Hugo Slabbert <hugo () slabnet com> wrote:

> On Tue 2017-Jan-10 20:58:02 +0100, Job Snijders <job () instituut net> wrote:
>
>> On Tue, Jan 10, 2017 at 03:51:04AM +0100, Baldur Norddahl wrote:
>>
>>> If a transit link goes, for example because we had to reboot a router, traffic is supposed to reroute to the remaining transit links. Internally our network handles this fairly fast for egress traffic. However the problem is the ingress traffic - it can be 5 to 15 minutes before everything has settled down. This is the time before everyone else on the internet has processed that they will have to switch to your alternate transit. The only solution I know of is to have redundant links to all transits.
>>
>> Alternatively, if you reboot a router, perhaps you could first shutdown the eBGP sessions, then wait 5 to 10 minutes for the traffic to drain away (should be visible in your NMS stats), and then proceed with the maintenance? Of course this only works for planned reboots, not surprise reboots.
>
> ...or link failures.
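The "shut down eBGP, wait for the drain, then do the maintenance" approach quoted above comes down to deciding when the NMS counters say the link is quiet. A minimal sketch of that check, assuming you already have a list of traffic-rate samples for the link. The function name, the threshold and the "three quiet samples in a row" rule are all illustrative assumptions, not something from this thread:

```python
from typing import Optional, Sequence


def drained_at(samples_bps: Sequence[float],
               threshold_bps: float,
               consecutive: int = 3) -> Optional[int]:
    """Return the index of the sample at which the link counts as drained.

    "Drained" here means the rate stayed below threshold_bps for
    `consecutive` samples in a row (an assumed rule, tune to taste).
    Returns None if that never happens in the given samples.
    """
    quiet_run = 0
    for index, rate in enumerate(samples_bps):
        # Any sample at or above the threshold resets the quiet run.
        quiet_run = quiet_run + 1 if rate < threshold_bps else 0
        if quiet_run >= consecutive:
            return index
    return None


# Hypothetical samples taken after the eBGP sessions were shut down.
rates = [900e6, 400e6, 50e6, 5e6, 2e6, 1e6]
print(drained_at(rates, threshold_bps=10e6))
```

Requiring several quiet samples in a row avoids starting the reboot on a single low reading. As noted in the thread, none of this helps with unplanned reboots or link failures.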
One other comment: there has been a long history of poorly behaving BGP stacks that would take quite some time to hunt through the paths. While this can still occur where people have near-ancient software and hardware still in use, many modern software/hardware options enable things like BGP-PIC (in your survey) by default.

Many of the options you document as best practices, like path MTU discovery, are well-known fixes for networks, as is using jumbo MTU internally to obtain a 9k+ MSS for high-performance TCP. Vendors have not always chosen to enable the TCP options by default the way they have for protocol features (e.g. BGP-PIC) and, like Jakob’s response, tout other solutions instead of fixing the TCP stack first.

Many of these performance improvements were documented in 2002 and are considered best practices by many networks, but because the knobs are obscure they may not be widely deployed, or may be seen as risky to configure. (We had a vendor panic when we discovered a bug in their TCP-SACK code; they were almost frozen, not fixing the code because touching TCP felt dangerous and there was an inadequate testing culture around something seen as ‘stable’.)

Here’s the presentation from IETF 53; I don’t readily see it in the proceedings:

http://morse.colorado.edu/~epperson/courses/routing-protocols/handouts/bgp_scalability_IETF.ppt

- Jared
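On the jumbo MTU point above, the arithmetic linking MTU to TCP MSS is short enough to sketch. This assumes fixed-size headers with no IP options or extension headers and no TCP options (timestamps, SACK blocks and so on reduce the usable payload further). The MTU values are illustrative, not taken from this thread:

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
IPV6_HEADER_BYTES = 40  # fixed IPv6 header, no extension headers
TCP_HEADER_BYTES = 20   # TCP header without options


def tcp_mss(mtu: int, ipv6: bool = False) -> int:
    """Largest TCP payload per segment that fits in one packet of size mtu."""
    ip_header = IPV6_HEADER_BYTES if ipv6 else IPV4_HEADER_BYTES
    return mtu - ip_header - TCP_HEADER_BYTES


for mtu in (1500, 9000, 9216):
    print(mtu, tcp_mss(mtu), tcp_mss(mtu, ipv6=True))
```

Under these assumptions a 9000-byte MTU gives an IPv4 MSS of 8960, so an MSS of 9000 or more needs an MTU of at least 9040 over IPv4 (9060 over IPv6).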