nanog mailing list archives

Re: Soliciting your opinions on Internet routing: A survey on BGP convergence


From: Laurent Vanbever <lvanbever () ethz ch>
Date: Tue, 10 Jan 2017 08:00:06 +0100

Dear Baldur,

> I find that the type of outage that affects our network the most is neither of the two options you describe. As is
> probably typical for smaller networks, we do not have redundant uplinks to all of our transits. If a transit link
> goes, for example because we had to reboot a router, traffic is supposed to reroute to the remaining transit links.
> Internally our network handles this fairly fast for egress traffic.
>
> However the problem is the ingress traffic - it can be 5 to 15 minutes before everything has settled down. This is
> the time before everyone else on the internet has processed that they will have to switch to your alternate transit.

Thanks a lot for your input. Indeed, that case is a bit special. I’d say it is a kind of remote outage that remote ASes
experience towards your prefix and, as such, requires a “BGP-only” convergence. I guess if your prefixes going via the
alternate transit are not visible at all prior to the switch (and I guess not), this is a kind of “extreme” convergence
where routes have to be withdrawn/updated Internet-wide. This reminds me of the paper by Craig Labovitz et al.
(http://conferences.sigcomm.org/sigcomm/2000/conf/paper/sigcomm2000-5-2.pdf), which I think classifies these events as
Tlong (“An active route with a short ASPath is implicitly replaced with a new route possessing a longer ASPath. This
represents both a route failure and failover”). And indeed, these are the second slowest to converge, just behind the
withdrawal of a prefix Internet-wide.
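
As a rough illustration of that taxonomy, the sketch below labels a single route change with the event names used in the paper (Tup, Tdown, Tshort, Tlong). The function name classify_event, the use of None for "no route", and the AS numbers (taken from the documentation range) are assumptions made for this example only; they do not come from the paper or from any BGP implementation.

```python
from typing import Optional, Sequence


def classify_event(old_path: Optional[Sequence[int]],
                   new_path: Optional[Sequence[int]]) -> str:
    """Label a route change by comparing the old and new AS paths.

    None stands for "no route to the prefix". Hypothetical helper,
    for illustration only.
    """
    if old_path is None and new_path is None:
        return "no-route"
    if old_path is None:
        return "Tup"        # previously unavailable route is announced
    if new_path is None:
        return "Tdown"      # previously available route is withdrawn
    if len(new_path) > len(old_path):
        return "Tlong"      # implicit replacement by a longer AS path
    if len(new_path) < len(old_path):
        return "Tshort"     # implicit replacement by a shorter AS path
    return "same-length"    # path changed (or not) without a length change


# Losing a directly connected transit and falling back to a longer path:
failover = classify_event([64500, 64496], [64501, 64502, 64496])
```

Here failover evaluates to "Tlong", which is the situation described above: remote ASes keep a route, but only after replacing the short path with a longer one.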

You’re right that our survey targets more the case in which large bursts of UPDATEs/WITHDRAWs are exchanged. I guess a
parallel case to the one you mention could be that your primary transit performs a planned maintenance (or experiences
a failure) that triggers WITHDRAWs for your prefixes.

> The only solution I know of is to have redundant links to all transits. Going forward I will make sure we have this
> because it is a huge disadvantage not being able to take a router out of service without causing downtime for all
> users. Not to mention that a router crash or link failure, which should take seconds at most to reroute around,
> instead causes at least 5 minutes of unstable internet.

Maybe you could advertise better routes (i.e., with shorter AS-PATHs or more-specific prefixes) via the alternate
transit prior to the maintenance? Ideally, if you could somehow make your primary transit switch to using an alternate
transit beforehand (maybe with a special community?), you could completely avoid a disruption. This would go in the
direction of minimizing the amount of WITHDRAWs in favor of UPDATEs. But, of course, this would only work in the case
of planned maintenance.
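
To illustrate why announcing more-specific prefixes via the alternate transit pulls traffic over before the maintenance, here is a minimal Python sketch. It is a deliberate simplification: it only models longest-prefix match followed by shortest AS path, and ignores local preference and the rest of the BGP decision process. The function best_route, the "primary"/"alternate" labels, and the prefixes and AS numbers (documentation ranges) are all assumptions for this example.

```python
import ipaddress


def best_route(dest, routes):
    """Pick the exit for dest from (prefix, as_path, via) tuples.

    Simplified model: the longest matching prefix wins, and ties are
    broken by the shortest AS path. Hypothetical helper, illustration only.
    """
    addr = ipaddress.ip_address(dest)
    matching = []
    for prefix, as_path, via in routes:
        net = ipaddress.ip_network(prefix)
        if addr in net:
            matching.append((net.prefixlen, -len(as_path), via))
    if not matching:
        return None
    return max(matching)[2]


# Normal operation: the same covering prefix is reachable via both transits.
steady = [
    ("2001:db8::/32", [64500, 64496], "primary"),
    ("2001:db8::/32", [64501, 64502, 64496], "alternate"),
]

# Before the maintenance: two more-specifics are added via the alternate only.
pre_maintenance = steady + [
    ("2001:db8::/33", [64501, 64502, 64496], "alternate"),
    ("2001:db8:8000::/33", [64501, 64502, 64496], "alternate"),
]

before = best_route("2001:db8:8000::1", steady)
after = best_route("2001:db8:8000::1", pre_maintenance)
```

With these inputs, before is "primary" and after is "alternate": the more-specific prefixes win even though their AS path is longer, so remote ASes already forward via the alternate transit when the primary link goes down. Whether more-specifics are accepted in practice depends on the prefix-length filters of the networks involved.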

We would definitely welcome more input on the convergence issue you face!

Best,
Laurent
