nanog mailing list archives

Re: bfd-like mechanism for LANPHY connections between providers


From: Jeff Wheeler <jsw () inconcepts biz>
Date: Wed, 16 Mar 2011 18:59:35 -0400

On Wed, Mar 16, 2011 at 4:42 PM, Jensen Tyler <JTyler () fiberutilities com> wrote:
> Correct me if I am wrong, but to detect a failure by default BGP would wait
> for the "hold-timer" to expire, then declare a peer dead and converge.
>
> So you would be looking at 90 seconds (Juniper default?) + CPU-bound
> convergence time to recover? Am I thinking about this right?

This is correct.  Note that 90 seconds isn't just a "Juniper default."
This suggested value appeared in RFC 1267 §5.4 (BGP-3) all the way
back in 1991.
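
To put rough numbers on the detection times being compared, here is a
minimal Python sketch. The function names are made up for illustration,
and the BFD values (300 ms interval, multiplier of 3) are assumed
examples, not any vendor's defaults. It models only the time to notice a
dead peer and ignores convergence time entirely.

```python
# Worst-case time to notice a dead peer (convergence time not included).
# All names and the BFD parameters below are illustrative assumptions.

def bgp_detection_seconds(hold_time=90.0):
    """BGP declares the peer dead once the hold timer expires."""
    return hold_time

def bfd_detection_seconds(interval_ms=300.0, multiplier=3):
    """BFD declares the session down after `multiplier` consecutive
    intervals pass without a control packet from the peer."""
    return interval_ms * multiplier / 1000.0

if __name__ == "__main__":
    print("BGP hold timer:", bgp_detection_seconds(), "s")
    print("BFD (assumed 300 ms x 3):", bfd_detection_seconds(), "s")
```

The gap is large on paper, which is why BFD is tempting; the question in
the rest of this message is what that gap costs you.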

In my view, configuring BFD for eBGP sessions risks reducing
MTBF in exchange for rare reductions in MTTR.

This is a risk / reward decision that IMO is still leaning towards
"lots of risk" for "little reward."  I'll change my mind about this
when BFD works on most boxes and is part of the standard provisioning
procedure for more networks.  It has already been pointed out that
neither is true today.
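
As a back-of-the-envelope illustration of that risk / reward trade, the
sketch below adds up yearly outage seconds for one session. Every number
in it is hypothetical (two real failures a year, four spurious BFD flaps,
60 seconds lost per spurious flap, 1 second BFD detection); plug in your
own. The function name is made up for this example.

```python
# Rough yearly outage budget for one eBGP session.
# All inputs are hypothetical; substitute figures from your own network.

def yearly_outage_seconds(real_failures, detect_seconds,
                          false_flaps=0, flap_cost_seconds=0.0):
    """Time lost waiting to detect real failures, plus time lost to
    session flaps that were not caused by a real failure."""
    return real_failures * detect_seconds + false_flaps * flap_cost_seconds

# No BFD: each real failure costs the full 90 s hold time.
without_bfd = yearly_outage_seconds(real_failures=2, detect_seconds=90.0)

# With BFD: fast detection, but assume a few spurious flaps per year.
with_bfd = yearly_outage_seconds(real_failures=2, detect_seconds=1.0,
                                 false_flaps=4, flap_cost_seconds=60.0)

if __name__ == "__main__":
    print("without BFD:", without_bfd, "s/year")
    print("with BFD:   ", with_bfd, "s/year")
```

With these made-up inputs the BFD case comes out worse; with zero
spurious flaps it comes out much better. The answer depends entirely on
how well BFD behaves on the boxes at both ends.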

If your eBGP sessions are failing so frequently that you are very
concerned about this 90 seconds, I suggest you won't reduce your
operational headaches or customer grief by configuring BFD.  This is
probably an indication that you need to:
1) straighten out the problems with your switching network or transport vendor;
2) get better transit;
3) depeer some peers who can't maintain a stable connection to you; or
4) sacrifice something to the backhoe deity.

Again, in the case of an IXP interface, I believe BFD has much more
potential benefit.

-- 
Jeff S Wheeler <jsw () inconcepts biz>
Sr Network Operator  /  Innovative Network Concepts

