nanog mailing list archives

Re: Better description of what happened


From: Bjørn Mork <bjorn () mork no>
Date: Wed, 06 Oct 2021 18:21:26 +0200

Tom Beecher <beecher () beecher cc> writes:

> Even if the external
> announcements were not withdrawn, and the edge DNS servers could provide
> stale answers, the IPs those answers provided wouldn't have actually been
> reachable

Do we actually know this wrt the tools referred to in "the total loss of
DNS broke many of the tools we’d normally use to investigate and resolve
outages like this."?  Those tools aren't necessarily located in any of
the remote data centers, and some of them might even refer to resources
outside the Facebook network.

Not to mention that keeping the DNS service up would have prevented
resolver overload in the rest of the world.

Besides, the disconnected frontend servers are probably configured to
display a "we have a slight technical issue. will be right back" notice
in such situations.  This is a much better user experience than the
"facebook?  never heard of it" message we got on Monday.

Yes, it makes sense to keep your domains alive even if your network
isn't.  That's why the best practice is name servers in more than one
AS.
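
To make the "more than one AS" point concrete, here is a minimal sketch
of the check, not a real audit tool. It assumes you have already mapped
each name server of a zone to the AS originating its address (for
example from a routing table or a whois lookup). The host names, the
function names and the AS numbers below are hypothetical; the AS
numbers come from the range reserved for documentation.

```python
from collections import defaultdict


def group_by_origin_as(ns_to_asn):
    """Group name server host names by the AS originating their address."""
    groups = defaultdict(list)
    for name_server, asn in ns_to_asn.items():
        groups[asn].append(name_server)
    return dict(groups)


def survives_single_as_outage(ns_to_asn):
    """Return True if some name server remains when any one AS disappears."""
    return len(set(ns_to_asn.values())) > 1


# Hypothetical delegations; 64496 and 64497 are documentation AS numbers.
single_as = {"a.ns.example.net": 64496, "b.ns.example.net": 64496}
multi_as = {"a.ns.example.net": 64496, "b.ns.example.org": 64497}

print(survives_single_as_outage(single_as))
print(survives_single_as_outage(multi_as))
print(group_by_origin_as(multi_as))
```

The sketch only counts distinct origin ASes. It says nothing about
whether those ASes share upstreams, facilities or a control plane.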




Bjørn

