nanog mailing list archives

Re: Internet operations during pandemics


From: Warren Kumari <warren () kumari net>
Date: Thu, 19 Mar 2020 09:27:33 -0400

Many years ago (1990s) I worked for a startup in NYC. We had a
conference room called "Conference Room S"; this was a semi-reserved
loft area in the Starbucks across the street, with an open tab[0].
The leads for each group had the project to design the DR / BCP plans
for the entire organization, and so we had a daily, one-hour meeting in
Conf Room S.

Instead of actually working on the DR plan, we used the time to get
other work done - it was quiet, there was wifi, there were no
interruptions, there was free coffee...

After a few months the CTO asked us to finish up and give him the
plans... so, we wrote:
Step 1: Panic !!!
Step 2: Sell stock options (if any...)[1]
Step 3: Post resume on Monster.com
Step 4: ...

We printed up a bunch of copies of this, put it in an envelope,
labeled it as "DR plan - open in case of disaster" and gave it to the
CTO -- we fully expected him to open it, shout at us for a bit and /
or chuckle resignedly, and then demand we actually do something useful
- but, instead, something much much worse occurred... he thanked us,
and locked it, unopened, in his filing cabinet. We blinked and asked
him if he was going to read it, and he said "No, I trust you to have
done a good job..."

We felt *really* bad, and worked late over the next few weeks and
weekends to actually make a good BCP/DR plan, and then confessed our
sins. We also ran table-tops, distributed and tested the plans, etc.

I've always wondered whether the CTO somehow knew what we'd been up to
- because of our guilt, the quality / comprehensiveness of the plan
ended up much better than it would have otherwise...


W
[0]: We thought that we were super cool for this...
[1]: This was an ongoing joke - the company was always "almost ready"
to go public...

On Thu, Mar 19, 2020 at 8:35 AM Andrew Latham <lathama () gmail com> wrote:

In my past it has always benefited me to set use cases and plan accordingly. For many it is difficult to imagine
these less-than-awesome use cases. Having worked to get datacentres back online in 8.8 earthquakes and dealt with
fires in co-location sites, I know it is hard.

1. Document
2. Generate use cases, DR plans, OOBM, document peers' phone numbers offline
3. Implement, share, discuss
4. Profit

While this sounds ideal and simple, it is not a small effort. I have two talks I must finish up, where one is on
*Organizations as Code* and how to survive the worst.



On Wed, Mar 18, 2020 at 5:25 PM Christopher Morrow <morrowc.lists () gmail com> wrote:

Did other folk on nanog-l see the nLnog-l note copied here?
I wonder how folk are planning for things (noted in the slides)
  o  supply chain for parts/equipment
     Wait, I can't get me a new shiny shipped because what??

  o ongoing rollout of new equipment
     I'm deploying next week in KIX, I'm currently in LAX how do I get
there? equipment arrives.. in between...oops!

  o noc/etc support staff
    omg.. wait, I can't have my noc staff in the same room? our 'wfh'
solution is ... wait, where is that?
    how do i get their phone queue sent to them? omg :( <sadness!>

  o services capacity crunches
    I love my shiny new dns service... wait, why is there a smoking
hole where my dns servers were?

I think some of this has been discussed (shifts in peaks, leveling of peaks)
Some hasn't really...  I expect that at least sharing some 'err, our
WFH changed now we do: X, Y, Z and use M to get N solved'
could be super cool to discuss/share and iterate for better solutions
for all of our users.

thoughts? :)

thanks!
-chris
(note all the hard work in this message is not mine... thanks Job!)

---------- Forwarded message ---------
From: Job Snijders <job () ntt net>
Date: Wed, Mar 18, 2020 at 6:02 PM
Subject: Internet operations during pandemics
To: <nlnog () nlnog net>


Dear all,

I threw together a slide deck today on the potential impact and second
order effects of COVID-19 on Internet network operations.

    http://instituut.net/~job/netops_during_pandemics.pdf

I hope we together over time can add and extend projections in the deck
on what will happen and how we can mitigate the negative effects on
Internet operations.

We have to answer questions such as:

    1) What problems already exist today because of a few weeks of C19?
    2) What problems are still coming? Will those be local or global?
    3) What possible workarounds can we plan for those problems?

I would appreciate feedback, comments, corrections or whatever you want
to tell me. None of us have been in this situation before, so my guess
is as good as yours.

Kind regards,

Job



--
- Andrew "lathama" Latham -



-- 
I don't think the execution is relevant when it was obviously a bad
idea in the first place.
This is like putting rabid weasels in your pants, and later expressing
regret at having chosen those particular rabid weasels and that pair
of pants.
   ---maf

