nanog mailing list archives

Re: Nashville


From: Javier J <javier () advancedmachines us>
Date: Wed, 13 Jan 2021 10:09:14 -0500

Is there a video of this? I would also love to see pictures of what the
damage was inside the building and repairs. Not sure if that was documented
anywhere. I would assume they are still doing repairs and upgrades to the
facility.

On Tue, Dec 29, 2020 at 8:18 PM Robert DeVita <radevita () mejeticks com>
wrote:

AT&T Disaster Recovery Team is probably the best in the business. The
resources they can bring to the table are unmatched. This would have been
100x worse if it had hit a carrier-neutral datacenter; those don’t have
nearly the same resources to restore something like this. The AT&T team
usually does a road show (pre-COVID). If you get a chance, it’s definitely
something you should go check out. Very impressive.

Robert DeVita
Founder & CEO
Mejeticks
c. 469-441-8864
e. radevita () mejeticks com
------------------------------
*From:* NANOG <nanog-bounces+radevita=mejeticks.com () nanog org> on behalf
of Eric Kuhnke <eric.kuhnke () gmail com>
*Sent:* Tuesday, December 29, 2020 5:06:00 PM
*To:* Sean Donelan <sean () donelan com>
*Cc:* NANOG <nanog () nanog org>
*Subject:* Re: Nashville

From a few days ago. Obviously, centralizing lots of SS7/PSTN stuff all in
one place means a long recovery time when it's physically damaged. Something
to think about for entities that own and operate traditional telco COs and
their plans for disaster recovery.


Nv1

Here is the latest update:  6:46AM 12/27:

Work continues restoring service to the CRS routers in the Nashville
Central Office. One router remains out of service and the other is in
service with some links remaining out of service.

The working bridge will reconvene at 08:00 CT with the following action
plan:
Additional cabling added to the first portable generator to enable full
  load capabilities (08:00 CT)
Pigtails with camlocks installed for easy swap; investigate possibility to
  land generator on the emergency service board to give the site N+1 with a
  manual ability to choose any one (08:00 CT)
Check small power plants on floors 4 and 6 (08:00 CT)
Investigate water damage on 1st floor and energize if safe (08:00 CT)
Air handlers for floors 4, 5 and 6 (09:00 CT)
Complete all transport work
Turn up SS7
Turn up 911 service (approximately noon or after)
Turn up switching service
TDM Switching team will reconvene at 09:00 CT and the Signaling team will
reconvene at 11:00 CT on 12/27/2020.
DMS equipment on the 1st floor will be assessed for water damage.
Switching teams will monitor power and HVAC restoration and will begin
switch restoration as soon as the go ahead is provided by the power team.

Recovery Priorities:
1. 4th & 5th floors (Specific transport equipment needed to clear MTSO SS7
isolation & Datakit needed for Local Switch restoration). Transport SMEs
are currently working to turn up transport equipment.
2. 6th floor (ESINET Groomers)
3. 10th and 8th floors (N4E) – Trunks
4. 1st floor (DMS: DS1, 5E: DS3) - Local POTS
5. 1st floor (DMS: DS0, DS2 | 5E: DS6) – Trunks
6. 11th floor (DMS: 01T) – Trunks
7. 4th floor (STP and SCP with mates up in Donelson)

The next update will be issued at approximately 09:00 CT on December 27.



Nv2

As of 09:00 CT: Teams worked through the night to restore service and
improve conditions at the Nashville 2nd Ave Central Office. Since the
initial service impact, over 75% of the Out of Service Mobility Sites have
been restored. Certain call flows may be limited and should improve as
additional restoration activities complete.
The generator that is currently powering equipment on the 2nd and 3rd
floors was refueled and ran with no issues through the night. Overnight,
the batteries connected to it continued to charge. Teams have placed
additional power cables which, once connected, will allow the working
generator to better handle the load in the building. To accomplish this,
the generator will need to be shut down for 15-30 minutes this morning so
teams can connect the new cables to the system. The power team reports
they are still on target to restore power and cooling to the 5th and 6th
floors by approximately 12:00 CT. Also, a portable chiller will be
delivered this morning and strategically placed in case it is needed to
assist in cooling the office.
There is a Call Center at 333 Commerce in Nashville that does not have
network or phone services available. Corporate Real Estate (CRE) reports
there is some damage to that office, but the extent of the damage will not
be known until they can gain access to the site. Because of this, the
impacted Call Center has ceased operations until further notice.
DMS switching equipment on the 1st floor will be assessed for water
damage. Switching teams will monitor power and HVAC restoration. Equipment
power-ups will begin as soon as the go-ahead is provided by the power
team.
Two SatCOLTs remain positioned on the East and West sides of the NSVLTNMT
Central Office, providing critical communication for teams working
restoration efforts. There are 17 assets deployed in the field: 15 are on
air (the 2 at the CO and 13 supporting FN Customer Requests) and 2 are in
hot-standby for FN Customers where macro service recently recovered. There
is 1 asset staged at a deployment site in KY where macro service was
restored, and 8 additional assets are en route to Nashville today to
fulfill pending FN Customer requests. Incoming requests continue to be
triaged. Requests from areas where service appears to have been restored
are being held, while the others are being prioritized for dispatch.
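
The asset counts in the paragraph above can be cross-checked with a few
lines of Python. This is only a sketch: the variable names are
illustrative, and it assumes the staged and en-route assets are counted
separately from the 17 already deployed, which the update does not state
explicitly.

```python
# Counts copied from the 09:00 CT update; names are illustrative only.
on_air = {"central_office": 2, "firstnet_requests": 13}
hot_standby = 2
staged_ky = 1   # assumed to be outside the 17 deployed assets
en_route = 8    # assumed to be outside the 17 deployed assets

# Deployed = everything on air plus units held in hot-standby.
deployed = sum(on_air.values()) + hot_standby

# Total committed to the response under the assumptions above.
total_committed = deployed + staged_ky + en_route

print(f"on air: {sum(on_air.values())}, deployed: {deployed}, "
      f"committed: {total_committed}")
```

Under those assumptions the on-air and hot-standby figures add up to the
17 deployed assets quoted in the update.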

The next update will be issued at approximately 14:00 CT, unless there is
a significant change in status.



Nv3

AT&T Nashville update below, received at 3:35PM 12/27.

Since the initial service impact, over 95% of the Out of Service Mobility
Sites have been restored. Certain call flows may be limited and should
improve as additional restoration activities complete.

Electricians have installed the additional power cables from the
generator to the emergency bus. These new cables will allow the generator
to support more of the building's load. The portable chiller that was
requested has arrived on-site and is available to assist in cooling, if
needed. Generally speaking, there are four (4) phases of restoration per
floor (Air Handler restoral, Power restoral, Transport Equipment restoral,
and Switch/Application Equipment restoral). Teams report that Air Handlers
are up and running, and all power plants on floors 2 through 7 are
online. Given the significant progress made, floors 2 through 7 are ready
for technology turn up. Relative to Priority Transport related equipment,
approximately 90% of the elements have been turned up on floors 2 through
7. The Power team is currently working on Floors 8 through 11 (N4E). The
first floor is not accessible at this time. Once access is granted by
federal and local authorities, further assessment and restoration efforts
will begin.
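
The four restoration phases named above can be modeled as an ordered
sequence per floor. The sketch below assumes the phases are strictly
sequential in the order the update lists them (the update implies but
does not state this), and the names `Phase` and `next_phase` are
hypothetical, not anything AT&T uses.

```python
from enum import IntEnum


class Phase(IntEnum):
    """Restoration phases in the order listed in the update."""
    AIR_HANDLER = 1
    POWER = 2
    TRANSPORT = 3
    SWITCH_APPLICATION = 4


def next_phase(completed):
    """Return the first phase not yet completed, or None if all are done."""
    for phase in Phase:
        if phase not in completed:
            return phase
    return None


# Floors 2 through 7 per the update: air handlers and power plants are up,
# transport equipment is about 90% turned up, so it is not yet complete.
floors_2_to_7 = {Phase.AIR_HANDLER, Phase.POWER}
print(next_phase(floors_2_to_7).name)
```

With air handlers and power complete, the next outstanding phase for
floors 2 through 7 is transport, which matches the status reported above.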

The generator is currently running at approximately 50% of its capacity,
and alternative plans are being considered to handle the full load of the
building. Teams continue to work proactively in an effort to identify
potential issues and are actively engaged in restoring services and
repairing infrastructure.

AT&T Network Disaster Recovery (NDR) has eleven (11) SatCOLTs in service
(TN, AL, GA). Two (2) of the eleven (11) are deployed at the Nashville, TN
Central Office to provide coverage for the AT&T response teams as well as
FirstNet (FN) customers. One (1) COLT is in hot-standby (TN). Six (6) COLTs
are en route to deployment sites in TN and AL. Three (3) COLTs are being
demobilized in Alabama and coming back to Nashville for new assignments, and
five (5) additional COLTs are en route to the Nashville area to support
additional requests.

The next update will be issued at approximately 19:00 CT, unless there is
a significant change in status.



On Mon, Dec 28, 2020, 5:59 PM Sean Donelan < sean () donelan com > wrote:


AT&T's statement says nearly all services have been restored in Nashville
as of Monday, 5pm CST.

They are working on permanent repairs.

https://about.att.com/pages/disaster_relief/nashville.html


AT&T's Network Disaster Recovery group faces management questions nearly
every year to justify its budget. While no one wants disasters,
business continuity has to be part of the business. There are also mutual
aid agreements between companies, but I don't know how many were invoked
for this incident.

https://about.att.com/ecms/dam/pages/disaster_relief/NDR_edited_04.22.19.pdf


