nanog mailing list archives
Re: "Hypothetical" Datacenter Overheating
From: Warren Kumari <warren () kumari net>
Date: Tue, 16 Jan 2024 08:37:09 -0800
On Mon, Jan 15, 2024 at 9:55 AM, William Herrin <bill () herrin us> wrote:
> On Mon, Jan 15, 2024 at 6:08 AM Mike Hammett <nanog () ics-il net> wrote:
>> Let's say that hypothetically, a datacenter you're in had a cooling failure and escalated to an average of 120 degrees before mitigations started having an effect. What should be expected in the aftermath?
>
> Hi Mike,
>
> A decade or so ago I maintained a computer room with a single air conditioner because the boss wouldn't go for n+1. It failed in exactly this manner several times.

And in the early 2000s I worked at a (very crappy) ISP/Colo provider which had their primary location in a small, brick garage. It *did* have redundant AC, in the form of two large window units, stuck into a hole which had been hacked through the brick wall. They were redundant: there were two of them, and they were on separate circuits. What more could you ask for?!

At 2AM one morning I'm awakened from my slumber by a warning page from the monitoring system (WhatsUp Gold. Remember WhatsUp Gold?) letting me know that the temperature is out of range. This is a fairly common occurrence, so I ack it and go back to sleep. A short while later I'm awakened again, and this time it's a critical alert and the temperature is really high. So, I grumble, get dressed, and drive over to the location.

I open the door, and, yes, it really *is* hot. This is because the AC units have been vibrating over the years, and the entire row of bricks above has popped out. There is now an even larger hole in the wall, and both AC units are lying outside, still running.

'Twas not a good day….

W

> After the overheat was detected by the monitoring system, it would be brought under control with a combination of a spot cooler and powering down to a minimal configuration. But of course it takes time to get people there and set up the mitigations, during which the heat continues to rise.
>
> The main thing I noticed was a modest uptick in spinning drive failures for the couple months that followed. If there was any other consequence it was at a rate where I'd have had to be carefully measuring before and after to detect it.
>
> Regards,
> Bill Herrin
>
> --
> William Herrin
> bill () herrin us
> https://bill.herrin.us/