nanog mailing list archives

Re: What to expect after a cooling failure


From: Bryan Tong <contact () nullivex com>
Date: Tue, 9 Jul 2013 23:30:25 -0600

Honestly, I think your hardware will be fine. Like everyone else said,
keep an eye on your hard drives; they are by far the most sensitive.
For anything not mechanical, if it didn't melt, you're good.

One data center we had equipment in sat at 153°F (about 67°C) for about
a week, and all we saw were drive failures, and even those were fairly
sparse: about 1 out of 10, I would say.

Thanks


On Tue, Jul 9, 2013 at 11:07 PM, Jimmy Hess <mysidia () gmail com> wrote:

On 7/9/13, Erik Levinson <erik.levinson () uberflip com> wrote:
For those who have gone through such events in the past, what can one
expect in terms of long-term impact... should we expect some premature
component failures? Does anyone have any stats to share?

Realistically... you had a single short-lived stress event. There are
likely to be some number of random component failures in the future.
It is unlikely that you will be able to attribute those failures to
such a short-lived stress event of that magnitude, though there might
on average be a small increase over normal failure rates.

The bigger concern may be that /a lot of different components/ could
have been subject to the same kind of abuse at the same time,
including sets of components that are supposed to be in a redundant
pair and not fail simultaneously.

I wouldn't necessarily be so concerned about premature failures. I
would be more concerned that you may have redundant components that
were exposed to the same stress event at the same time. The assumption
that their chances of failure are independent becomes more
questionable: the chance of a correlated failure in the future might
be greatly increased, reducing the level of effective redundancy/risk
reduction you have today.

That would apply mainly to mechanical devices such as HDDs.
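
To put rough numbers on the independence point above, here is a
minimal sketch in Python. Everything in it is an assumption made up for
illustration, not measured data: the failure probabilities, the chance
that the event actually damaged the pair, and the simple model in which
one shared event either weakened both drives of a redundant pair or
neither, with the two drives failing independently once that is known.

```python
# Illustrative sketch only: every number below is an assumption, not a statistic.

def marginal_failure(p_normal, p_stressed, q_damaged):
    """Chance that one drive fails, averaged over whether the event damaged the pair."""
    return q_damaged * p_stressed + (1 - q_damaged) * p_normal

def pair_failure_if_independent(p_single):
    """Chance both drives fail if the two failures are treated as independent."""
    return p_single ** 2

def pair_failure_shared_stress(p_normal, p_stressed, q_damaged):
    """Chance both drives fail when one shared event damaged both drives or neither."""
    return q_damaged * p_stressed ** 2 + (1 - q_damaged) * p_normal ** 2

# Hypothetical inputs for a single mirrored pair over some fixed period.
P_NORMAL = 0.03    # assumed failure chance for an undamaged drive
P_STRESSED = 0.20  # assumed failure chance for a heat-damaged drive
Q_DAMAGED = 0.25   # assumed chance the event damaged this pair

p_single = marginal_failure(P_NORMAL, P_STRESSED, Q_DAMAGED)
naive = pair_failure_if_independent(p_single)
actual = pair_failure_shared_stress(P_NORMAL, P_STRESSED, Q_DAMAGED)

print(f"single-drive failure chance:       {p_single:.4f}")
print(f"pair loss, assuming independence:  {naive:.6f}")
print(f"pair loss, shared stress event:    {actual:.6f}")
```

With these assumed inputs, the chance of losing both drives comes out
roughly twice what the independence assumption predicts, even though
each individual drive looks the same on its own. The exact ratio
depends entirely on the made-up numbers; the direction of the effect is
the point.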


Thanks
--
-JH




-- 
--------------------
Bryan Tong
Nullivex LLC | eSited LLC
(507) 298-1624

