nanog mailing list archives
Re: Data Center testing
From: Warren Kumari <warren () kumari net>
Date: Wed, 26 Aug 2009 17:39:57 -0400
On Aug 24, 2009, at 9:38 AM, Dan Snyder wrote:
We have done power tests before and had no problem. I guess I am looking for someone who does testing of the network equipment outside of just power tests. We had an outage due to a configuration mistake that became apparent when a switch failed.
So, one of the better ways to make sure that your failover system is working when you need it is just to do away with the concept of a failover system and make your "failover" system be part of your "primary" system.
This means that your failover system is always passing traffic and you know that it is alive and well -- it also helps mitigate the pain when a device fails (you are sharing the load over both systems and so only half as much traffic gets disrupted). Scheduled maintenance is also simpler and less stressful as you already know that your other path is alive and well.
Your design and use case dictate how exactly you implement this, but in general it involves things like tuning your IGP so you are using all your links, staggering VLANs if you rely on them, multiple VRRP groups per subnet, etc.
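As a rough illustration of the "multiple VRRP groups per subnet" idea, here is a minimal Python sketch that plans staggered priorities so that each router is master for a share of the groups and backup for the rest. The router names, addresses, priority values, and the helper names (`VrrpGroup`, `stagger_groups`) are all made up for the example; it only produces a plan and does not configure any device.

```python
from dataclasses import dataclass


@dataclass
class VrrpGroup:
    vrid: int          # VRRP virtual router ID
    virtual_ip: str    # gateway address that hosts point at
    priorities: dict   # router name -> VRRP priority (higher wins)

    def master(self) -> str:
        """Return the router that wins the election when all are up."""
        return max(self.priorities, key=self.priorities.get)


def stagger_groups(routers, virtual_ips, high=150, low=100):
    """Rotate the preferred master across groups (hypothetical helper).

    Group 1 prefers routers[0], group 2 prefers routers[1], and so on,
    so every router carries traffic in normal operation.
    """
    groups = []
    for index, vip in enumerate(virtual_ips):
        preferred = routers[index % len(routers)]
        priorities = {r: (high if r == preferred else low) for r in routers}
        groups.append(VrrpGroup(vrid=index + 1, virtual_ip=vip,
                                priorities=priorities))
    return groups


if __name__ == "__main__":
    # Example addresses from the 192.0.2.0/24 documentation range.
    plan = stagger_groups(["rtr-a", "rtr-b"], ["192.0.2.1", "192.0.2.2"])
    for group in plan:
        print(group.vrid, group.virtual_ip, "master:", group.master())
```

For this to actually share load, hosts on the subnet would need to be split between the two virtual gateway addresses (for example via DHCP scopes); that part is outside the sketch.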
This does require a tiny bit more planning during the design phase, and also requires that you check every now and then to make sure that you are actually using both devices (and didn't, for example, shift traffic to one device and then forget to shift it back :-)). It also requires that you keep capacity issues in mind -- in a primary and failover scenario you might be able to run devices fairly close to capacity, but if you are sharing the load you need to keep things under 50% (so when you *do* have a failure the remaining device can handle the full load) -- it's important to make this clear to the finance folks before going down this path :-)
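The two periodic checks described above (are both devices really carrying traffic, and would the survivor cope with the full load) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the devices are identical, a failed device's traffic lands entirely on the survivors, and the load figures come from whatever monitoring you already have. The function names and the 25% balance threshold are arbitrary choices for the example.

```python
def survives_single_failure(loads, capacity):
    """True if the total load still fits after losing any one device.

    loads:    current load per device, in the same unit as capacity
    capacity: capacity of one device (devices assumed identical)
    """
    if len(loads) < 2:
        raise ValueError("need at least two devices to share load")
    return sum(loads) <= capacity * (len(loads) - 1)


def max_safe_utilization(device_count):
    """Highest average utilization that tolerates one failure.

    With two devices this is 0.5, i.e. the 50% rule of thumb.
    """
    if device_count < 2:
        raise ValueError("need at least two devices to share load")
    return (device_count - 1) / device_count


def is_balanced(loads, min_share=0.25):
    """True if every device carries at least min_share of the traffic.

    Flags the case where traffic was shifted to one device and never
    shifted back. The 0.25 threshold is an arbitrary example value.
    """
    total = sum(loads)
    if total == 0:
        return False
    return min(loads) / total >= min_share


if __name__ == "__main__":
    loads_gbps = [4.0, 4.5]   # made-up measurements
    print(survives_single_failure(loads_gbps, capacity=10.0))
    print(max_safe_utilization(2))
    print(is_balanced(loads_gbps))
```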
W
It didn't cause a problem however when we did a power test for the whole data center.

-Dan

On Mon, Aug 24, 2009 at 9:31 AM, Ken Gilmour <ken.gilmour () gmail com> wrote:

I know Peer1 in Vancouver regularly send out notifications of "non-impacting" generator load testing, like monthly. Also InterXion in Dublin, Ireland have occasionally sent me notification that there was a power outage of less than a minute however their backup successfully took the load. I only remember one complete outage in Peer1 a few years ago... Never seen any outage in InterXion Dublin. Also I don't ever remember any power failure at AiNet (Deepak will probably elaborate)

2009/8/24 Dan Snyder <sliplever () gmail com>:

Does anyone know of any data centers that do failure testing of their networking equipment regularly? I mean to verify that everything fails over properly after changes have been made over time. Are there any best practice guides for doing this?

Thanks,
Dan
--
"Does Emacs have the Buddha nature? Why not? It has bloody well everything else!"