Data center disasters come in all shapes and sizes. Severe weather, temperature spikes and cooling failures, power outages, and equipment problems are all potential issues that keep data center managers awake at night.
Want to rest assured that you’re mitigating the risks of unplanned downtime, avoiding hundreds of thousands of dollars in losses per outage, and preserving your data center’s health? Learn how simple it can be to plan smarter and reduce the impact of different disasters on your data center with Data Center Infrastructure Management (DCIM) software.
Here are just a few examples of data center disasters and how DCIM could help to mitigate the risks or aid in the disaster recovery efforts:
In 2013, a failed software update at Microsoft caused heat to spike in the part of a data center that supported Hotmail and Outlook. The temperature rose so swiftly that the automated failover processes in the original prevention plan couldn’t be implemented. As a result, the Hotmail and Outlook services were offline for up to 16 hours. To mitigate a disaster like this with DCIM, you could use automated, real-time notifications for threshold violations that enable you to immediately identify hotspots and predict potential trouble areas.
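To make the idea of a threshold violation concrete, here is a minimal sketch in Python. It is not the API of any DCIM product: the `Reading` type, the `find_violations` function, and the sample cabinet names are all hypothetical, and the default 18 to 27 °C band is an assumed inlet temperature range that you would replace with your own thresholds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """One inlet temperature sample for a cabinet (hypothetical data shape)."""
    cabinet: str
    inlet_temp_c: float


def find_violations(readings, low_c=18.0, high_c=27.0):
    """Return the readings whose inlet temperature is outside [low_c, high_c].

    The default band is an assumption for illustration; use the thresholds
    that apply to your own equipment.
    """
    return [r for r in readings if not (low_c <= r.inlet_temp_c <= high_c)]


# Sample data, invented for this example.
readings = [
    Reading("A01", 22.5),
    Reading("A02", 31.0),
    Reading("B07", 26.9),
]

for r in find_violations(readings):
    print(f"ALERT: cabinet {r.cabinet} inlet at {r.inlet_temp_c:.1f} C")
```

In a real deployment, the readings would come from rack sensors and the alert would go to email, SMS, or a ticketing system rather than the console.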
In 2010, Wikipedia’s disaster prevention plan for a loss of cooling was a backup data center with a failover procedure. A cooling issue in its European data center led to heat conditions that caused servers to shut down. When Wikipedia tried to switch from its European data center to its Tampa data center, the failover failed, and the site went down. To mitigate disasters like these, DCIM allows environmental monitoring through patented cooling charts that help you keep cabinets in manufacturer-recommended or ASHRAE® allowable environmental ranges.
In 2012, Hurricane Sandy swept the United States East Coast. Cogeco Peer 1’s disaster prevention plan relied on backup generators. Unfortunately, the emergency generator fuel pumping system was knocked out, and the generators began to run out of their limited supply of fuel oil. The company had to conduct a planned shutdown and mobilize a bucket brigade to carry fuel to the generators. DCIM could aid the response to disasters like these through remote power control of outlets, IT devices, device groups, and racks, along with agentless graceful operating system shutdown to safeguard your equipment.
In 2013, DreamHost’s disaster prevention plan for power outages centered on emergency backup generators. That year, the company experienced a power failure that lasted only a few minutes but led to issues in the network system that took hours to recover from. The DreamHost power outage impacted more than 350,000 customers and 1.2 million blogs, websites, and apps. Additionally, the UPS system failed suddenly, the emergency backup generators failed to start properly, and a second power failure not long after led to intense periods of reboots, restores, and system checks. Following many hours of recovery, along with the loss of several critical pieces of networking hardware that did not survive the event, the company had to run on generators until it had the UPS issues fully resolved. For disasters like these, DCIM could help with reporting and planning tools that enable what-if analyses without impacting equipment in use to ensure that you have enough capacity to continue operating in a failover situation.
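A what-if failover analysis can be as simple as asking whether the remaining power capacity still covers the load when any one unit is lost. The Python sketch below illustrates that check under stated assumptions: the function name `survives_single_failure` and the capacity and load figures are invented for the example, and it does not represent how any particular DCIM tool performs its analysis.

```python
def survives_single_failure(unit_capacities_kw, load_kw):
    """For each unit, report whether the other units could carry load_kw alone.

    Returns a dict mapping the unit's index to True (load still covered if
    that unit fails) or False (capacity shortfall).
    """
    total_kw = sum(unit_capacities_kw)
    return {
        index: (total_kw - capacity_kw) >= load_kw
        for index, capacity_kw in enumerate(unit_capacities_kw)
    }


# Hypothetical UPS capacities in kW, invented for this example.
ups_kw = [100.0, 100.0, 150.0]

print(survives_single_failure(ups_kw, load_kw=180.0))
print(survives_single_failure(ups_kw, load_kw=220.0))
```

With these sample numbers, a 180 kW load is covered no matter which unit fails, while a 220 kW load is not covered if the 150 kW unit is lost. Because the check runs on recorded capacities rather than live equipment, it can be repeated for any planned change without touching production systems.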
Want to learn more? Check out the infographic below: