3 Reasons Why You Need Data Center Health Monitoring for Intelligent PDUs
September 26, 2018

When you consider key use cases for data center power management, one of the first things you might think of is energy efficiency. After all, the more you can control your data center energy consumption, the more money you can save, the better you can utilize your existing resources, and the more strictly you can adhere to industry guidelines for safe environments for your IT equipment or for green data center initiatives. That’s why data center monitoring combined with the ability to set thresholds and send traps, alerts, and notifications is so useful for today’s data center managers.

But what if your intelligent PDUs can’t send traps? What if there is a network or power outage? Network and power failures were two of the leading causes of data center outages in 2018, and they can happen at any time.

Without connectivity to the devices in your data center, understanding the status of your intelligent PDUs is difficult. Tools like Data Center Infrastructure Management (DCIM) software can help you address situations like these through health monitoring of the intelligent PDUs and other devices in your data center.

What Are Different Types of Data Center Health Monitoring?

DCIM software monitors the health and status of the intelligent PDUs in your data center by polling them using a protocol like SNMP or ICMP and collecting data from them.

Most DCIM tools support standard data polling: you set a polling interval (such as five minutes), and your data center software polls the intelligent PDUs at that interval, then collects and stores the data. Intelligent PDUs (iPDUs) that have memory in the physical unit may be able to store data on the unit itself, so your DCIM software can poll them less often, which reduces network traffic. This feature can also be useful if there is an outage or if you lose connectivity to an iPDU, as you can collect the data later if necessary.
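
To make the idea of a polling interval concrete, here is a minimal Python sketch of a standard data polling loop. It is an illustration under stated assumptions, not how any particular DCIM product works: `read_pdu` is a hypothetical callable standing in for whatever SNMP query your software performs, and `run_data_polling`, `poll_once`, and the in-memory `store` list are names invented for this example.

```python
import time
from typing import Callable, Dict, List, Tuple

# One stored sample: (timestamp, PDU address, readings such as {"amps": 4.2}).
Sample = Tuple[float, str, Dict[str, float]]


def poll_once(
    pdus: List[str],
    read_pdu: Callable[[str], Dict[str, float]],
    store: List[Sample],
    now: float,
) -> None:
    """Read every PDU once and append whatever was collected to the store."""
    for pdu in pdus:
        try:
            store.append((now, pdu, read_pdu(pdu)))
        except OSError:
            # Assumption: read_pdu raises OSError when the PDU does not answer.
            # Skip it and try again on the next cycle.
            continue


def run_data_polling(
    pdus: List[str],
    read_pdu: Callable[[str], Dict[str, float]],
    store: List[Sample],
    interval_s: float = 300.0,  # five minutes, as in the example above
    cycles: int = 1,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.time,
) -> None:
    """Poll all PDUs `cycles` times, waiting `interval_s` between cycles."""
    for cycle in range(cycles):
        poll_once(pdus, read_pdu, store, clock())
        if cycle < cycles - 1:
            sleep(interval_s)
```

Passing `sleep` and `clock` as parameters is only a convenience so the loop can be exercised without waiting five minutes between cycles.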

A comprehensive DCIM solution will also have health polling in addition to standard data polling. Health polling can give you information on PDU health more quickly than standard data polling alone by pinging your PDUs at more frequent intervals to confirm they are reachable on the network. DCIM software can use this information to update an intelligent PDU’s health status and to create events when that status changes.
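
One way to picture "create events when the status of a PDU changes" is a small tracker that remembers the last known state of each PDU and records an event only on a transition. The Python sketch below is hypothetical: the `HealthTracker` class, the "good" and "unreachable" labels, and the event text are assumptions made for illustration, not the data model of any specific DCIM tool.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HealthTracker:
    """Remember each PDU's last health status and log transitions as events."""

    status: Dict[str, str] = field(default_factory=dict)
    events: List[str] = field(default_factory=list)

    def record(self, pdu: str, reachable: bool) -> None:
        new = "good" if reachable else "unreachable"
        old = self.status.get(pdu)
        # The first observation only sets a baseline; later changes create events.
        if old is not None and old != new:
            self.events.append(f"{pdu}: {old} -> {new}")
        self.status[pdu] = new


tracker = HealthTracker()
for result in (True, True, False, False, True):
    tracker.record("pdu-a", result)
# tracker.events == ["pdu-a: good -> unreachable", "pdu-a: unreachable -> good"]
```

Recording only transitions keeps a PDU that stays down from generating a new event on every health poll.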

Why Do You Need Data Center Health Polling?

When combined with DCIM software, health polling enables you to check for the following:

  1. Power to your intelligent PDUs. Health polling can help you test that your intelligent PDUs are powered. If there’s no power to the PDU, it won’t respond when polled.
  2. Network connectivity. Health polling can help you determine if there is an outage by checking that PDUs are reachable on the network.
  3. PDU communication. Health polling can help you confirm that the communication module on the intelligent PDU is functioning correctly, since a working module responds to the ping.
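
All three checks come down to the same question: did the PDU answer? The Python sketch below shows one possible reachability check. It assumes a Unix-like host where the system `ping` command accepts `-c 1` to send a single echo request (Windows uses `-n` instead); `is_reachable` is a name chosen for this example, and the `run` parameter exists only so the check can be swapped out in testing. Note that a failed ping on its own does not say which of the three causes is responsible, so the result is a prompt to investigate.

```python
import subprocess
from typing import Callable


def is_reachable(
    host: str,
    timeout_s: float = 2.0,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> bool:
    """Return True if `host` answers a single ICMP echo request in time."""
    try:
        result = run(
            ["ping", "-c", "1", host],  # assumption: Unix-like ping syntax
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        # No reply within the timeout, or the ping command is unavailable.
        return False
    # ping exits with status 0 when it received a reply.
    return result.returncode == 0
```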

For example, say you have 500 racks in your data center, with each rack having two iPDUs set up for redundancy. If you lost one of those PDUs, the rack might stay powered, so no alarm would be triggered even though you have lost redundancy. With health polling, your DCIM software could ping every iPDU in your data center at one-minute intervals. You would then be alerted that the PDU is unavailable and could take immediate action to address the situation.
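
The scenario above can be sketched in a few lines of Python. With 500 racks and two iPDUs each, a one-minute health poll touches 1,000 devices per sweep. The function below flags racks where exactly one of the two PDUs still answers, which is the "still powered, but no longer redundant" case. The rack and PDU names, the `racks_without_redundancy` function, and the `is_reachable` callable are all hypothetical and used only for illustration.

```python
from typing import Callable, Dict, List, Tuple


def racks_without_redundancy(
    racks: Dict[str, Tuple[str, str]],
    is_reachable: Callable[[str], bool],
) -> List[str]:
    """Return racks where exactly one of the two PDUs is still reachable."""
    degraded = []
    for rack, pdus in racks.items():
        reachable_count = sum(1 for pdu in pdus if is_reachable(pdu))
        if reachable_count == 1:
            degraded.append(rack)
    return degraded


# 500 racks with an A-side and a B-side iPDU each: 1,000 devices per sweep.
racks = {
    f"rack-{n:03d}": (f"pdu-{n:03d}-a", f"pdu-{n:03d}-b")
    for n in range(1, 501)
}

# Simulate one failed B-side PDU instead of pinging real hardware.
down = {"pdu-042-b"}
print(racks_without_redundancy(racks, lambda pdu: pdu not in down))
# prints ['rack-042']
```

Racks where both PDUs are unreachable are deliberately left out here, since that is a full loss of power or connectivity that other alarms are more likely to catch.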

Data center monitoring practices that use both data polling and health polling can be effective for keeping an eye on the intelligent PDUs in your environment. The combination simplifies data center power monitoring by automatically checking power, network connectivity, and communication for your iPDUs, so you are alerted to issues quickly, before they grow into larger problems.

The next time you are configuring the polling interval in your DCIM software, consider how this belt-and-suspenders approach could provide an additional layer of protection for the health of your data center and ultimately help you ensure uptime and availability.  

