Must-Know Metrics for Data Center Energy Management

by Clemens Pfeiffer | Apr 3, 2012


The focus on energy consumption in data centers is requiring facility and IT managers to adopt some key metrics for monitoring and managing power. The most common means today for assessing data center energy efficiency is the Power Usage Effectiveness (PUE) rating system. PUE is simply the ratio of the total power or energy consumed by the facility (the numerator) to the power or energy used by the IT equipment (the denominator). Version 2 of the rating system (PUE 2) permits measuring either power (in kilowatts) or energy (in kilowatt-hours), with a strong preference for the latter. Another common rating system is Data Center Infrastructure Efficiency (DCiE), which is the reciprocal of PUE: the ratio of the power used by the IT equipment to the total power consumed.
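As a minimal sketch of the two ratios just described, the Python below computes PUE and DCiE from a pair of readings. The function names and the sample figures are illustrative assumptions, not part of either rating system; the inputs can be power (kW) or energy (kWh) as long as both use the same unit.

```python
def pue(total_facility: float, it_equipment: float) -> float:
    """Power Usage Effectiveness: total facility draw divided by IT draw."""
    if it_equipment <= 0:
        raise ValueError("IT equipment power/energy must be positive")
    return total_facility / it_equipment


def dcie(total_facility: float, it_equipment: float) -> float:
    """Data Center Infrastructure Efficiency: the reciprocal of PUE."""
    if total_facility <= 0:
        raise ValueError("total facility power/energy must be positive")
    return it_equipment / total_facility


# Hypothetical readings: 1,000 kW at the utility meter, 500 kW at the IT load.
example_pue = pue(1000.0, 500.0)
example_dcie = dcie(1000.0, 500.0)
print(f"PUE = {example_pue:.2f}, DCiE = {example_dcie:.2f}")
```

Because the two metrics are reciprocals, either one can be derived from the other; which one gets reported is largely a matter of convention.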

The typical data center today achieves a PUE rating of about 2.0; in other words, only half of the total energy consumed is used by the IT equipment (servers, storage and networking infrastructure), with the other half going to cooling, lighting and the inherent inefficiencies in power distribution systems. The EPA has set a target for data centers in the US of a PUE rating between 1.1 and 1.4 (or a DCiE of roughly 0.9 to 0.7). Achieving the EPA’s energy efficiency target will likely require a range of initiatives in most data centers, including:

  • Right-sizing the UPS and power distribution equipment to minimize inefficiencies, including eliminating unnecessary AC/DC conversion and adding co-generation capabilities
  • Eliminating cooling inefficiencies, upgrading the Computer Room A/C system to allow for variable cooling and/or making greater use of outside air for cooling
  • Adopting a hot/cold aisle configuration (which will involve optimizing the placement of servers in racks and rows to balance heat generation) and increasing cold aisle server inlet temperatures to 80.6°F (27°C) as recommended by the American Society of Heating, Refrigerating and Air-Conditioning Engineers (ASHRAE)
  • Consolidating and virtualizing servers to improve overall utilization
  • Refreshing IT equipment to newer, more power-efficient systems

Making these last two improvements can, however, produce a rather surprising and seemingly counter-productive result: an increase in PUE. The reason is that neither PUE nor DCiE takes into account the energy efficiency of the IT equipment itself. So an improvement in IT asset efficiency (the denominator in the PUE equation) can actually cause PUE to increase because less power is now going to the servers, while the overhead may remain largely the same. But these worthy efforts should not be punished with such “false negative” ratings.
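A short worked example may make this effect concrete. The figures below are hypothetical and chosen only for illustration: a server refresh cuts the IT load from 500 kW to 400 kW, while cooling and distribution overhead falls only slightly, from 500 kW to 480 kW.

```python
def pue(it_kw: float, overhead_kw: float) -> float:
    """PUE expressed as (IT load + overhead) / IT load."""
    return (it_kw + overhead_kw) / it_kw


# Hypothetical facility before and after consolidating and refreshing servers.
before_it_kw, before_overhead_kw = 500.0, 500.0
after_it_kw, after_overhead_kw = 400.0, 480.0

pue_before = pue(before_it_kw, before_overhead_kw)
pue_after = pue(after_it_kw, after_overhead_kw)

total_before_kw = before_it_kw + before_overhead_kw
total_after_kw = after_it_kw + after_overhead_kw
saved_kw = total_before_kw - total_after_kw

print(f"PUE before: {pue_before:.2f}, PUE after: {pue_after:.2f}")
print(f"Total draw fell by {saved_kw:.0f} kW")
```

In this sketch the facility draws 120 kW less overall, yet its PUE rises from 2.0 to 2.2, which is exactly the misleading signal described above.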

To remedy this shortcoming, both McKinsey and Gartner have created rating systems that take into account IT energy efficiency: Corporate Average Datacenter Efficiency (CADE) and Power to Performance Effectiveness (PPE), respectively. Both systems are designed to help IT managers address the primary source of waste in data centers today: underutilized servers. CADE, for example, is the product of facility efficiency (in effect, DCiE) and IT asset efficiency, the latter being derived from a combination of the utilization and efficiency of the IT equipment. Organizations that have refreshed, consolidated and/or virtualized servers invariably achieve a significant improvement in the CADE rating, with PUE and DCiE getting either better or worse depending on the circumstances.
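The product structure of CADE can be sketched as follows. This is a simplification under stated assumptions, not McKinsey's full definition: facility efficiency is taken to be DCiE, and IT asset efficiency is assumed to be the product of average server utilization and an IT energy efficiency factor. The function name and all input values are hypothetical.

```python
def cade(dcie: float, it_utilization: float, it_energy_efficiency: float) -> float:
    """Simplified CADE: facility efficiency times IT asset efficiency.

    Assumption: IT asset efficiency = utilization * IT energy efficiency.
    All three inputs are fractions between 0 and 1.
    """
    for value in (dcie, it_utilization, it_energy_efficiency):
        if not 0.0 <= value <= 1.0:
            raise ValueError("inputs must be fractions between 0 and 1")
    it_asset_efficiency = it_utilization * it_energy_efficiency
    return dcie * it_asset_efficiency


# Hypothetical case: virtualization lifts utilization from 10% to 30%,
# while PUE worsens from 2.0 to 2.2 (DCiE falls from 0.50 to about 0.45).
cade_before = cade(dcie=1 / 2.0, it_utilization=0.10, it_energy_efficiency=0.5)
cade_after = cade(dcie=1 / 2.2, it_utilization=0.30, it_energy_efficiency=0.5)

print(f"CADE before: {cade_before:.3f}, CADE after: {cade_after:.3f}")
```

Under these assumed inputs the CADE score improves even though DCiE gets worse, because the gain in utilization outweighs the loss in facility efficiency.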

With average server utilization in the US around 10 percent or less for dedicated servers, and in the range of 20 percent to 30 percent for most virtualized environments, one way to improve CADE and PPE ratings is to utilize more energy-efficient servers. Organizations routinely refresh aging servers to take advantage of the performance gains predicted by Moore’s Law. But determining which servers to refresh, and when, can be a challenge. Newer servers inevitably offer superior price/performance, but can their total cost of ownership (including power consumption as a major operating expense) be justified? And if so, which old servers should be replaced first, and by which model(s) of new servers?

To help IT managers make such choices, the EPA created an EnergyStar Rating System for servers and other IT equipment. But EnergyStar has a fundamental flaw similar to the one with the PUE and DCiE rating systems: It focuses on the power supply, ignoring the transactional efficiency of the server itself, and it does not factor in the year/age of the equipment.

To address this shortcoming, Underwriters Laboratories (UL) created a new performance standard based on the PAR4 Efficiency Rating. PAR4 provides an accurate method for determining both absolute and normalized (over time) energy efficiency of both new and existing equipment on a transactions per second per watt basis. To calculate server performance using the UL2640 standard, a series of standardized tests is performed, including a Power On Spike Test, a Boot Cycle Test and a Benchmark Test. These test results can then be used to determine idle and peak power consumption, along with transactions/second/watt and other useful annualized ratings. Together these metrics provide a very accurate means for IT managers to compare legacy servers with newer ones, and newer servers with one another.
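The comparison this enables can be illustrated with a rough sketch. The code below is not the UL2640 test procedure; it only shows how a transactions/second/watt figure and an annualized idle energy figure might be derived once benchmark throughput and power readings are in hand. The class, field names and server figures are all hypothetical.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 8760


@dataclass
class ServerMeasurement:
    name: str
    transactions_per_second: float  # throughput measured under benchmark load
    loaded_watts: float             # average power draw during the benchmark
    idle_watts: float               # average power draw while idle

    def tps_per_watt(self) -> float:
        """Transactions per second delivered per watt under load."""
        return self.transactions_per_second / self.loaded_watts

    def idle_kwh_per_year(self) -> float:
        """Energy consumed in a year if the server sat idle the whole time."""
        return self.idle_watts * HOURS_PER_YEAR / 1000.0


# Hypothetical measurements for an older server and a candidate replacement.
legacy = ServerMeasurement("legacy", 2000.0, 400.0, 250.0)
modern = ServerMeasurement("modern", 9000.0, 300.0, 100.0)

for server in (legacy, modern):
    print(f"{server.name}: {server.tps_per_watt():.1f} tps/W, "
          f"{server.idle_kwh_per_year():.0f} kWh/year at idle")
```

Normalizing throughput by power in this way lets servers of different generations be ranked on a common scale, which is the kind of comparison the paragraph above describes.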

It is important to note that to improve energy efficiency ratings, vendors are further reducing the idle power IT equipment consumes. This has served to make the spread between idle and loaded power consumption much wider, and will have the effect of causing more power spikes in data centers during periods of high application demand.

To obtain meaningful measurements, regardless of the metric(s) used, it is important to collect data often. Frequent measurements are necessary to capture changes in power utilization at different times of the year, day and hour and, most importantly, during periods of peak and low demand. Fortunately, the more traditional measurements like power and temperature are now provided directly by newer servers. It is also important for all aspects of power monitoring and management to be pervasive across the IT and facilities organizations, and to be consistent among multiple data centers. To achieve the best results, these ongoing efforts will require an unprecedented level of cooperation between IT and facility managers.
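To illustrate why sampling frequency matters, the sketch below computes an energy-based PUE from a series of power readings and contrasts it with the instantaneous PUE at individual sample times. The sample data, the six-hour interval and the use of trapezoidal integration are assumptions made for the example; a real deployment would sample far more often.

```python
from typing import List, Tuple

# Each sample: (time in hours, total facility kW, IT equipment kW).
Sample = Tuple[float, float, float]


def energy_kwh(times: List[float], powers_kw: List[float]) -> float:
    """Integrate power over time with the trapezoidal rule to get kWh."""
    total = 0.0
    for i in range(1, len(times)):
        interval_hours = times[i] - times[i - 1]
        total += (powers_kw[i] + powers_kw[i - 1]) / 2.0 * interval_hours
    return total


def energy_based_pue(samples: List[Sample]) -> float:
    """PUE computed from energy (kWh) over the whole sampling window."""
    times = [s[0] for s in samples]
    total_kwh = energy_kwh(times, [s[1] for s in samples])
    it_kwh = energy_kwh(times, [s[2] for s in samples])
    return total_kwh / it_kwh


# Hypothetical readings over one day, with a demand peak at midday.
samples: List[Sample] = [
    (0.0, 900.0, 400.0),
    (6.0, 1000.0, 500.0),
    (12.0, 1300.0, 700.0),
    (18.0, 1100.0, 600.0),
    (24.0, 900.0, 400.0),
]

daily_pue = energy_based_pue(samples)
instantaneous = [total / it for _, total, it in samples]

print(f"Energy-based PUE for the day: {daily_pue:.3f}")
print(f"Instantaneous PUE range: {min(instantaneous):.3f} to {max(instantaneous):.3f}")
```

In this made-up data the instantaneous PUE swings between low-demand and peak periods, so a single spot reading could misrepresent the day; integrating many readings gives a steadier figure.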

Clemens Pfeiffer is the CTO of Power Assure and is a 25-year veteran of the software industry, where he has held leadership roles in process modeling and automation, software architecture and database design, and data center management and optimization technologies.
