This post is the second in a series that expands on the Top Trends in Data Center Energy Management. The first post covered The Must-Know Metrics for Data Center Energy Management. This installment explores how to put these metrics to use in data centers that have been consolidated and virtualized.
As noted in the previous article, refreshing IT equipment with newer, more power-efficient systems and consolidating and virtualizing servers to improve overall utilization can, paradoxically, make a facility’s Power Usage Effectiveness (PUE) rating look worse. PUE is the ratio of total facility energy to the energy consumed by IT assets, so a reduction in IT energy (the denominator in the PUE equation) causes PUE to increase when the power required for cooling, lights and the inherent inefficiencies in power distribution systems remains largely the same. For this reason, it is better to use a metric like McKinsey’s Corporate Average Datacenter Efficiency (CADE) in virtualized data centers, because CADE takes IT asset efficiency into account. Organizations that have refreshed, consolidated and/or virtualized servers invariably achieve a significant improvement in the CADE rating, with PUE getting either better or worse depending on the circumstances.
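To make the PUE effect concrete, the short Python sketch below computes PUE as total facility power divided by IT power, before and after a consolidation. The load figures (1,000 kW of IT load, 800 kW of overhead, a 40 percent reduction in IT load) are invented purely for illustration and do not come from any measured facility.

```python
def pue(it_kw: float, overhead_kw: float) -> float:
    """Power Usage Effectiveness: total facility power divided by IT power."""
    if it_kw <= 0:
        raise ValueError("IT load must be positive")
    return (it_kw + overhead_kw) / it_kw


# Hypothetical figures, chosen only to illustrate the effect.
OVERHEAD_KW = 800.0    # cooling, lighting, distribution losses (assumed unchanged)
IT_BEFORE_KW = 1000.0  # IT load before consolidation
IT_AFTER_KW = 600.0    # IT load after consolidation and virtualization

pue_before = pue(IT_BEFORE_KW, OVERHEAD_KW)
pue_after = pue(IT_AFTER_KW, OVERHEAD_KW)
total_before = IT_BEFORE_KW + OVERHEAD_KW
total_after = IT_AFTER_KW + OVERHEAD_KW

print(f"PUE before: {pue_before:.2f}, total draw: {total_before:.0f} kW")
print(f"PUE after:  {pue_after:.2f}, total draw: {total_after:.0f} kW")
```

With these assumed numbers the facility draws 400 kW less in total, yet PUE rises from 1.80 to about 2.33, which is why PUE alone can understate the benefit of consolidation.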
Consolidating and virtualizing servers within and among data centers is a proven technique for improving overall IT asset efficiency owing to a combination of greater economies of scale, easier management, and higher utilization (typically 10 percent for dedicated servers vs. 30 percent or more for virtualized servers). An increasingly important aspect of these efforts is the energy efficiency of the individual servers themselves, and the most efficient servers are the ones with the highest number of transactions per second per Watt (TPS/Watt). As also noted in the previous article, the PAR4 Efficiency Rating system used in the Underwriters Laboratories’ UL2640 standard is the most accurate means for IT managers to compare the transactional efficiency of legacy servers with newer ones, and newer models of servers with one another.
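As a rough illustration of the TPS/Watt comparison, the Python sketch below ranks a few server models by transactions per second per Watt. It is not the PAR4 or UL2640 test procedure, which is not reproduced here; the model names and the throughput and power figures are hypothetical placeholders for whatever measurements an IT manager has on hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerModel:
    name: str
    tps: float    # sustained transactions per second (hypothetical measurement)
    watts: float  # average power draw at that load (hypothetical measurement)

    @property
    def tps_per_watt(self) -> float:
        return self.tps / self.watts


def rank_by_efficiency(models):
    """Return models sorted from most to least transactions per second per Watt."""
    return sorted(models, key=lambda m: m.tps_per_watt, reverse=True)


# Hypothetical candidates: one legacy server and two possible replacements.
candidates = [
    ServerModel("legacy", tps=2000, watts=400),
    ServerModel("refresh-a", tps=6000, watts=300),
    ServerModel("refresh-b", tps=5000, watts=200),
]

ranked = rank_by_efficiency(candidates)
for model in ranked:
    print(f"{model.name}: {model.tps_per_watt:.1f} TPS/Watt")
```

Note that in this made-up data the model with the highest raw throughput (refresh-a) is not the most efficient one, which is the kind of distinction a per-Watt metric is meant to surface.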
In addition to a potentially significant improvement in IT asset efficiency with higher utilization of virtualized high-TPS/Watt servers, virtualization has another advantage when the infrastructure is load-balanced between or among multiple data centers. Spreading resources across two or more facilities satisfies disaster recovery needs while minimizing the risk of outgrowing any single data center. With every application able to run in either/any facility (something relatively easy to achieve in a virtualized or load-balanced environment), the workload can be shifted as needed to maintain service levels during normal operation, as well as to minimize disruptions during outages and scheduled maintenance (albeit at a potentially lower service level depending on the circumstances). A growing number of organizations will take full advantage of this configuration by continuously load-balancing all critical applications between or among data centers, and/or grouping applications by service level priorities.
Even with highly power-efficient and highly utilized servers, data centers inevitably waste a substantial amount of energy. The reason is that servers are deployed and configured for peak capacity, performance and reliability, usually at the expense of efficiency. Such an approach, while necessary from an individual application perspective, can unnecessarily increase capital and operational expenditures, and can result in finite resources (particularly power and space) being exhausted, thereby creating a situation where the organization might outgrow its data center(s). The ability to reduce both CapEx and OpEx has motivated organizations to consolidate at least some of their servers onto virtualized environments, and those with particularly aggressive efforts have achieved impressive results. AOL, for example, recently reported annual savings of $5 million from “decommissioning” about one-fourth of its servers worldwide, including $2.2 million in operating system licenses and $1.65 million in energy bills.
To achieve the greatest possible reduction in wasted energy, however, it is necessary to minimize the power consumed by mostly idle servers during periods of low application demand. Indeed, total server power consumption can be reduced by up to 50 percent by matching online capacity (measured in cluster size) to actual load in real time. Runbooks can be used to automate the steps involved in resizing clusters and/or de-/re-activating servers, whether on a predetermined schedule or dynamically in response to changing loads. These dynamic “stretchable” cluster configurations are the most energy-efficient way to support variable application demand, because the hardware currently running can be utilized at 70 to 80 percent, with capacity being added and removed as necessary.
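The sizing logic behind such a stretchable cluster can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's runbook: the function names, the per-server capacity of 1,000 TPS, and the sample load series are all assumptions. The cluster is left alone while utilization stays inside the 70 to 80 percent band and is otherwise resized toward the midpoint of that band.

```python
import math


def servers_needed(load_tps: float, server_capacity_tps: float,
                   target_utilization: float = 0.75, min_servers: int = 1) -> int:
    """Smallest number of active servers that keeps utilization at or below the target."""
    if server_capacity_tps <= 0 or not 0 < target_utilization <= 1:
        raise ValueError("capacity must be positive and target in (0, 1]")
    needed = math.ceil(load_tps / (server_capacity_tps * target_utilization))
    return max(min_servers, needed)


def resize(active: int, load_tps: float, server_capacity_tps: float,
           low: float = 0.70, high: float = 0.80) -> int:
    """Return the new cluster size; change nothing while utilization is inside the band."""
    if active < 1:
        raise ValueError("at least one server must be active")
    utilization = load_tps / (active * server_capacity_tps)
    if low <= utilization <= high:
        return active
    return servers_needed(load_tps, server_capacity_tps, (low + high) / 2)


# Hypothetical load samples (TPS) and per-server capacity.
CAPACITY_TPS = 1000.0
loads = [1500, 900, 3000, 7200, 4000]

active = 10  # start at a cluster sized for peak demand
sizes = []
for load in loads:
    active = resize(active, load, CAPACITY_TPS)
    sizes.append(active)

print("active servers per sample:", sizes)
print("server-intervals used:", sum(sizes), "vs fixed peak sizing:", 10 * len(loads))
```

Because servers come in whole units, very small clusters can sit below the band (900 TPS on two servers is 45 percent here). A production runbook would also need to account for the time servers take to reactivate, which this sketch ignores.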
The savings from matching server capacity to actual demand are not trivial. Both the US Department of Energy and Gartner have observed that the cost to power a typical server over its useful life can now exceed the original capital expenditure. Gartner also notes that it can cost over $50,000 annually to power a single rack of servers. So reducing the power consumed while servers are mostly idle or clusters are lightly utilized holds the potential to deliver significant savings while continuing to satisfy application performance objectives. Such dynamic management across multiple data centers can also increase application capacity beyond the original cluster allocation to better accommodate unforeseeable spikes in demand, thereby increasing both the performance and availability of critical applications.
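A back-of-the-envelope estimate of these costs multiplies average power draw by hours per year and by the electricity rate. The Python sketch below does this for a single rack; it is not Gartner's or the Department of Energy's method, and the 10 kW rack, the 8 kW share of cooling and distribution overhead, and the $0.12 per kWh rate are hypothetical inputs to be replaced with local figures.

```python
HOURS_PER_YEAR = 8760  # 24 hours x 365 days


def annual_energy_cost(avg_kw: float, usd_per_kwh: float) -> float:
    """Yearly electricity cost for a constant average draw, in US dollars."""
    return avg_kw * HOURS_PER_YEAR * usd_per_kwh


# Hypothetical inputs for one rack.
RACK_IT_KW = 10.0    # average IT draw with all servers always on
OVERHEAD_KW = 8.0    # cooling and distribution share, assumed unchanged
USD_PER_KWH = 0.12   # assumed electricity rate
IT_REDUCTION = 0.50  # upper bound on server power savings cited in the article

cost_before = annual_energy_cost(RACK_IT_KW + OVERHEAD_KW, USD_PER_KWH)
cost_after = annual_energy_cost(
    RACK_IT_KW * (1 - IT_REDUCTION) + OVERHEAD_KW, USD_PER_KWH
)
saving = cost_before - cost_after

print(f"before: ${cost_before:,.0f}/yr, after: ${cost_after:,.0f}/yr, "
      f"saving: ${saving:,.0f}/yr")
```

Holding the overhead constant is a deliberately conservative assumption, consistent with the earlier observation that cooling and distribution losses change little when IT load falls.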
Clemens Pfeiffer is the CTO of Power Assure and is a 25-year veteran of the software industry, where he has held leadership roles in process modeling and automation, software architecture and database design, and data center management and optimization technologies.