
Understanding what you know – and don’t know – about power usage in your data centre
Background
Raritan, a leading provider of data centre equipment, and PTS, a well-respected data centre consulting firm and turnkey solutions provider, recently conducted a series of tests to examine the effects of heat, airflow and power usage in Raritan’s working server environment, which comprises 68 servers. The hypothesis was that by knowing more about its real-time operational environment, Raritan could manage its data centre more intelligently. Using intelligent power distribution units (iPDUs), among other devices, Raritan and PTS were able to monitor temperature and humidity, calculate airflow, and measure power for both the IT load and the supporting infrastructure load throughout Raritan’s server room.
Overview of the Problem
With the cost of power rising dramatically and increased uncertainty with regard to power availability (capacity and resilience), all levels of corporate management are now more focused than ever on managing and conserving energy.
Nowhere is this more critical than in the corporate data centre, which can consume 25% of the total energy in a typical IT intensive organization (Raritan estimate based on U.S. Environmental Protection Agency “Report to Congress on Server and Data Center Energy Efficiency Public Law 109-431”).
Year-on-year increases in power consumption mean that data centres are running hotter and HVAC systems are working overtime to keep IT systems cool. This, in turn, is driving energy costs up – a growing concern for IT, given the potential for this expense to become an above-the-line IT charge.
Clearly, there’s a need to monitor data centre power and temperatures – and adjust heating, cooling and airflow – to minimize power consumption while maintaining IT equipment uptime. But where should forward-thinking corporations start? What tools are available to get the data needed to design a more efficient data centre?
This article takes a look at some of these thorny energy management issues and provides some relevant answers. You’ll learn the following three things:
The Challenge
When energy management was a low priority, IT managers could rely on simple calculations associated with nameplate specifications for power capacity planning in their data centres.
However, average data centre power consumption has grown from 2.1 kW per rack in 1992 to 14 kW per rack in 2006, according to HP (“HP Power & Cooling,” August 31, 2006). DatacenterDynamics, in its Spring 2007 Conference Survey, found that the average power density among all respondents in the U.S. was 5.5 kW per rack and the average maximum power density was 11.6 kW per rack.
Many in the industry are aware of significant power shortfalls in traditional locations. While some companies have the flexibility to move their data centres immediately to locations that provide more reliable, less expensive power, most simply cannot. As a result, the shortfalls, left unresolved, are forcing IT and facilities managers to make hard decisions about which applications to support and which ones to sacrifice to keep the business running during peak demand.
As energy issues come under more scrutiny and as tools become available for accurate measurement, IT administrators and facilities managers should no longer rely on the published nameplate power ratings of their units, adjusted by accepted industry assumptions. While adequate in the past, “close enough” is no longer “good enough.”
By measuring each server individually, managers can know exactly how much power their equipment is drawing and acquire precise numbers that will aid their energy efficiency planning efforts.
With the knowledge gained regarding server-by-server real-time monitoring, IT administrators can manage smarter and feel secure that they are making better decisions on what to power off because they will be able to:
Nameplate Ratings and Assumptions
IT equipment manufacturers indicate a power rating value for every device. Data centre administrators, however, know that this represents a worst-case scenario for a fixed set of conditions, and typical server power consumption may never reach the rated nameplate value.
The challenge is deciding what value to use for planning: how much capacity will the server draw once deployed, compared with the available capacity of the power protection, distribution and cooling systems in the data centre? Traditionally, a fixed percentage of the manufacturer’s nameplate value is used.
Using intelligent PDUs to measure the power consumption of individual servers over a period of several days, an analysis of Raritan’s IT equipment found that consumption typically ranged from 20% to 85% of the equipment’s nameplate rating, averaging about 31% – far less than the numbers typically used as the design rating of the equipment.
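The comparison described above can be sketched in a few lines of Python. This is only an illustration: the device names and wattages are hypothetical, not readings from Raritan’s server room, and the function name is our own.

```python
def nameplate_utilization(measured_watts, nameplate_watts):
    """Return measured draw as a percentage of the nameplate rating."""
    if nameplate_watts <= 0:
        raise ValueError("nameplate rating must be positive")
    return 100.0 * measured_watts / nameplate_watts


# Hypothetical readings: (device name, measured active power in W, nameplate in W)
readings = [
    ("server-a", 150.0, 500.0),
    ("server-b", 340.0, 400.0),
    ("server-c", 90.0, 450.0),
]

# Percentage of nameplate actually drawn by each device
utilization = {
    name: nameplate_utilization(measured, rated)
    for name, measured, rated in readings
}

# Simple unweighted average across devices
average_utilization = sum(utilization.values()) / len(utilization)

for name, pct in utilization.items():
    print(f"{name}: {pct:.0f}% of nameplate")
print(f"average: {average_utilization:.0f}% of nameplate")
```

A spread like the one in this toy data is exactly why a single derating percentage applied to every device can be misleading.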
IT Equipment Power Load vs. Total Facility Power Load
An important measurement which has been receiving a lot of coverage recently is the power drawn by IT equipment in relation to the total power consumed by the facility. Quoted figures for industry averages range from 30% of total power usage in a data centre being used by servers and other IT equipment (APC-MGE) to 50% (EYP Mission Critical Facilities, a division of HP). Actual measurements by PTS and Raritan, however, found that 58% of the energy consumed in Raritan’s server room was for the critical IT load, with only 42% spent for the support systems. This high percentage of power going to the critical IT load would indicate that PTS and Raritan are doing a better job managing power in the data centre than is typical in the industry. These numbers illustrate the importance of getting accurate measurements in your data centre rather than relying on “industry averages.”
Measurement Systems
So how can you arrive at these measurements, efficiently and accurately? The following are some overall pieces of the puzzle to consider:
Branch Circuit Monitors and Individual Device Load Measurement
Branch circuit monitors are electrical devices that measure current load on all circuits on an electrical panel and alert operators when the load approaches the breaker’s rating.
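As a rough sketch of the alerting logic such a monitor applies, the Python below flags circuits whose measured load is close to the breaker rating. The circuit labels, the readings and the 80% warning level are all assumptions chosen for illustration; a real monitor would use whatever threshold the operator configures.

```python
def circuits_near_limit(loads_amps, breaker_ratings_amps, warn_fraction=0.8):
    """Return the circuits whose load is at or above warn_fraction of the breaker rating.

    warn_fraction=0.8 is an assumed default, not a value taken from any product.
    """
    flagged = []
    for name, load in loads_amps.items():
        rating = breaker_ratings_amps[name]
        if load >= warn_fraction * rating:
            flagged.append(name)
    return sorted(flagged)


# Hypothetical panel readings in amps
loads = {"panel-A/1": 12.5, "panel-A/2": 17.0, "panel-A/3": 8.0}
ratings = {"panel-A/1": 20.0, "panel-A/2": 20.0, "panel-A/3": 20.0}

alerts = circuits_near_limit(loads, ratings)
print(alerts)
```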
Data Centre Environmental Aggregators
Environmental aggregators are specifically designed to gather relevant power and environmental data centre information. They can also consolidate the collected information and analyze it to help make informed decisions related to power consumption by both IT and facilities equipment.
Intelligent PDUs
Intelligent rack PDUs provide IT staff the ability to monitor the power consumption of any given server, storage unit or other IT device, which is helpful to identify those that are underutilized or identify those that are approaching or exceeding their normal high range of usage. There is also the ability to monitor and control the data centre power load at the facility level.
Intelligent rack PDUs can be controlled remotely via a Web browser or command line interface (CLI). They can meter power at both the PDU level and the individual outlet level; support alerts based on user-defined thresholds; provide security in the form of passwords, authentication, authorization and encryption; and incorporate rich environmental management capabilities.
Still, these tools cannot reach their full potential if the modelling is based on just a static environment. In the real world, data centres are dynamic. Server utilization changes over time, which causes power consumption and heat generation/dissipation to change. This, in turn, requires appropriate cooling to specific racks or rows of computers.
An important step is to determine your data centre’s power usage effectiveness rating (PUE = Total Facility Power/IT Equipment Power).
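The PUE formula is simple enough to express directly. The sketch below uses illustrative figures only, and the function names are our own. It also shows the equivalent relationship when you know the IT share of total power rather than the two absolute values: PUE is the reciprocal of that share.

```python
def pue(total_facility_kw, it_equipment_kw):
    """PUE = Total Facility Power / IT Equipment Power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    if total_facility_kw < it_equipment_kw:
        raise ValueError("total facility power cannot be less than IT power")
    return total_facility_kw / it_equipment_kw


def pue_from_it_share(it_share):
    """Convert the IT fraction of total power (0 < it_share <= 1) to PUE."""
    if not 0 < it_share <= 1:
        raise ValueError("IT share must be in the range (0, 1]")
    return 1.0 / it_share


# Illustrative values: a 15 kW facility load supporting 10 kW of IT equipment
example_pue = pue(15.0, 10.0)

# If half of all power reaches the IT equipment, PUE is 2.0
share_pue = pue_from_it_share(0.5)

print(example_pue, share_pue)
```

A lower PUE is better: a value of 1.0 would mean every watt entering the facility reaches the IT equipment.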
Measuring PUE
It is important that you accurately measure power usage when calculating your PUE to establish a baseline and to measure improvements. Raritan and PTS designed and implemented the following measurement strategy to accomplish this:
A data acquisition device was deployed to measure branch circuit amperage for each load, and calculate power feeding the computer room air conditioning (CRAC) units, lighting, UPS and other non-IT loads.
Environmental sensors were deployed with temperature probes at the top and bottom of each rack (front and back), as well as the supply and return of each CRAC unit. One humidity probe was also deployed to measure the humidity of the room.
Two Dominion® PX intelligent power distribution units were deployed in each rack. This enabled capture of the active power of the critical IT loads at a per device level. This detailed critical IT load data provides the granularity of information required to measure the PUE and make recommendations on how to improve it.
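To show how measurements like these might be rolled up, the Python sketch below converts branch circuit amperage to approximate power for the non-IT loads and sums per-outlet active power for the IT load. Everything here is an assumption for illustration: the labels, readings, voltages and power factors are hypothetical, and the single-phase formula (amps × volts × power factor) is a simplification, since equipment such as CRAC units is often fed from three-phase circuits.

```python
def branch_power_kw(amps, volts, power_factor=1.0):
    """Approximate real power of a single-phase branch circuit, in kW."""
    return amps * volts * power_factor / 1000.0


# Hypothetical non-IT branch circuits: (label, measured amps, volts, assumed power factor)
non_it_branches = [
    ("CRAC-1", 15.0, 200.0, 0.9),
    ("lighting", 5.0, 120.0, 1.0),
]

# Hypothetical per-outlet active power readings from the rack PDUs, in watts
pdu_outlet_watts = [250.0, 400.0, 350.0, 1000.0]

non_it_kw = sum(branch_power_kw(a, v, pf) for _, a, v, pf in non_it_branches)
it_kw = sum(pdu_outlet_watts) / 1000.0
total_kw = it_kw + non_it_kw

print(f"IT load: {it_kw:.2f} kW")
print(f"Non-IT load: {non_it_kw:.2f} kW")
print(f"Total facility load: {total_kw:.2f} kW")
print(f"Ratio of total to IT load: {total_kw / it_kw:.2f}")
```

Because the PDUs report active power directly, no power factor assumption is needed on the IT side; the estimate is only as good as the assumptions made for the branch circuits.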
The Results
Nameplate vs. Actual Power Draw
The histogram to the right shows the frequency distribution of actual power consumed as a percentage of the nameplate values (capacity), based on readings in the Raritan data centre taken from February 21 through February 26, 2008.
Our findings indicate that actual power consumed in operations has such a wide range that it makes it impossible to select a static “nameplate” derating factor that balances efficiency and ensures reliable operation.
For example, when measuring average active power, 29 devices in Raritan’s server room consumed between 21% and 40% of their nameplate capacity, while 23 consumed a higher percentage of capacity and 16 devices consumed a lower percentage of capacity. The average of all devices was 39% of capacity. When measuring maximum power, 30 devices in our server room consumed between 21% and 40% of their nameplate capacity, while 38 consumed a higher percentage of capacity and 10 devices consumed a lower percentage of capacity. The average of all devices’ maximum active power was 48% of capacity.
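A frequency distribution of this kind is straightforward to build once you have per-device percentages. The sketch below groups values into 20-point bands (0-20%, 21-40% and so on); the sample values are hypothetical and the banding convention is our assumption, not necessarily the one used for the histogram.

```python
import math


def bin_utilization(percentages, bin_width=20):
    """Count values per band: 0-20%, 21-40%, ..., 81-100% (for bin_width=20).

    A value equal to an upper edge (for example 40) falls in the lower band.
    Values above 100 are counted in the top band.
    """
    n_bins = math.ceil(100 / bin_width)
    labels = []
    for i in range(n_bins):
        low = 0 if i == 0 else i * bin_width + 1
        labels.append(f"{low}-{(i + 1) * bin_width}%")
    counts = {label: 0 for label in labels}
    for p in percentages:
        index = min(max(math.ceil(p / bin_width) - 1, 0), n_bins - 1)
        counts[labels[index]] += 1
    return counts


# Hypothetical per-device utilization, as a percentage of nameplate
sample = [15, 20, 25, 31, 40, 55, 85]
histogram = bin_utilization(sample)

for band, count in histogram.items():
    print(f"{band:>8}: {'#' * count} ({count})")
```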
Actually measuring the power a device consumes in live operations provides you with the additional information needed to be more efficient and ensure reliable operations.
Conclusions
Rising energy costs and diminishing supplies were primary motivations for this survey. So over a five-month period, Raritan and PTS defined and implemented a plan to instrument Raritan’s server room and measure the power usage at an IT device level and at a branch circuit level for non-IT load. Based on that survey, Raritan discovered the following:
• Having assumed at the outset that its server room had a poor PUE value, Raritan instead learned that it had a respectable PUE value of 1.4, and a total average power load for its IT equipment of 10.5 kW.
• Raritan does not have to add servers to add computational power
• Raritan can avoid a mission-critical disruption by deploying better load management
• Using intelligent rack PDUs gave Raritan the actual power consumption at an individual IT device level, providing the data needed to plan the next steps to become more efficient.
Implementing continual measuring systems also provided a number of unforeseen benefits beyond providing data to make a PUE calculation. A sudden drop in energy usage, for example, led the IT staff to discover that a circuit breaker on one of their CRAC units had tripped. This prompted a repair before hot summer weather arrived and presented a cooling problem. A temperature sensor located high in a rack alerted them to the fact that a rack fan had failed.
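A check for this kind of sudden drop could look something like the Python sketch below. It is one simple approach, not a description of how Raritan’s staff spotted the tripped breaker: each reading is compared with the average of the few readings before it, and the window size, the 30% drop level and the sample data are all assumptions.

```python
def find_sudden_drops(readings_kw, window=3, drop_fraction=0.3):
    """Return indexes of readings that fall more than drop_fraction below
    the mean of the preceding `window` readings."""
    drops = []
    for i in range(window, len(readings_kw)):
        baseline = sum(readings_kw[i - window:i]) / window
        if baseline > 0 and readings_kw[i] < (1.0 - drop_fraction) * baseline:
            drops.append(i)
    return drops


# Hypothetical power readings for one cooling unit, in kW, at regular intervals
crac_kw = [4.0, 4.1, 3.9, 4.0, 2.5, 2.5, 2.5]

drop_indexes = find_sudden_drops(crac_kw)
print(drop_indexes)
```

Comparing against a short rolling baseline means the check flags the moment of change rather than every subsequent low reading.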
Since industry benchmarks are based upon a broad range of averages, and each data centre has its own unique attributes that cause it to deviate from the averages, this Raritan and PTS survey validated Green Grid’s position: “In order to improve the energy efficiency of data centres, it is first necessary to measure the energy consumption of the entire data centre and each of its constituent subsystems.”
Contact details:
T: 020 7090 1390
www.raritan.co.uk