Breathing New Life Into Data Centers
Retrofitting operating data centers is always tough. While there is no single answer, there are many products and solutions that can be used to help retrofit overloaded cooling systems. Close-coupled cooling systems, in conjunction with creating aisle containment, can add to the cooling capacity of the data center, while reducing its power usage effectiveness (PUE). This article explores some of the innovative designs and equipment available to improve cooling and overall data center efficiency.
Many data centers built not too long ago have reached the breaking point with their cooling systems and need to be retrofitted to support next-generation IT equipment. The data center's power envelope is its only true limiting factor: as long as power is available to the data center, the cooling problem can be solved no matter what the kW/rack densities are. In most cases, the retrofitted cooling system will be far more efficient than the legacy cooling it replaces.
Historically, data centers have been cooled by computer room air conditioner or air handler (CRAC or CRAH) units placed along the walls, feeding cold air underneath a raised floor that serves as a plenum. Perforated floor tiles (see Figure 1) control where the cold air is delivered to racks. When used with raised floors, cooling by CRAC or CRAH units is effective to densities of approximately 2 to 3 kW/rack (approximately 60 to 90 W/sq ft). To keep pace with ever-increasing densities, raised floors have been pushed to cool 4 to 6 kW/rack (120 to 180 W/sq ft); however, that requires higher raised floors and larger cold aisles, making it difficult to retrofit an existing data center. Even at 2 kW/rack, this technique becomes inefficient and unpredictable as floor tile airflow becomes a limiting factor, and the ability to adequately pressurize the raised floor and create a uniform airflow pattern underneath it becomes critical.
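The rack-density and floor-loading figures above are two views of the same number. A minimal sketch of the conversion, assuming roughly 33 sq ft of total floor area per rack (the footprint implied by the quoted figures, not a universal design value):

```python
def floor_loading_w_per_sqft(kw_per_rack, sqft_per_rack=33.0):
    """Convert rack density (kW/rack) to average floor loading (W/sq ft).

    The 33 sq ft/rack default is an assumption chosen to match the
    article's figures; real layouts vary with aisle widths.
    """
    return kw_per_rack * 1000.0 / sqft_per_rack

for kw in (2, 3, 4, 6):
    print(f"{kw} kW/rack -> {floor_loading_w_per_sqft(kw):.0f} W/sq ft")
```

Running this reproduces the ranges cited in the text: 2 to 3 kW/rack maps to roughly 60 to 90 W/sq ft, and 4 to 6 kW/rack to roughly 120 to 180 W/sq ft.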
To make up for non-uniform cooling, data center operators usually reduce the overall data center temperature to overcome the random intermixing of hot and cold air. This approach not only reduces overall cooling capacity, but also wastes energy by creating colder-than-needed data centers. It can also create demand fighting between CRAC units while increasing the load on dehumidification and humidification processes. In lightly loaded areas, CRAC units can even go into heating mode because of low return air temperatures. In many cases, data center managers have added more CRAC units, believing additional capacity was needed to overcome hot spots, further contributing to wasted energy; those units would not have been needed if the air were not mixing. Up to 60% of the air flowing around a data center cooled this way is doing no work. This equates to a lot of wasted fan power: energy being used to accomplish no useful work, and design capacity that cannot be realized.
One of the first steps in retrofitting existing data center cooling is to create controlled airflow paths if they don't exist. Supply and return air paths should be segregated all the way from the conditioning equipment to the IT equipment. This is typically done by containing either the hot or cold aisles with specialty vinyl curtains, ducting systems, custom-made walls or ceilings, or commercially available containment systems. These containment practices require airflow to pass through the IT equipment before returning to the conditioning units, preventing air mixing and short circuiting. In some cases, data centers that seemed to be out of cooling capacity have actually been able to turn off CRACs after a containment system was installed, with no new cooling equipment deployed.
The PUE serves as a reminder that the energy cost of cooling a data center is second only to the cost of powering the IT equipment itself. The increased attention on energy efficiency, combined with the limitations and inefficiencies of raised-floor approaches at higher IT equipment densities, as well as skyrocketing costs, is leading to rapid innovation in the area of data center cooling.
By greatly increasing the effectiveness and efficiency of the cooling system, a data center that appears to be out of power can find more power for IT. For example, in one of Sun's data centers, we achieved a PUE of 1.28. When compared to the U.S. Environmental Protection Agency (EPA) estimated data center industry average PUE of 2.0, this data center is using only 64% of the power of one that was traditionally designed. The majority of gains are in the cooling system. By replacing old inefficient CRACs with new close-coupled solutions, the data center can more efficiently support much higher density loads while reducing overall power usage.
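The 64% figure follows directly from the definition of PUE: total facility power divided by IT power. A minimal sketch, using a hypothetical 1,000 kW IT load purely for illustration:

```python
def pue(total_facility_kw, it_kw):
    """Power usage effectiveness: total facility power / IT power."""
    return total_facility_kw / it_kw

# For a fixed IT load, total facility power scales linearly with PUE,
# so the comparison in the text reduces to a simple ratio of PUEs.
it_load_kw = 1000.0                   # hypothetical IT load
sun_total = it_load_kw * 1.28         # Sun data center, PUE 1.28
avg_total = it_load_kw * 2.0          # EPA industry-average PUE 2.0

print(f"Relative power draw: {sun_total / avg_total:.0%}")  # prints 64%
```

At the same IT load, the PUE-1.28 facility draws 1,280 kW from the utility where the PUE-2.0 facility draws 2,000 kW, freeing the difference for additional IT equipment.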
CLOSELY COUPLED COOLING
The challenge in today's data center is how to handle increasing overall loads and eliminate hot spots created by high-density racks using up to 30 kW each. The best solution is to move the cooling of IT equipment as close to the equipment as possible to neutralize the heat at the source.
Modern data center cooling systems should be divided into two distinct equipment classes. Room-level cooling or base-cooling equipment is installed to handle the heat load created by the room itself—such as building loads, lighting, and people—and to control room dew point and particulates when required. IT equipment is cooled through close-coupled, in-row, overhead or in-rack cooling devices designed specifically for sensible cooling. This significantly increases cooling efficiency at the rack, while greatly reducing the overall energy used by the cooling system.
Closely coupling the cooling with the IT equipment creates a very predictable airflow model and typically does not require computational fluid dynamics (CFD) analysis to visualize. Predictability helps eliminate hot spots in the data center. This allows the overall data center temperature to be raised because the cooling system no longer has to overcompensate for the worst-case scenario. System efficiency is increased because cooling is closely matched to IT equipment airflow requirements dynamically throughout the data center. The higher data center temperature that this approach facilitates also increases the temperature differential between the air crossing over the cooling coils and the cooling medium running through them (chilled water or refrigerant). This further increases the heat removal capabilities of the cooling coils by increasing heat removal efficiency. Now the entire system does not have to work as hard to cool the equipment.
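The airflow side of this argument can be sketched with the common sensible-heat rule of thumb, BTU/hr ≈ 1.08 × CFM × ΔT(°F), where the 1.08 constant assumes standard air density. The numbers below are illustrative, not design values:

```python
BTU_PER_KW = 3412.0  # 1 kW expressed in BTU/hr

def cfm_required(load_kw, delta_t_f):
    """Airflow (CFM) needed to carry a sensible load at a given
    air-side temperature rise, per the 1.08 rule of thumb."""
    return load_kw * BTU_PER_KW / (1.08 * delta_t_f)

# A wider delta-T across the coil carries the same load with less air:
for dt in (15, 20, 25):
    print(f"deltaT {dt} F -> {cfm_required(30, dt):,.0f} CFM for a 30 kW rack")
```

The trend is the point: raising the return temperature widens the delta-T, so the same 30 kW rack can be cooled with substantially less airflow, and therefore less fan energy.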
CLOSE-COUPLED COOLING INNOVATIONS
The closer the heat-removal mechanism can be tied to the heat source, the more efficiently it can remove heat from the data center. The market is moving toward a richer set of close-coupled cooling techniques, as well as approaches that more precisely target their cooling capacity.
Pod-level cooling: APC. A pod is a group of 20 to 24 racks sharing a common aisle. The closely coupled cooling solutions Sun has deployed have been at the pod level using products from both APC and Liebert (see Figure 2 and Figure 3). APC was the first to provide this type of highly efficient solution before data center efficiency became a focus. APC solutions use cooling units mounted in a row with the server racks and have an integral hot aisle containment system.
Rack-level cooling. Some innovations move cooling to the racks themselves, neutralizing heat before it can leave the cabinet. The Sun Modular Datacenter (Sun MD) is a data center built inside a shipping container (see Figure 4). The IT equipment in the Sun MD is cooled by an active, water-cooled system that neutralizes heat as it leaves the rack. Other, passive rear-door heat exchangers attach to the back of a standard 42U rack and use server fan pressure to push air through the heat exchanger. Likewise, we are starting to see rack-level cooling systems designed to work directly with specific equipment so that a rack consuming even 50 kW or more could return air to the data center at its ambient temperature.
Direct CPU cooling. The most focused closely coupled system cools server components directly. These solutions may become necessary as densities are pushed higher, and the quest for ever-increasing cooling efficiencies continues. By cooling server components directly, heat transfers can occur at much higher temperatures while reducing or eliminating the need for fans internally and externally to servers.
Economization. While economization is not necessarily a close-coupled cooling technique, many facilities are working on ways to eliminate chillers and in-room cooling units altogether. Using free-cooling techniques, such as air-side or water-side economizers, and exploring the ability to use higher-temperature water in data centers, will continue to drive innovation in this area. Air-side economizers are taking hold in data center environments, with the potential for huge energy savings. When external environmental conditions allow air-side economizers to be used, they should be designed into the data center cooling system; concerns about their use can typically be addressed with straightforward engineering. As the accepted environmental requirements for data centers continue to relax, the hours of free cooling will greatly increase, and operational costs drop sharply whenever the chiller plant can be turned off.
Pod-style, close-coupled cooling. Like APC, Liebert has introduced an increasing number of products to directly assist in pod-style, close-coupled cooling. Liebert's XD system offers various over-rack components as well as the XDH horizontal cooling module, which operates like APC's In-Row RC units (see Figure 5) but uses refrigerant instead of water.
In a drive for even higher efficiency, both systems have integrated controls that vary fan speeds in the cooling modules based on actual pod cooling needs. To continue the drive for low implementation and operating costs, Sun has created new designs for its data centers using draping systems to contain either the hot or cold aisle, in conjunction with both APC and Liebert cooling equipment (see Figure 6).
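The payoff from varying fan speed comes from the fan affinity laws: airflow scales linearly with fan speed, but fan power scales with the cube of speed. A minimal sketch of that relationship:

```python
def fan_power_fraction(speed_fraction):
    """Fraction of full-rated fan power at a given speed fraction,
    per the cube-law fan affinity relationship (idealized)."""
    return speed_fraction ** 3

# Slowing the cooling-module fans to 70% speed (70% airflow) draws
# only about a third of full fan power:
print(f"{fan_power_fraction(0.70):.0%}")  # prints 34%
```

This is why matching fan speed to actual pod cooling demand, rather than running all units at full speed, yields such large operating savings.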
CHILLED WATER VERSUS REFRIGERANT
Most medium- to large-sized data centers use a chilled water system to deliver cooling capacity to the data center perimeter. The choice of running chilled water at least to the data center perimeter is a compelling one. Water (or a mixture of water and glycol) has sufficient density for efficient heat transfer. It can be cooled using a standard chiller plant or through evaporative cooling.
There is significant debate, however, regarding whether to extend the chilled water system onto the data center floor. Ironically, water has been in data centers from the beginning. Mainframes required water piped to them. Water-based fire-suppression systems are used to protect many data centers, including many of Sun's data centers. Facilities engineers tend to be less concerned about water extended into the data center; among IT engineers, the perception is that this should never be done.
Water extended to the data center floor is simply an engineering problem. As with any engineering effort, good designs will prevail. Flexible pipes, double sleeving, isolation valves, leak sensors, and a solid building management system help to minimize risk. Today, organizations can choose to extend the chilled water loop directly to localized cooling units in rack rows, or they can use the chilled water loop to condense refrigerant that is circulated throughout the rack rows instead.
Despite the fact that both water and refrigerant piping are pressure tested before commissioning, some organizations have a strong preference for or against water circulating in close proximity to computing equipment. Sun uses both chilled water and refrigerant in its data centers, even on slab, and has not had any outages or problems related to water leaks.
Ryan is a senior staff engineer in Sun Microsystems Inc.'s Global Lab and Data Center Design Services group. He has been involved in the design, construction, and operation of mission-critical facilities for the last 18 years. For the last 10 years he has focused on the design and operation of mechanical and electrical infrastructure systems supporting high-availability data centers.