Data center design
In the information age, data centers are one of the most critical components of a facility. If the data center isn’t reliable, business can’t be done. Experts provide insights on cooling and power issues, cloud computing, and energy efficiency.
- Kevin V. Dickens, PE, LEED BD+C, Mission critical design principal, Jacobs Engineering, St. Louis
- Terrence J. Gillick, President, Primary Integration Solutions Inc., Charlotte, N.C.
- Bill Kosik, PE, CEM, BEMP, LEED AP BD+C, Principal data center energy technologist, HP Technology Services, Chicago
- Keith Lane, PE, RCDD, NTS, RTPM, LC, LEED AP BD+C, President/CEO, Lane Coburn & Associates LLC, Bothell, Wash.
- David E. Wesemann, PE, LEED AP, ATD, President, Spectrum Engineers Inc., Salt Lake City
CSE: What unique challenges does a data center project pose compared with other structures?
Kevin V. Dickens: Energy optimization. By definition these structures are energy intensive, but responsible mechanical design demands that we do all we can to minimize the power required to support the IT base load. Hence, the focus on the power usage effectiveness (PUE) metric in all of its manifestations. The follow-on challenge with PUE is simply being secure enough to create and use your own version of the metric (based on the information available to you), and not getting caught up in “PUE envy” and the pointless comparisons and competition. The only thing that really matters is a PUE tool that works in your particular application.
Terrence J. Gillick: While the enclosure is straightforward and sturdy, the mechanical and electrical systems are unique in their capabilities and sophistication. In particular, relative to the footprint of the building, the electrical density is very high, and the mechanical systems include precision air conditioning systems that are significantly more complex than traditional HVAC systems. For a commissioning agent, one of the larger challenges is to validate the building automation and control systems and electrical power management systems to ensure that the sequences of operations defined by the engineer of record are properly implemented and, in turn, to validate that the facility will operate as intended.
Bill Kosik: First, I think the time frame is more like one to three years or less. The industry continues to move very rapidly. It is not cloud computing itself that is driving change, it is what is causing the need for cloud computing—the Internet of Things, as it is called. Everything from your car’s diagnostic system to the temperature in your refrigerator is in the cloud. This will continue to drive a greater and greater need for data centers.
Keith Lane: Data centers are unique facilities. The sheer amount of power and the critical nature of the loads being served require significant expertise. Uninterruptible power supplies (UPS), large standby generators, fuel supplies, large conductors, medium-voltage services, large transformers, various voltages, harmonic distortion, metering, PUE, and energy efficiency all must be considered in the design of data center facilities. Because of the unique nature of the electrical load profile, the heating of underground electrical duct banks must be evaluated. This involves 3-D modeling of the underground feeders as well as a comprehensive failure mode analysis and Neher-McGrath heating calculations. The initial cost of building a data center is tremendous. The long-term costs of running a data center, including electrical and water services, are also very significant and must be considered during the design process. The electrical and mechanical engineers must work collaboratively to ensure the most reliable and cost-effective systems are designed and implemented. Enough design time must be built into the schedule to ensure value engineering ideas are fully vetted. Additionally, comprehensive commissioning of the data center should be provided by a third party to ensure all components of the mechanical, electrical, plumbing (MEP), and fire protection systems work independently and as a system before actually serving critical loads.
David E. Wesemann: Capacity of the utility supply, in particular electrical, is much greater than for other buildings. Many times the electrical utility is not prepared for the large loads a data center places on the electrical grid, and many months, or even years, are required for planning and preparation. There are also three key items:
- Redundancy and reliability requirements: Data centers normally require a much higher level of reliability, which is provided with redundant components and paths.
- Security: Data center facilities, including the infrastructure equipment, have a higher level of security risk associated with them.
- Maintenance: Data center infrastructure components and paths generally cannot be taken out of service for maintenance. Designs must accommodate this.
CSE: When working in mission critical facilities, what’s the most difficult challenge you face?
Gillick: Working in an occupied mission-critical facility—for example, commissioning an upgrade to existing MEP systems—poses the greatest challenge for us in protecting the critical load for patient care, social media, revenue generation, retail sales, or back-office operations. In planning commissioning for these facilities, we define every step of the process in great detail to avoid impacting ongoing operations.
Dickens: After energy, the second challenge is to provide appropriate redundancy strategies based on the facility’s demands, budget, and operating model. It isn’t as simple as working toward the Uptime Institute’s Tier II or Tier III certification. It really needs to be right-sized for the project. The business drivers for a cloud provider, a banking institution, or a command and control facility for the Dept. of Defense are vastly different. The redundancy strategy to support that mission, be it physical duplication, virtualization via software, or mirrored facilities, is just as unique. If you simply “double up” then you are being intellectually dishonest and letting your customer down.
Wesemann: Meeting the reliability and capacity requirements with a limited budget. Many times owners/end users have high expectations for reliability without the budget to support those expectations. Usually this can be overcome by educating the owner/end users with a cost/risk analysis and discussion.
CSE: How will cloud computing affect data centers in the next three to five years?
Dickens: For those of us in facilities, I think the primary impact will simply be more work. But as far as the facilities themselves, the mix in the cabinets may change, but we will still be responding to higher load densities and the need for modularity within the data center. Whatever the cloud is today, it will be something different in the future; that’s the nature of the information age. But the parallel evolution on the facilities side will continue to be seen in expanded environmental parameters and a slow trek to direct fluid cooling.
Wesemann: We can only speculate. Cloud computing will still require data centers, which will still require engineers to design them. There may be some centralization of data centers as cloud computing picks up users; however, as more companies offer cloud computing, they may migrate away from large central data centers to more, smaller data centers. I believe there will always be specialized businesses and services that have unique and critical applications that will want to be kept “in house” rather than rely on remote sites and the limited bandwidth and reliability of the telecom service providers to communicate between users and applications.
Gillick: Cloud computing servers and storage equipment will require an exponential increase in raised floor space to support the continued growth of cloud computing over the course of the next three to five years. Data center owners, planners, and IT planners are now challenged with determining how much additional space they will need to allocate, lease, and/or build to support cloud computing requirements.
CSE: Please describe a recent new building project you’ve worked on—share problems you’ve encountered, how you’ve solved them, and engineering aspects of the project you’re especially proud of.
Wesemann: One issue we have on new projects is the estimation of IT/server equipment load for the sizing of mechanical and electrical equipment. On one particular project, a data center user requested 300 W/sq ft for IT load alone, not including the power for the mechanical cooling. After much debate and with some reservation, the user agreed to allow the design team to use 150 W/sq ft as the basis of design, with expansion capabilities to 300 W/sq ft. Modular mechanical and electrical systems were used for the initial load with provisions to easily add components if and when needed. The data center has been in operation for about five years and based on the last report, the IT load has never increased beyond W/sq ft.
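The sizing trade-off Wesemann describes can be sketched with simple arithmetic. The Python below is only an illustration: the 150 and 300 W/sq ft densities come from the discussion above, while the floor area (10,000 sq ft) and module rating (500 kW) are hypothetical values chosen to make the numbers easy to follow, not figures from the project.

```python
import math

def it_load_kw(density_w_per_sqft, area_sqft):
    """Design IT load in kW for a given power density and raised-floor area."""
    return density_w_per_sqft * area_sqft / 1000.0

def modules_needed(load_kw, module_kw):
    """Smallest whole number of equal-size modules that covers the load."""
    return math.ceil(load_kw / module_kw)

# Hypothetical inputs (assumptions, not from the project described above).
AREA_SQFT = 10_000
MODULE_KW = 500

initial_kw = it_load_kw(150, AREA_SQFT)   # basis of design
ultimate_kw = it_load_kw(300, AREA_SQFT)  # expansion capability

print(f"Initial build: {initial_kw:.0f} kW, "
      f"{modules_needed(initial_kw, MODULE_KW)} modules")
print(f"Ultimate build: {ultimate_kw:.0f} kW, "
      f"{modules_needed(ultimate_kw, MODULE_KW)} modules")
```

Under these assumptions, the initial build needs half the modules of the ultimate build, which is the capital that a modular design can defer until the load actually materializes.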
Gillick: We recently provided commissioning for the first phase of a multiphase greenfield data center construction project for a global company. The first phase comprised 200,000 sq ft of Tier III data center space with a large electrical service. Overall challenges included communicating effectively and succinctly the requirements and goals of the commissioning program, effectively scheduling commissioning activities 18 months in advance, and developing a plan to concurrently commission mechanical, electrical, and control systems. Two critical scheduling milestone goals were established: one, that the building automation system be installed and commissioned prior to functional testing; and two, that a factory witness test of the control systems and graphics be required prior to their deployment at the site. Successful completion of these milestones enabled our team to validate and use all of the trending features of the building automation system (BAS) and the power quality and monitoring system as a reporting tool during functional commissioning and contributed to meeting the aggressive project schedule. Additionally, we provided full lifecycle commissioning services beginning at schematic design and continuing for one year beyond beneficial occupancy. The process required early integration of the building automation and energy management systems and power monitoring and control systems—in particular, an extensive point count.
Lane: Our firm has worked closely with Silent Aire Technology for more than five years enhancing the design of modular data center deployments around the country. There are numerous challenges and numerous benefits to the design, construction, and deployment of modular data centers. Modular data centers are designed and built as a complete system. The entire mechanical and electrical system is built around the client’s IT infrastructure needs and requirements. Modular offsite construction, as opposed to the conventional brick and mortar data center, delivers the following main benefits: speed, performance, and cost-containment. Building offsite in a controlled, safe, and environmentally friendly space may quite often allow for much quicker deployment. In addition to the time savings in building the mechanical, electrical, and structural components, all components and systems are tested in the factory before shipping to the site. This saves time and money during the final integrated systems testing (IST) before hand-over to the client in the field. Optimal performance is more quickly achieved from a modular data center versus a solution built from scratch. The design specifications are tested and verified before the unit ships, thereby delivering immediate quality assurance. There are several advantages to building offsite, but the main reasons are cost control and speed to market. Delays related to weather, site conditions, unreliable or inconsistent labor forces, and labor inefficiencies are greatly reduced or eliminated in a warehouse/prefabrication environment. This leads to reduction in cost and schedule. Additionally, estimating the total cost of the project is typically more accurate. In the end, this represents less risk to the end user. Modular data centers that are factory built can come with either a UL or ETL safety certification label that certifies they have been factory tested and meet the required electrical safety codes and requirements. 
This safety certification allows the modular data centers to be classified as equipment and not modular buildings, which may often circumvent permitting and inspection requirements that would normally be demanded by the authority having jurisdiction (AHJ) in a brick and mortar build, thus allowing for aggressive and expedient deployments. Other benefits to the modular data centers being classified as equipment are: they can be depreciated as equipment, and opportunities for leasing or financing of the modular data center exist as well.
CSE: Please describe a recent existing building retrofit you’ve worked on—share problems you’ve encountered, how you’ve solved them, and engineering aspects of the project you’re especially proud of.
Gillick: Our firm is in the process of commissioning a new UPS housed in an addition to a Tier IV data center. The project came about as a result of a catastrophic failure in the existing UPS. The data center had been running on emergency generator power. The root cause of the failure had not been identified. A site survey investigation determined that there had been a failure of a low-voltage fuse in a control circuit, which disabled the transfer back to utility power. The root cause was determined to be the low-voltage fuse failure, and a service restoration plan was developed. We conducted an MEP audit of the facility to determine if there were other single points of failure that might affect reliability. Our audit led to recommendations for commissioning and cutover from the legacy UPS system to the new UPS system. We provided peer reviews and commissioning of the new facility. Our team developed a plan to maintain operations for one year on the repaired UPS system during the retrofit project and a phasing plan to cut over to the new UPS in a live operating environment. We anticipate a successful completion of this project.
Dickens: Most of our firm’s projects are in the planning phase, and the conundrum continues to be determining what the future will look like and what infrastructure will be needed to support it. There is no answer to the first part of the question, because no one really knows what’s coming down the IT road. But the second challenge simply forces us to think about data centers, and their form factor in particular, in different ways. I’m proud of the fact that I don’t start a project thinking I know what that building will look like anymore.
Wesemann: We were recently involved with the upgrade of an existing health care data center, designed to 15-year-old standards, to meet the new capacity and energy efficiency standards that today’s data centers require. The challenges and solutions associated with upgrading older facilities include the following:
- Insufficient space to distribute mechanical piping and air. The facility had inadequate floor-to-floor heights to allow sufficient space for distribution. The existing 12-in. raised floor falls well short of being adequate for underfloor air distribution. While still in design, some considerations include moving to in-row air distribution, leaving the underfloor for only piping and electrical (data overhead). Another consideration was to remove the floor and ceiling altogether, leaving concrete floor and exposed structure with more space and accessibility for overhead mechanical and power distribution.
- Inadequate structural design: With greater capacity requirements and limited space for more, larger equipment, the roof became the best candidate for the equipment. However, the existing structural design limited the amount of equipment that could be placed on the roof. The original structural engineer was called back to “beef up” the structure to handle the equipment.
- Replace old equipment with new, while keeping the data center operational. The new system was designed to occupy limited but available space and become fully installed, tested, and commissioned before cutting over from old to new system. Unfortunately, the old system was not set up as an “A and B” redundant system, so the switchover will require a circuit-by-circuit swap of old power to new power. The new system will be A/B fully redundant to make future changes possible while reducing the risk of shutdowns.
- Energy efficiency requirements: Old UPS and mechanical equipment was installed in an era where little consideration was given to sustainable design. New UPS and mechanical equipment has been specified to meet today’s stricter efficiency expectations.
CSE: In a mission critical facility, what is the No. 1 challenge you deal with?
Lane: Data centers take an enormous amount of resources to operate. Energy efficiency of the electrical and mechanical systems is critical. Electrical meters can be used at the service entrance, at the mechanical equipment, and at the UPS systems to determine the amount of power used at various locations throughout the data center. One method used to determine the efficiency of the data center is PUE. PUE is essentially the total power the data center consumes divided by the power actually used by the servers. Older, less efficient data centers have PUEs above 2.0. Some very efficient data centers today can have PUEs in the 1.2 range. Many design practices used in today’s data centers have greatly increased efficiency. These include more efficient UPS systems, outside air economizers, contained hot aisles, higher voltages, improved server efficiency, warmer cold aisle temperatures, more efficient transformers, controls, variable frequency drives (VFDs), and lighting.
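The PUE ratio Lane describes is straightforward to compute from meter readings. The Python sketch below assumes two hypothetical readings, total facility power at the service entrance and IT power at the UPS output. The function name and the sample kW values are illustrative only and were chosen to reproduce the 2.0 and 1.2 figures mentioned above.

```python
def pue(total_facility_kw, it_kw):
    """Power usage effectiveness: total facility power divided by IT power."""
    if it_kw <= 0:
        raise ValueError("IT power must be positive")
    return total_facility_kw / it_kw

# Hypothetical meter readings in kW (assumptions for illustration).
IT_KW = 1000.0
older_site_total_kw = 2000.0      # service-entrance reading, older facility
efficient_site_total_kw = 1200.0  # service-entrance reading, efficient facility

for label, total_kw in [("older", older_site_total_kw),
                        ("efficient", efficient_site_total_kw)]:
    overhead_kw = total_kw - IT_KW  # cooling, UPS losses, lighting, etc.
    print(f"{label}: PUE = {pue(total_kw, IT_KW):.2f}, "
          f"overhead = {overhead_kw:.0f} kW")
```

For the same 1,000 kW of IT load, the difference between a PUE of 2.0 and 1.2 in this sketch is 800 kW of overhead power.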
Gillick: Our firm is currently commissioning several of the military’s largest hospitals, so this is fresh in my mind. In a new hospital project, the most significant challenge is commissioning the central plant, which is typically constructed and commissioned before construction is completed on the hospital main building—that is, before there is an actual mechanical load or electrical connections to the hospital main and ancillary buildings. This creates significant challenges not only for phasing the commissioning process, but also for artificially loading the chillers, boiler plant, and electrical system. So we typically bring in compact trailer-mounted boilers to inject hot water into the cooling system to simulate chiller load and resistive load banks to simulate the load on utility services and the future 480- and 600-V electrical distribution system.
CSE: The U.S. Dept. of Energy (DOE) has launched an initiative to help increase the energy efficiency of data centers. Why is this a concern, and how have you dealt with it in your work?
Dickens: The concern is obvious because using more energy simply leads to more problems, including social, economic, and environmental. But as a mechanical engineer, using the least amount of energy to achieve our goals is simply part of the job description. So my approach to systems has not changed, but the relevance and importance of my objective is now shared by the rest of the design team. Because of U.S. Green Building Council LEED, Energy Star, and the U.S. Environmental Protection Agency initiatives, using less energy is not just the energy modeler’s problem anymore. It’s good to share.
Wesemann: Data centers are very large users of natural resources for energy and so any energy savings, even if just a few percentage points, will have a significant impact on overall energy conservation and reduction in the carbon footprint across the globe. It also makes good financial sense as the owner/operators of data centers reap the cost savings of energy-efficient measures.
Gillick: The DOE initiative has challenged engineers and manufacturers to develop more energy-efficient mechanical and electrical systems. On the HVAC side of the house, for example, many of the new computer room air-handling (CRAH) and computer room air conditioning (CRAC) systems incorporate energy conservation measures such as variable frequency drives to control fan and pump speeds. On the electrical side of the house, manufacturers are pushing UPS and UPS distribution systems from 95% efficiency to 99% efficiency or greater. These advances have not changed the way engineers are designing the data centers, but they are changing the choices engineers have when designing electrical and mechanical systems and the ability to incorporate more DOE energy initiatives. Separately, it has become very difficult to meet the federal, state, and local standards and requirements for diesel generator emissions. As a result, we go to great lengths during the commissioning process to validate the emergency generator emissions compliance prior to regulatory testing and the energy efficiency of the products that are being put in place, and to validate that the engineer’s projected PUE meets the owner’s requirements.
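To put the 95% versus 99% UPS efficiency figures in perspective, the losses can be estimated with a short calculation. The Python sketch below assumes a constant 1,000 kW critical load running all year. Both the load and the constant-load simplification are assumptions for illustration, and real UPS efficiency varies with load level.

```python
HOURS_PER_YEAR = 8760

def ups_loss_kw(output_kw, efficiency):
    """Power lost in the UPS when delivering output_kw at a given efficiency."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return output_kw / efficiency - output_kw

def annual_loss_kwh(output_kw, efficiency):
    """Energy lost over one year, assuming a constant load."""
    return ups_loss_kw(output_kw, efficiency) * HOURS_PER_YEAR

LOAD_KW = 1000.0  # hypothetical critical load

loss_95 = ups_loss_kw(LOAD_KW, 0.95)  # roughly 52.6 kW
loss_99 = ups_loss_kw(LOAD_KW, 0.99)  # roughly 10.1 kW
saved_kwh = annual_loss_kwh(LOAD_KW, 0.95) - annual_loss_kwh(LOAD_KW, 0.99)

print(f"Loss at 95%: {loss_95:.1f} kW; loss at 99%: {loss_99:.1f} kW")
print(f"Annual energy avoided: {saved_kwh:,.0f} kWh")
```

The sketch counts only the UPS losses themselves. Those losses are also rejected as heat that the cooling system must remove, so the full effect on facility energy would be somewhat larger.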
Kosik: The DOE has had specialized programs to help industry lower energy consumption in buildings. The more standardization and guidance that is available to the data center community, the better. Many of the DOE programs have rebate incentives through the local utility when lowering energy use.
CSE: Are your data center clients requiring redundancy for all engineered systems?
Gillick: The industry has widely accepted the tier standards of mission critical facilities. Critical facilities are designed to meet Tier I through Tier IV guidelines. Our job as commissioning agents is to validate that the center meets the design standard. At almost every level, some redundancy is required for all engineered systems in data centers—increasing from little or no redundancy in a Tier I data center to 2N+ electrical and mechanical systems in a Tier IV data center.
Wesemann: Depending on the client and application, clients require varying degrees of redundancy. We arrive at a consensus with the client after discussion(s) of the risks and costs of redundancy in the various MEP systems. Another consideration is if the client has redundant backup sites, in which case the reliability requirements of any one site are less stringent since the entire data center itself is backed by another equal copy, which is located perhaps hundreds or thousands of miles away. If the client relies on one and only one data center, then redundancy is much more of a concern. Today we are seeing clients accept a minimal level of N+1 redundancy, while oftentimes going to “2N” or “system + system” redundancy at least in the UPS systems (even though other systems and components may still be N+1).
Lane: We are seeing varying levels of redundancy in modern data centers; 10 to 15 years ago, we would see enormous data centers built to the same redundancy level and the same power density throughout the entire facility. Today we are seeing single Tier II data centers with minimum redundancy for portions of the critical loads as well as Tier IV for other portions of loads. The redundancy level depends on the specific function of the computing task. Very critical loads will be built with 2N topology, while less critical loads will be built with N or N+1 topology. These loads could be in the same room. Additionally, we are building more data centers in a modular fashion—only building the power density required today, but providing for future expansion. This includes provisions for additional UPS modules, standby generators, chillers, and pumps.
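The N, N+1, and 2N topologies mentioned above differ in how many equipment modules are installed relative to what the load strictly needs. The Python sketch below counts modules for each topology. The 1,500 kW load and 500 kW module rating are hypothetical values, and the function names are illustrative rather than taken from any standard or tool.

```python
import math

def base_modules(load_kw, module_kw):
    """N: the minimum number of modules needed to carry the load."""
    return math.ceil(load_kw / module_kw)

def modules_for_topology(load_kw, module_kw, topology):
    """Installed module count for an N, N+1, or 2N topology."""
    n = base_modules(load_kw, module_kw)
    if topology == "N":
        return n          # no redundancy
    if topology == "N+1":
        return n + 1      # one spare module
    if topology == "2N":
        return 2 * n      # two complete, independent systems
    raise ValueError(f"unknown topology: {topology}")

# Hypothetical critical load and module rating (assumptions).
LOAD_KW = 1500.0
MODULE_KW = 500.0

for topology in ("N", "N+1", "2N"):
    count = modules_for_topology(LOAD_KW, MODULE_KW, topology)
    print(f"{topology}: {count} modules installed")
```

Mixing topologies in one facility, as Lane describes, amounts to running this kind of count separately for each group of loads according to how critical it is.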
Kosik: Yes, but it is done in a much more sophisticated way, which results in lower capital costs and uses modularity to increase future flexibility. Reliability is still paramount, but over the past five years or so, the industry has been focusing much more heavily on the effects of reliability on energy use. With that said, there is a segment of the industry that does not need high reliability in the power and cooling system. Reliability is achieved by shifting to other data centers that are also running the same workloads. In the supercomputing industry, typically only the data storage machines will be on UPS power and have some type of redundant cooling.
Dickens: Not necessarily, although the issue of concurrent maintainability is almost always a given. If we can achieve that with a thermal storage tank instead of an extra chiller, then we do. If there is a sufficient risk tolerance in place to allow a creatively valved and headered piping system to supersede the need of another pump, then we may do it. The key is in understanding the allowable level of risk and then designing to manage to that level. With that said, major equipment is usually redundant and if there are compromises, they are made in the less vulnerable distribution systems.