Data centers’ intricate design: Automation and controls
Tim Chadwick, PE, LEED AP, President, AlfaTech Consulting Engineers, San Jose, Calif.
Robert C. Eichelman, PE, LEED AP, ATD, DCEP, Technical Director, EYP Architecture & Engineering, Albany, N.Y.
Barton Hogge, PE, ATD, LEED AP, Principal, Affiliated Engineers Inc., Chapel Hill, N.C.
Bill Kosik, PE, CEM, LEED AP, BEMP, Building Energy Technologist, Chicago
Keith Lane, PE, RCDD, NTS, RTPM, LC, LEED AP BD+C, President/Chief Engineer, Lane Coburn & Associates LLC, Seattle
Robert Sty, PE, SCPM, LEED AP, Principal, Technologies Studio Leader, SmithGroupJJR, Phoenix
Debra Vieira, PE, ATD, LEED AP, Senior Electrical Engineer, CH2M, Portland, Ore.
CSE: When working on monitoring and control systems in data centers, what factors do you consider?
Eichelman: As a best practice, data centers are routinely provided with very robust monitoring and control systems. These systems monitor all critical points throughout the electrical and mechanical plants, giving operators the ability to quickly understand system status, trend the growth of demands throughout the system, rapidly recognize and respond to pre-alarm and alarm conditions, and analyze the energy consumed by the plant to ensure equipment and systems are operating efficiently. Automatic controls are commonplace, particularly for the mechanical and generator plants; these systems are typically concurrently maintainable with a very high degree of fault tolerance, depending on the nature of the facility.
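The pre-alarm/alarm distinction described above can be sketched as a simple two-tier threshold check. This is an illustrative sketch, not any vendor's implementation; the limits shown are placeholder values.

```python
# Illustrative two-tier threshold check for a monitored point,
# e.g. UPS load as a percentage of rated capacity.
# The pre-alarm and alarm limits are placeholder values.

def classify(value: float, pre_alarm: float, alarm: float) -> str:
    """Return 'normal', 'pre-alarm', or 'alarm' for a monitored point."""
    if value >= alarm:
        return "alarm"
    if value >= pre_alarm:
        return "pre-alarm"
    return "normal"
```

A trending system would log these states over time so operators can see a point creep toward its alarm limit before it gets there.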
Sty: The number of data points available to data center managers from building management systems (BMS) and data center infrastructure management (DCIM) systems can be overwhelming. During the design phase, we collaborate with facilities management (FM) to determine what information from system monitoring, feedback, and alarms is appropriate and will best help them manage the facility. We then follow up with FM during a "post-occupancy optimization" review to see if the feedback is appropriate and what changes need to be made.
Chadwick: Reliability and redundancy are the critical factors that impact the monitoring and control system design. Data center HVAC and electrical systems can be quite complex in terms of layers of backup systems and the need to control the systems in all scenarios. These needs can run counter to the principle of "keep it simple, stupid" (KISS), which is also quite important. There is often a balance to be struck between complicating a sequence of operations (SOO) to address every scenario (and perhaps gain every last percentage point of efficiency) and keeping the SOO simple. In most cases, the simple SOO is the best design.
CSE: What types of system integration and/or interoperability issues have you overcome, and how did you do so?
Sty: Referring back to an earlier question concerning trends in data closets in mixed-use buildings, the computer room air conditioner (CRAC) units serving main distribution frame (MDF) rooms are usually decoupled from the base BMS and operate independently to maintain data-hall environmental conditions. Because the CRAC unit controls are proprietary, we also put in additional stand-alone "monitoring sensors" for temperature and relative humidity as an overlay for facilities management. It may be a bit of a "belt-and-suspenders" approach, but given the critical nature of these facilities, the small incremental cost of a few sensors is nothing compared to the assurance that the room is maintained at the proper environmental conditions.
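The overlay-sensor idea above amounts to an independent range check on each reading. A minimal sketch follows; the default thresholds are assumptions loosely based on common data-hall envelopes (roughly 18 to 27°C dry bulb), and a real deployment would use the site's actual setpoints.

```python
# Hypothetical overlay check for stand-alone data-hall sensors.
# The default thresholds are illustrative assumptions, not a standard;
# substitute the site's actual environmental setpoints.

def check_reading(temp_c: float, rh_pct: float,
                  temp_range=(18.0, 27.0), rh_range=(20.0, 80.0)):
    """Return a list of alarm strings for one sensor reading."""
    alarms = []
    if not temp_range[0] <= temp_c <= temp_range[1]:
        alarms.append(f"temperature out of range: {temp_c:.1f} degC")
    if not rh_range[0] <= rh_pct <= rh_range[1]:
        alarms.append(f"relative humidity out of range: {rh_pct:.0f}%")
    return alarms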
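The overlay-sensor idea above amounts to an independent range check on each reading. A minimal sketch follows; the default thresholds are assumptions loosely based on common data-hall envelopes (roughly 18 to 27°C dry bulb), and a real deployment would use the site's actual setpoints.

```python
# Hypothetical overlay check for stand-alone data-hall sensors.
# The default thresholds are illustrative assumptions, not a standard;
# substitute the site's actual environmental setpoints.

def check_reading(temp_c: float, rh_pct: float,
                  temp_range=(18.0, 27.0), rh_range=(20.0, 80.0)):
    """Return a list of alarm strings for one sensor reading."""
    alarms = []
    if not temp_range[0] <= temp_c <= temp_range[1]:
        alarms.append(f"temperature out of range: {temp_c:.1f} degC")
    if not rh_range[0] <= rh_pct <= rh_range[1]:
        alarms.append(f"relative humidity out of range: {rh_pct:.0f}%")
    return alarms
```

Because this runs independently of the proprietary CRAC controls, it still raises an alarm if the unit's own controller fails silently.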
Hogge: We commonly see either IT or facilities departments requiring a specific protocol to interface with an existing system. We’ve recommended a few different solutions to enable the translation of the multiple protocols to interface with the required system. One example is the interface of cabinet-mounted intelligent power strips that commonly communicate via simple network management protocol (SNMP). Many facility managers find the data valuable for detailed power provisioning and cooling evaluation.
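The protocol-translation idea can be sketched as a thin shim that maps raw SNMP varbinds from a power strip into named, scaled points a BMS can consume. The OIDs and scaling factors below are hypothetical placeholders; real power strips publish vendor-specific MIBs, and a production shim would poll a live device rather than take a dictionary.

```python
# Illustrative protocol-translation shim: map SNMP varbinds from a
# rack power strip into named BMS points. The OIDs and scale factors
# are made-up placeholders, not values from any real vendor MIB.

OID_MAP = {
    "1.3.6.1.4.1.99999.1.1": ("rack_power_kw", lambda v: v / 1000.0),   # watts -> kW
    "1.3.6.1.4.1.99999.1.2": ("branch_current_a", lambda v: v / 10.0),  # deci-amps -> A
}

def translate(varbinds: dict) -> dict:
    """Convert raw SNMP integer values into scaled, named BMS points."""
    points = {}
    for oid, raw in varbinds.items():
        if oid in OID_MAP:
            name, scale = OID_MAP[oid]
            points[name] = scale(raw)
    return points
```

Unmapped OIDs are simply dropped, which keeps the BMS point list limited to what facilities actually wants to see.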
Chadwick: We are in the process of designing full data center BMS and monitoring-system networks to allow for a fully engineered system. In reviewing existing data centers, we often find single points of failure in the BMS network. In a recent site evaluation, we discovered a network router in a noncritical space (a small office IT closet) plugged into a standard wall outlet; had the power block been accidentally unplugged, the entire data center would have been shut down. Sometimes the simplest items can cause the downfall of an otherwise great design.
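Single points of failure like that router can also be found programmatically. A hedged sketch: model the BMS network as an undirected graph and flag articulation points, i.e., nodes whose loss disconnects the network. The device names below are hypothetical.

```python
# Find articulation points (cut vertices) in a BMS network graph using
# the classic DFS low-link algorithm. Node names are hypothetical.

def articulation_points(graph):
    """Return the set of nodes whose removal disconnects the graph."""
    disc, low, cuts = {}, {}, set()
    timer = [0]

    def dfs(u, parent):
        disc[u] = low[u] = timer[0]
        timer[0] += 1
        children = 0
        for v in graph[u]:
            if v == parent:
                continue
            if v in disc:                      # back edge
                low[u] = min(low[u], disc[v])
            else:                              # tree edge
                children += 1
                dfs(v, u)
                low[u] = min(low[u], low[v])
                if parent is not None and low[v] >= disc[u]:
                    cuts.add(u)
        if parent is None and children > 1:    # root with multiple subtrees
            cuts.add(u)

    for node in graph:
        if node not in disc:
            dfs(node, None)
    return cuts

# Hypothetical topology mirroring the anecdote: everything rides on
# one router in an office closet.
bms_net = {
    "head_end": ["office_router"],
    "office_router": ["head_end", "chiller_plant", "crah_gallery"],
    "chiller_plant": ["office_router"],
    "crah_gallery": ["office_router"],
}
```

Running the check on `bms_net` flags `office_router`; adding a redundant path around it would clear the result.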
CSE: What unique tools are data center owners including in their automation and controls systems?
Kosik: In my opinion, one of the most exciting developments in data center power and cooling is the advancement of highly integrated cooling and IT systems. The possibility of truly interdependent operation increases as cooling moves from loosely coupled (like a traditional air-cooled data center) to tightly coupled (like fully integrated water-cooled servers). As an example, close coupling allows for a much more precise control strategy because the water is in contact (via heat exchangers) with the computer processors, memory modules, and graphics processors. Being able to measure temperatures, along with the respective power states of the internal components, allows for tighter control of the water temperature and flow, which can lead to more accurate predictive control. Although this is one, albeit very simplified, example, there are many more that can be employed to make the data center more reliable and efficient.
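In the same simplified spirit as the example above, the tighter flow control enabled by close coupling can be sketched as a proportional loop that trims coolant flow against measured processor temperature. The setpoint, gain, and pump limits are made-up values for illustration only.

```python
# Simplified sketch of close-coupled cooling control: command coolant
# flow in proportion to the gap between measured CPU temperature and
# its setpoint. All gains and limits are illustrative assumptions.

def flow_command(cpu_temp_c: float, setpoint_c: float = 60.0,
                 base_lpm: float = 10.0, gain_lpm_per_c: float = 0.5,
                 min_lpm: float = 5.0, max_lpm: float = 30.0) -> float:
    """Proportional flow command (liters/minute), clamped to pump limits."""
    error = cpu_temp_c - setpoint_c
    return max(min_lpm, min(max_lpm, base_lpm + gain_lpm_per_c * error))
```

A real implementation would add integral action and feed in component power states for predictive adjustment, as described above, but the proportional core is the same.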
Chadwick: As with the rest of data center designs, redundancy and reliability without overly complicating things is the best approach. Adding layers of additional controls and features can overly complicate what need only be a simple control strategy. That said, we are seeing amazing tools for tracking and trending data being developed that will be able to monitor for impending failures. For example, acoustic meters can monitor and sense changes in server noise, which may indicate an impending disk-drive or fan-bearing failure. These advanced diagnostic tools are being marketed and have been incorporated into a select few data centers at this point, but the trend is growing.
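The acoustic-trending idea reduces to comparing each new reading against a rolling baseline. A minimal sketch follows; the window size and deviation band are illustrative assumptions, not values from any vendor's product.

```python
# Hedged sketch of acoustic failure trending: flag a server when its
# noise level drifts beyond a fixed band around its rolling baseline.
# Window size and band are illustrative assumptions.

from collections import deque

class AcousticMonitor:
    def __init__(self, window: int = 5, band_db: float = 3.0):
        self.readings = deque(maxlen=window)
        self.band_db = band_db

    def update(self, level_db: float) -> bool:
        """Record a reading; return True if it deviates from the baseline."""
        if len(self.readings) == self.readings.maxlen:
            baseline = sum(self.readings) / len(self.readings)
            anomalous = abs(level_db - baseline) > self.band_db
        else:
            anomalous = False  # still building the baseline
        self.readings.append(level_db)
        return anomalous
```

A sudden jump in fan noise trips the check, while slow seasonal drift is absorbed into the moving baseline.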
Hogge: We have clients who are using their traditional BAS to assume some of the roles of a DCIM system. The important design consideration is to clearly define the areas where the BAS stops and a DCIM system picks up. For some clients, the BAS will feed data into the DCIM system, which acts as a single interface for monitoring all MEP infrastructure, security, IT hardware, and network infrastructure.
Sty: DCIM software can provide an overlay and bridge the gap between operations, facility management, and IT. It has become a very powerful tool in data center operation and capital-asset management.
CSE: How has the convergence of automation and controls affected the design of a data center or data closet within a building?
Chadwick: As noted above, advanced controls can allow for complex control scenarios, which may be necessary to achieve ultimate efficiencies. However, these can come at the expense of reliability. A careful evaluation of the costs (in reliability) versus the benefits (in efficiency) needs to be performed as new system designs are incorporated into the mission-critical field.
Hogge: The lines become blurred in many respects between what a traditional BAS and a DCIM system offer. Depending on the vendor, there can be many overlapping capabilities between the systems. However, the benefit is being able to have more insight into the operating characteristics of the equipment and systems serving the critical loads.
Eichelman: With the increasing convergence of IT equipment and systems with the physical plant, DCIM systems are becoming more common in data centers. These systems allow the IT assets, as well as the physical plant that supports the IT environment, to be monitored, managed, and controlled from a single integrated system overlay. DCIM systems provide the ability to optimize the operation of the IT equipment, in conjunction with the electrical and mechanical systems, to ensure the continued reliability of the facility while minimizing energy consumption.