Designing efficient data centers: Automation and controls

In today’s digital age, businesses rely on efficient, reliable, and secure operations, especially in mission critical facilities such as data centers. Here, engineers with experience on such structures share advice and tips on ensuring project success with regard to automation and controls.

By Consulting-Specifying Engineer April 25, 2018

Respondents

Doug Bristol, PE, Electrical Engineer, Spencer Bristol, Peachtree Corners, Ga.
Terry Cleis, PE, LEED AP, Principal, Peter Basso Associates Inc., Troy, Mich.
Scott Gatewood, PE, Project Manager/Electrical Engineer/Senior Associate, DLR Group, Omaha, Neb.
Darren Keyser, Principal, kW Mission Critical Engineering, Troy, N.Y.
Bill Kosik, PE, CEM, LEED AP, BEMP, Senior Engineer – Mission Critical, exp, Chicago
Keith Lane, PE, RCDD, NTS, LC, LEED AP BD&C, President, Lane Coburn & Associates LLC, Seattle
John Peterson, PE, PMP, CEM, LEED AP BD+C, Program Manager, AECOM, Washington, D.C.
Brandon Sedgwick, PE, Vice President, Commissioning Engineer, Hood Patterson & Dewar Inc., Atlanta
Daniel S. Voss, Mission Critical Technical Specialist, M.A. Mortenson Co., Chicago


CSE: From your experience, what systems within a data center are benefiting from automation that previously might not have been?

Peterson: With automation, the data center’s overall cooling operations can be driven by operational history to widen the opportunity for free cooling, rather than relying solely on the operations staff. However, automated systems will need to learn what the operators have already experienced: system reaction times, changes to the overall power-use profiles, and adjusting for anticipated issues. In many ways, automation needs to be “mentored” until it has enough on-the-job experience to be trusted 100% with decisions.
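
To make the “mentored automation” idea concrete, here is a minimal sketch of a free-cooling decision routine that derives its changeover threshold from logged operating history and, while in an advisory phase, only recommends a mode for operators to confirm. The temperatures, guard band, and function names are illustrative assumptions, not taken from any particular control system.

from statistics import mean

def free_cooling_decision(outdoor_temp_f, history_f, advisory=True):
    """Recommend economizer (free-cooling) mode from outdoor temperature.

    history_f: past outdoor temperatures (deg F) logged while free cooling
    ran without supply-air setpoint violations. The threshold is derived
    from that history rather than a fixed design value.
    """
    # Learned threshold: average of conditions that worked, minus a guard band.
    threshold_f = mean(history_f) - 2.0 if history_f else 55.0  # fallback setpoint
    use_free_cooling = outdoor_temp_f <= threshold_f

    if advisory:
        # "Mentored" phase: log the recommendation for operators to confirm.
        print(f"Recommend free cooling: {use_free_cooling} "
              f"(outdoor {outdoor_temp_f} F vs. threshold {threshold_f:.1f} F)")
        return None
    return use_free_cooling

# Example: during the mentoring period, operators still confirm the decision.
free_cooling_decision(52.0, history_f=[48, 50, 53, 55, 57])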

Voss: Some data centers are almost hands-off, with no human interaction unless the automated systems can’t correct the situation themselves. In addition, the ability to control all electrical and mechanical equipment from one central location can provide the operations team a level of control unmatched elsewhere in other types of buildings.

Cleis: Based on the feedback we have received, power metering and temperature monitoring at the racks are beneficial to owners and data center managers. This information is also valuable to design engineers. It further allows owners and managers to react when they see changes in the data, helping prevent what could become larger issues. Like many things, having more real-time data can provide benefits to technically sophisticated owners, managers, and engineers.
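
As a rough illustration of how owners and managers might react to changes in that rack-level data, the sketch below compares each new power and inlet-temperature reading against a rolling baseline and raises an alert on a sudden deviation. The readings, deviation bands, and class name are hypothetical; a real implementation would pull values from the site’s metering interface (Modbus, SNMP, or a DCIM API).

from collections import deque

class RackMonitor:
    """Flag sudden deviations in rack power (kW) and inlet temperature (deg F)."""

    def __init__(self, window=60, power_band=0.20, temp_band_f=4.0):
        self.power_hist = deque(maxlen=window)   # rolling baseline of readings
        self.temp_hist = deque(maxlen=window)
        self.power_band = power_band             # fractional deviation allowed
        self.temp_band_f = temp_band_f           # absolute deviation allowed

    def check(self, power_kw, inlet_temp_f):
        alerts = []
        if self.power_hist:
            baseline = sum(self.power_hist) / len(self.power_hist)
            if abs(power_kw - baseline) > self.power_band * baseline:
                alerts.append(f"power deviation: {power_kw:.1f} kW vs. {baseline:.1f} kW baseline")
        if self.temp_hist:
            baseline = sum(self.temp_hist) / len(self.temp_hist)
            if abs(inlet_temp_f - baseline) > self.temp_band_f:
                alerts.append(f"inlet temp deviation: {inlet_temp_f:.1f} F vs. {baseline:.1f} F baseline")
        self.power_hist.append(power_kw)
        self.temp_hist.append(inlet_temp_f)
        return alerts

# Example with made-up readings; a real loop would poll the rack PDU and sensors.
monitor = RackMonitor()
for reading in [(4.1, 72.0), (4.2, 72.5), (4.0, 72.2), (5.4, 78.0)]:
    for alert in monitor.check(*reading):
        print(alert)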

Sedgwick: Any system can benefit from automation, especially critical systems with inherent complexity and danger. When designed and implemented properly, automation provides tighter process control with less human interaction, which improves efficiency, predictability, and safety while reducing the risk of human error. In a data center, that translates not only to improving safety and maximizing uptime but also to reducing burdens on higher-value facilities staff. Many facility operators are responsible for multiple sites, each containing dozens of complex systems. Automation empowers facility staff by eliminating costly and risky manual tasks without sacrificing safety and reliability. Programmable logic controllers and microcontrollers produce the same results every time regardless of complexity, frequency, or fatigue-inducing monotony.

Another benefit of automation is that real-time and historical data is collected. Facility operators can then analyze this data to predict potential failures, facilitate preventive maintenance, determine equipment lifecycles, and evaluate other useful metrics.
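
One simple form that analysis can take is trending a logged reading toward an alarm threshold to estimate the remaining margin before maintenance is needed. The sketch below fits a least-squares slope to hypothetical historian samples; the data points, threshold, and equipment named are assumptions for illustration only.

def days_until_threshold(samples, threshold):
    """Estimate days until a trended reading crosses a threshold.

    samples: list of (day_number, reading) pairs from the historian.
    Uses a simple least-squares slope; real analysis would be more robust.
    """
    n = len(samples)
    mean_x = sum(d for d, _ in samples) / n
    mean_y = sum(v for _, v in samples) / n
    slope = sum((d - mean_x) * (v - mean_y) for d, v in samples) / \
            sum((d - mean_x) ** 2 for d, _ in samples)
    if slope <= 0:
        return None  # no upward trend toward the threshold
    latest_day, latest_value = samples[-1]
    return (threshold - latest_value) / slope

# Hypothetical bearing-temperature trend on a CRAH fan (deg F, sampled monthly).
history = [(0, 118), (30, 121), (60, 125), (90, 128)]
print(days_until_threshold(history, threshold=140))  # days of margin remaining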

Effective automation depends on programming quality, equipment capabilities, effective integration and communication between automated systems, and a thorough commissioning effort to document and verify system performance. Commissioning also instills confidence in the mind of the owner that the automated infrastructure can be trusted to perform as intended. It’s surprising how many facilities invest in automation but continue to operate manually simply due to a lack of confidence.

CSE: The Internet of Things (IoT) is increasingly on owners’ minds. How have your engineers worked with building owners and managers to implement such integrated technology?

Voss: BIM models have been used by customers to train their new team members. These models can be uploaded to the cloud and made available on mobile devices throughout the facility.

CSE: When incorporating IoT, what are some of the most pressing challenges?

Voss: It’s a challenge for all stakeholders to come to a meeting and establish a complete list of what needs to be integrated, and what should be integrated, and then to determine the best path to make it a reality. Keeping every piece of the integration secure from external and internal breaches keeps many IT professionals up at night.

CSE: Cybersecurity and vulnerability is an increasing concern—are you encountering worry/resistance around wireless technology and IoT as the prevalence of such features increases? How are you responding to these concerns?

Peterson: There is resistance to using wireless technology of any kind in the data center, including temporary/adjustable sensors that only provide monitoring and no control. With ever-increasing changes in software and technology, even highly secured, proprietary, and encrypted wireless solutions face scrutiny from clients that previously were unconcerned. Through lessons learned by others, we have kept the incorporation of equipment security into the control systems for power and cooling to a minimum.

CSE: What design features should be incorporated to help prevent cyber attacks?

Gatewood: Since the Stuxnet blowback beginning in 2010, and as documented in the Ponemon Institute study “Cost of Data Center Outages,” cybercriminal activity has quickly risen to compete for the No. 1 cause of outages. According to the study, UPS failures account for 25% of outages, cyber attacks for 22%, human error for 22%, cooling failure for 11%, weather for 10%, generator failure for 6%, and IT equipment failure for 4%.

Depending on the organization’s IT work product, the cost of a cyber attack ranges from roughly $80,000 to more than $2.5 million per event (in 2016 costs), excluding soft costs such as reputational losses. Cyber crime accounted for only 2% of outages in 2012 but rose to 22% in 2016. To combat the threat, we assess and reassess the connectivity and the specific control needs of each interconnected system. Vendor remote equipment access via simple network management protocol (SNMP) and RS-485 is no longer a given. Clear coordination with industry leaders places these features behind firewalls and limits access to those who need to know.

Voss: Beyond working with IT departments to create network addresses using customer naming requirements, we have not worked with design features that help prevent cyber attacks.

CSE: What types of system integration and/or interoperability issues have you overcome in data centers, and how did you do so?

Voss: Real-time status reporting of electrical and mechanical equipment, along with the position of every associated circuit breaker and valve, has been, and still is, a key hurdle to overcome. At one enterprise site, the maximum amount of time allowed to observe a changed status via either the electrical power monitoring system (EPMS) or building automation system (BAS) was specified to be only 7 seconds. After Level 4 commissioning was completed, we tested the EPMS and BAS, which indicated that the longest time to verify a changed status was 1 minute 28 seconds, not even close. We pulled the design and construction teams together and started to analyze each portion of both systems. We agreed to add more gateways, which cut the quantity of reporting points per gateway in half. We also reviewed the wiring and decided to decrease the gauge of wire even on longer runs, which goes against voltage-drop common sense. After a few more timing events, as the time dropped below 1 minute to 37 seconds, the teams continued researching and doing trial-and-error work to improve the speed of communication between the various devices. This brought the actual time down to 21 seconds (from the most remote device), which was the fastest we ever achieved. The owner and design team agreed to accept the 21 seconds in lieu of the specified 7 seconds.
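
A simplified sketch of how that kind of status-propagation timing might be spot-checked during commissioning: start a timer when the breaker or valve is operated in the field, then poll the monitoring system until the new state appears. The read function, point name, and simulated delay below are placeholders for a site’s actual EPMS/BAS interface; only the 7-second criterion comes from the example above.

import time

def measure_propagation(read_point, point_name, expected_state, limit_s=7.0, poll_s=0.25):
    """Poll a monitoring-system point until it shows the expected state.

    read_point(point_name) is a site-specific stand-in for the EPMS/BAS read;
    call this function immediately after the device is operated in the field.
    """
    start = time.monotonic()
    while read_point(point_name) != expected_state:
        if time.monotonic() - start > 300:        # give up after 5 minutes
            raise TimeoutError(f"{point_name} never showed {expected_state}")
        time.sleep(poll_s)
    elapsed = time.monotonic() - start
    print(f"{point_name}: {elapsed:.1f} s "
          f"({'PASS' if elapsed <= limit_s else 'FAIL'} against {limit_s} s criterion)")
    return elapsed

# Example with a simulated point that changes state after a few seconds.
_change_at = time.monotonic() + 3.0
def simulated_read(name):
    return "OPEN" if time.monotonic() >= _change_at else "CLOSED"

measure_propagation(simulated_read, "BKR-2A status", "OPEN")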

Sedgwick: The number of system integrations needed in any given data center can be staggering; mechanical and electrical systems that were once isolated from each other are now often integrated. However, many systems still depend on legacy protocols, such as Modbus, LON, BACnet MS/TP, and others. These older protocols are especially challenging due to a lack of standardization: every manufacturer has its own little twist on field mapping and communication conventions. We overcome this by maintaining our own libraries of “known good” integration databases that can be repeated from site to site, which has allowed us to mitigate issues that were once commonplace. With its faster data-transfer speeds, Ethernet communication has had a profound impact on system integration, communication, and remote monitoring and remediation. Distributed control systems, in particular, have benefited from this advancement, since they typically use redundant master processors that enable centralized programming and decision-making, rather than multiple controllers relying on timers and inconsistent programming methods to synchronize. While communication speed has improved, there is still a need for communication protocol standardization.
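
The “known good” integration library Sedgwick describes can be as simple as a reusable map that normalizes each vendor’s register conventions to a common point schema. In the sketch below, every device name, register address, and scale factor is hypothetical and stands in for whatever a site’s actual Modbus documentation specifies.

# One entry in a reusable "known good" integration library: vendor-specific
# Modbus register maps normalized to common point names. Every address, scale
# factor, and device name here is hypothetical, for illustration only.
REGISTER_MAPS = {
    "vendor_a_pdu": {
        "kw_total":     {"register": 1000, "scale": 0.1},
        "volts_ab":     {"register": 1010, "scale": 0.1},
        "breaker_open": {"register": 1100, "scale": 1},
    },
    "vendor_b_pdu": {
        # Same physical quantities, different registers and scaling.
        "kw_total":     {"register": 40021, "scale": 0.001},
        "volts_ab":     {"register": 40035, "scale": 0.01},
        "breaker_open": {"register": 40101, "scale": 1},
    },
}

def normalize(device_type, raw_registers):
    """Translate raw register reads into a common point dictionary."""
    mapping = REGISTER_MAPS[device_type]
    return {point: raw_registers[spec["register"]] * spec["scale"]
            for point, spec in mapping.items()}

# Example: downstream monitoring code sees the same point names for any vendor.
raw = {40021: 48200, 40035: 48010, 40101: 0}
print(normalize("vendor_b_pdu", raw))  # {'kw_total': 48.2, 'volts_ab': 480.1, 'breaker_open': 0}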

CSE: As data centers evolve and become more advanced, what kind of automation solutions can be implemented so data centers can be managed more proactively to prevent disruptions and increase efficiencies?

Peterson: Weather data is one of the main sources of information that automation can use to better manage overall operations. It allows the building as a whole to review past performance and find ways to reduce peaks and overall energy use. Tracking storms and other events that can affect the data center also helps everyone prepare for the worst and recover quickly. Beyond weather, automation can track the operational status of all of the equipment and can act beyond a typical BAS to be more proactive with maintenance and replacements. As the automation system learns about peak operations, it can also direct staff as to the best times to take equipment out of service, perform construction, or even swap out IT equipment.
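
A minimal sketch of weather-informed planning along these lines: scan an hourly forecast for hours cool enough for free cooling and pick the coolest consecutive block as a candidate maintenance window. The forecast values, changeover limit, and window length are made-up assumptions; a real system would pull the forecast from a weather service and weigh IT load as well.

def plan_from_forecast(hourly_forecast_f, free_cooling_limit_f=55.0, window_hours=4):
    """Pick free-cooling hours and a candidate maintenance window from a forecast.

    hourly_forecast_f: list of (hour, outdoor temp deg F) pairs; the values used
    below are fabricated for the example.
    """
    free_cooling_hours = [h for h, t in hourly_forecast_f if t <= free_cooling_limit_f]

    # Candidate maintenance window: the coolest consecutive block of hours,
    # when the mechanical plant has the most headroom.
    best_start, best_avg = None, float("inf")
    temps = [t for _, t in hourly_forecast_f]
    for i in range(len(temps) - window_hours + 1):
        avg = sum(temps[i:i + window_hours]) / window_hours
        if avg < best_avg:
            best_start, best_avg = hourly_forecast_f[i][0], avg
    return free_cooling_hours, best_start

# Example: a simple day profile that warms up between 10:00 and 18:00.
forecast = [(h, 50 + 12 * (1 if 10 <= h <= 18 else 0)) for h in range(24)]
print(plan_from_forecast(forecast))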

Gatewood: We have been exploring adaptive Power over Ethernet-enabled damper/actuators to tie airflow requirements to the actual demands of the rack or racks they support. This will lead to optimized pressure and airflow management, placing cooling exactly where it’s needed as the environment changes. Analytics and low-cost sensors are making the approach appealing.
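
A rough sketch of the control idea, assuming airflow demand scales with rack heat load: map measured rack power to a damper position command with a minimum open position. The design load, minimum position, and scaling are illustrative assumptions; an actual sequence would also trim on rack inlet temperature and plenum pressure.

def damper_command(rack_kw, design_kw, min_open_pct=15.0):
    """Map a rack's measured power draw to a damper position command (0-100%).

    Assumes airflow demand scales roughly with heat load; values are illustrative.
    """
    demand_pct = 100.0 * rack_kw / design_kw if design_kw > 0 else 100.0
    return max(min_open_pct, min(100.0, demand_pct))

# Example: a lightly loaded rack holds the minimum position; a full rack gets 100%.
for kw in (0.5, 4.0, 9.5):
    print(f"{kw} kW -> damper {damper_command(kw, design_kw=8.0):.0f}%")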

Voss: Using automation to fully operate the electrical and mechanical support systems for daily operations as well as during an emergency would help improve operations. The facilities staff should diligently monitor the automation systems to maintain 100% uptime.