Keeping critical power reliable

Preventive maintenance and proactive replacement are key to UPS reliability.

By Henry Hu, Emerson Network Power Liebert Services, Westerville, Ohio, August 17, 2011

As organizations become increasingly dependent on data center systems, there is a need for greater reliability in critical power systems. For many organizations, the IT infrastructure has evolved into an interdependent, business-critical network that includes data, applications, storage, servers, and networking. A power failure at any point along the network can affect the entire operation—and have serious consequences for the business.

However, a program of scheduled preventive maintenance (PM) and proactive replacement of key components of the UPS greatly reduces the chances for failure during power outages, utility spikes, switching transients, incidents of line noise, and other unexpected power-related issues.

In fact, a study conducted by Emerson Network Power of the impact of PM on UPS reliability revealed that the mean time between failures (MTBF) for units that received two PM service visits a year is 23 times higher than for units with no PM visits. According to the study, reliability continued to increase steadily with additional visits when they were conducted by highly trained technicians.
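To give a rough sense of what a 23-fold MTBF improvement can mean in practice, the sketch below converts MTBF into the probability of at least one failure in a year. It assumes a constant failure rate (an exponential model), which is a simplification and not something the study states. The baseline MTBF of 50,000 hours is a hypothetical figure chosen only for illustration; only the factor of 23 comes from the study.

```python
import math

HOURS_PER_YEAR = 8760


def annual_failure_probability(mtbf_hours: float) -> float:
    """Probability of at least one failure in a year.

    Assumes a constant failure rate (exponential model), which is a
    simplifying assumption and not a claim made by the study.
    """
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


# Hypothetical baseline MTBF for a unit with no PM visits (illustration only).
baseline_mtbf_hours = 50_000.0

# Improvement factor reported by the study for two PM visits per year.
maintained_mtbf_hours = 23 * baseline_mtbf_hours

p_unmaintained = annual_failure_probability(baseline_mtbf_hours)
p_maintained = annual_failure_probability(maintained_mtbf_hours)

print(f"No PM visits:        {p_unmaintained:.1%} chance of failure per year")
print(f"Two PM visits a year: {p_maintained:.1%} chance of failure per year")
```

Under these assumptions the annual failure probability shrinks by roughly the same order as the MTBF grows, but the exact figures depend entirely on the baseline chosen.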

While UPS systems are designed to offer power stability and protection at an affordable price, they are not failure-proof. Factors such as application, installation, design, real-world operating conditions, and maintenance practices can impact the reliability and performance of these systems.

A proactive view of service and maintenance in the data center can deliver additional efficiencies. Making business decisions with the goal of minimizing service-related issues may result in additional expense up front, but it could translate into an overall decrease in the cost of ownership throughout the lifecycle of the equipment. 

Original equipment manufacturer (OEM)-recommended maintenance and replacement programs can greatly enhance the availability of your systems and minimize unit-related issues. Well-implemented maintenance programs help maximize the reliability of data center equipment by providing systematic inspections that can detect and correct incipient failures, either before they occur or before they develop into major defects that result in costly downtime.

PM has a number of benefits for the end-user. It provides the means to proactively identify areas within the system that could potentially fail and impact the equipment being supported downstream. When implemented on a regular basis, it helps to extend the product lifecycle and optimize capital expenditures for the equipment. In addition, risk management provided at a fixed cost aids in budget preparation and promotes fiscal responsibility.

PM frequency

Typical PM programs include inspections, tests, measurements, adjustments, parts replacement, and housekeeping practices. Based on the study referenced above, at least two PM visits per year are recommended, but the study also makes the case for more maintenance visits for facilities that require higher levels of availability.

The frequency of PM visits also depends on the type of UPS being used in the organization. Small UPS devices should be inspected annually to ensure alarms, filtering, and internal batteries are all operating within specifications. For medium and large systems, which most likely include ancillary equipment, it’s recommended that inspection and maintenance take place at least twice a year to ensure proper function and confirmation that the system is operating within the manufacturer’s specifications. 
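The frequency guidance above could be captured in a small lookup, sketched below. The size-class labels and the `extra_visits` parameter are hypothetical names introduced for illustration; the article gives only the minimums (annual for small units, at least twice a year for medium and large systems) and notes that facilities needing higher availability may warrant more visits, without saying how many.

```python
# Minimum PM visits per year by UPS size class, as described in the article.
MINIMUM_VISITS_PER_YEAR = {
    "small": 1,   # inspected annually
    "medium": 2,  # at least twice a year
    "large": 2,   # at least twice a year
}


def planned_pm_visits(size_class: str, extra_visits: int = 0) -> int:
    """Return the planned number of PM visits per year.

    `extra_visits` is a hypothetical knob for sites that need higher
    availability; the right value is a site decision, not an article figure.
    """
    if size_class not in MINIMUM_VISITS_PER_YEAR:
        raise ValueError(f"unknown UPS size class: {size_class!r}")
    if extra_visits < 0:
        raise ValueError("extra_visits must not be negative")
    return MINIMUM_VISITS_PER_YEAR[size_class] + extra_visits


print(planned_pm_visits("small"))
print(planned_pm_visits("large", extra_visits=2))
```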

Semiannual service

Typical tasks performed during a semiannual service visit include:

  • Checking all breakers, connections, and associated controls, including their temperature. Discoloration of a component is usually a key sign of hot spots.
  • Visually inspecting subassemblies, wiring harnesses, contacts, cables, and major components and ensuring that all the assemblies are intact.
  • Checking air filters for cleanliness. If the fan’s airflow is restricted, it can greatly reduce life expectancy.
  • Checking circuit boards for signs of discoloration due to heat.
  • Checking power capacitors for swelling or leaking oil, and checking dc capacitor vent caps for extrusion of more than 1/8 in.
  • Recording all voltage and current meter readings to ensure they are within specification, and making adjustments as required.
  • Measuring and recording harmonic trap filter currents. If certain readings are out of specification, something is wrong with the unit.
  • Checking inverter and rectifier snubbers for burnt or broken wires. Wire insulation can become brittle from heat exposure.
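
Several of the tasks above come down to recording a reading and comparing it with the manufacturer's specification. A minimal sketch of that comparison follows. The reading names, nominal values, and tolerances are hypothetical placeholders; real limits come from the OEM's documentation for the specific unit.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    name: str
    measured: float
    nominal: float
    tolerance_pct: float  # allowed deviation from nominal, in percent


def out_of_spec(readings: list[Reading]) -> list[str]:
    """Return the names of readings that deviate beyond their tolerance."""
    flagged = []
    for r in readings:
        deviation_pct = abs(r.measured - r.nominal) / r.nominal * 100.0
        if deviation_pct > r.tolerance_pct:
            flagged.append(r.name)
    return flagged


# Hypothetical values recorded during a semiannual visit (illustration only).
readings = [
    Reading("output_voltage_ab", measured=481.0, nominal=480.0, tolerance_pct=2.0),
    Reading("trap_filter_current_a", measured=31.0, nominal=25.0, tolerance_pct=10.0),
]

print(out_of_spec(readings))  # ['trap_filter_current_a']
```

Keeping the recorded values from each visit also makes it possible to spot gradual drift between visits, not just readings that are already out of specification.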

Annual service

Typical tasks performed during an annual service call include all the tasks done during a semiannual visit plus the following:

  • Checking all nuts, bolts, screws, and connectors for tightness and heat discoloration.
  • Verifying continuity of fuses on the dc capacitor deck (if applicable).
  • Performing, with customer approval, operational tests of the system, including unit transfer to battery and battery discharge.
  • Calibrating the system as required to bring it within specifications.
  • Installing any engineering field change notices (FCN) as needed to ensure the equipment is up to date.
  • Measuring and recording all low-voltage power supply levels to ensure they are correct.
  • Measuring and recording phase-to-phase input voltage and currents.
  • Reviewing system performance with the customer to address any questions and to schedule repairs.

Periodic replacement

The reliability of a system is directly impacted by the shortest component life in the unit. Some OEMs address this issue by reducing the number of components that need to be replaced, thus decreasing the chances of failure. The reality is that failures still occur; therefore, being proactive with replacement can greatly reduce your chances for downtime.

The UPS contains components that have a limited operating life, which is why a proactive replacement approach is crucial. The most common approach is to replace only those components that show signs of aging during a scheduled preventive maintenance event, or to replace failed ones during a downtime event. Now, companies are taking the more proactive measure of replacing life-limited components, such as capacitors and fans, based on their operating life and site operating conditions.

The properties of capacitors, fans, and other electronic components within the UPS are adversely affected when the temperature rises above the range the components were designed for. Therefore, periodic replacement of these components helps to increase system availability and reduce the chances of downtime.
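
One way to turn operating life and site temperature into a replacement decision is sketched below. It relies on a commonly cited rule of thumb for electrolytic capacitors, namely that expected life roughly halves for every 10 °C rise in operating temperature. That rule is an assumption brought in for illustration and does not come from this article. The rated-life figures and the 80% replacement margin are likewise hypothetical, and the OEM's recommended replacement intervals should take precedence.

```python
def derated_life_hours(rated_life_hours: float,
                       rated_temp_c: float,
                       actual_temp_c: float) -> float:
    """Estimate component life at the actual operating temperature.

    Uses the rule-of-thumb assumption that life halves for every 10 °C
    above the rated temperature (and doubles for every 10 °C below it).
    """
    return rated_life_hours * 2 ** ((rated_temp_c - actual_temp_c) / 10.0)


def replacement_due(operating_hours: float,
                    rated_life_hours: float,
                    rated_temp_c: float,
                    actual_temp_c: float,
                    margin: float = 0.8) -> bool:
    """Flag a component once it has used `margin` of its estimated life.

    `margin` is a hypothetical safety factor, not an OEM figure.
    """
    estimated_life = derated_life_hours(rated_life_hours, rated_temp_c, actual_temp_c)
    return operating_hours >= margin * estimated_life


# Hypothetical capacitor: rated for 60,000 hours at 40 °C, 25,000 hours in service.
print(replacement_due(25_000, 60_000, rated_temp_c=40, actual_temp_c=40))  # False
print(replacement_due(25_000, 60_000, rated_temp_c=40, actual_temp_c=50))  # True
```

The same component that has years of life left in a cool room is already due for replacement in a room running 10 °C warmer, which is why site operating conditions matter as much as hours in service.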

Taking the right approach

Maintenance visits and proactive replacement programs have a substantial impact on system reliability, which matters more than ever as companies look to cut costs while maintaining efficiency and business continuity in the data center. With today’s heavy reliance on technology and automated systems, disruptions in the data center can have severe impacts on the business. The business case for this service is stronger now than ever before. PM and replacement programs maximize the reliability and performance of the UPS systems on which organizations depend to keep critical systems running.

Hu is UPS service product manager at Emerson Network Power Liebert Services. Hu is responsible for the creation, development, and maintenance of all service offerings related to UPS and power equipment manufactured by Liebert North America.