Avoid data center failure

Emerson Network Power Announces Best Practices to Avoid Data Center Failure by Human Error

08/12/2010


Emerson Network Power has released Best Practices to Avoid Data Center Failure by Human Error. Given the risk that human error poses to the data center, those who enter these facilities must adhere to rules and policies to prevent disasters.

 

Best Practices to Avoid Data Center Failure by Human Error:

1.     Shielding Emergency OFF Buttons – Emergency OFF buttons are generally located near doorways in the data center. Often, these buttons are not covered or labeled, and are mistakenly pressed during an emergency, which shuts down power to the entire data center. Labeling and covering emergency OFF buttons prevents someone from pushing one accidentally.

2.     Documented Method of Procedure – A documented method of procedure guards against many unforeseen human errors. This step-by-step, task-oriented procedure mitigates or eliminates the risk associated with performing maintenance. Do not limit the procedure to one vendor, and ensure back-up plans are included in case of unforeseen events.

3.     Correct Component Labeling – If protection devices, such as circuit breakers, are not labeled correctly, the mistake can directly jeopardize keeping the data center load up. To operate a power system correctly and safely, all switching devices, as well as the facility one-line diagram, must be labeled correctly to ensure the correct sequence of operation. Procedures should be in place to double-check device labeling.

4.     Consistent Operation of the System – Sometimes data center managers get too comfortable with operating the systems and do not follow procedures: they forget or skip steps, or perform the procedure from memory, and inadvertently shut down the wrong equipment. It is critical to keep all operational procedures up to date and to follow the written instructions when operating the system.

5.     Ongoing Personnel Training – Ensure all individuals with access to the data center, including IT, emergency, security and facility personnel, have basic knowledge of equipment so that it’s not shut down by mistake.

6.     Secure Access Policies – Organizations without data center sign-in policies run the risk of security breaches. Having a sign-in policy that requires an escort for visitors, such as vendors, will enable data center managers to know who is entering and exiting the facility at all times.

7.     Enforcing Food/Drinks Policies – Liquids pose the greatest risk of shorting out critical computer components. The best way to communicate your data center’s food/drink policy is to post a sign outside the door that states what the policy is and how vigorously it is enforced.

8.     Avoiding Contaminants – Poor indoor air quality can allow unwanted dust particles and debris to enter servers and other IT infrastructure. Much of the problem can be alleviated by having all personnel who access the data center wear antistatic booties, or by placing a mat outside the data center. Equipment should also be packed and unpacked outside the data center, because moving boxed equipment inside increases the chances that fibers from boxes and skids will end up in server racks and other IT infrastructure.
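
As an illustration of practices 2 and 4 above, a documented method of procedure can be treated as an ordered checklist that refuses skipped or out-of-order steps. The following is a minimal sketch only, not part of the Emerson Network Power guidance: the class name `MethodOfProcedure`, its methods, and the sample maintenance steps are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MethodOfProcedure:
    """Hypothetical checklist that only lets steps be confirmed in order."""
    name: str
    steps: List[str]
    completed: List[str] = field(default_factory=list)

    def next_step(self) -> Optional[str]:
        """Return the next unconfirmed step, or None when all are done."""
        if len(self.completed) < len(self.steps):
            return self.steps[len(self.completed)]
        return None

    def confirm(self, step: str) -> None:
        """Record a step as done; refuse skipped or out-of-order steps."""
        expected = self.next_step()
        if expected is None:
            raise ValueError("procedure already complete")
        if step != expected:
            raise ValueError(f"expected {expected!r}, got {step!r}")
        self.completed.append(step)

    @property
    def is_complete(self) -> bool:
        return len(self.completed) == len(self.steps)


# Illustrative steps only (assumed, not taken from any real procedure).
mop = MethodOfProcedure(
    name="Example breaker maintenance",
    steps=[
        "Verify breaker label against one-line diagram",
        "Transfer load to alternate source",
        "Open breaker",
    ],
)

mop.confirm("Verify breaker label against one-line diagram")
try:
    mop.confirm("Open breaker")  # attempts to skip the load transfer
except ValueError as err:
    print(f"Blocked: {err}")
```

In this sketch the attempt to open the breaker before transferring the load is rejected, which mirrors the point that procedures should be followed as written and not performed from memory.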