Redundancy in Control


By Gint Burokas February 1, 1998

How is redundancy implemented for PC-based control? The first step is defining a physical interface to the real world that will provide multiple computers controlling the same I/O system.

One way to provide multiple “controllers” is to implement a “mirroring backup” so that another system also collects all data. If the system is controlling in real-time, having more than one CPU in the system is ideal, creating a bumpless control system. This is a common requirement for redundancy.

For example, if the CPU fails in a system controlling a furnace, waiting for a replacement might destroy the product. Continual process and CPU data exchange creates redundancy to ensure system reliability.

Taking advantage of Microsoft Windows NT robustness and including features for redundancy in the software are key challenges for PC-based control in factory-floor automation. To provide redundancy, PC-based control software acknowledges more than one CPU on the system, whether control is distributed or centralized.

Redundancy is achieved by updating one CPU with all I/O states, while another CPU interrupts to do control. Through this transition, the software can identify which CPU is doing the control. When PC-based control runs on two systems, the systems need to share data: one does the control and the other receives data. The two CPUs cannot compete with each other. CPU “hand shaking” tells the control software which CPU system is operating.
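The arrangement described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not any vendor's implementation: the `Controller` and `RedundantPair` classes, the `healthy` flag, and the rule that the first CPU starts in control are all hypothetical names and choices made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Controller:
    """Hypothetical stand-in for one CPU running the control software."""
    name: str
    io_image: Dict[str, float] = field(default_factory=dict)
    healthy: bool = True  # assumed to be set by some external diagnostic


class RedundantPair:
    """Two controllers sharing one I/O system; only one controls at a time."""

    def __init__(self, a: Controller, b: Controller) -> None:
        self.controllers: List[Controller] = [a, b]
        # Assumption for this sketch: the first CPU starts in control.
        self.active: Optional[Controller] = a

    def handshake(self) -> Optional[Controller]:
        """Agree on which CPU controls; switch only if the active one is unhealthy."""
        if self.active is None or not self.active.healthy:
            self.active = next((c for c in self.controllers if c.healthy), None)
        return self.active

    def scan(self, inputs: Dict[str, float]) -> Optional[str]:
        """One scan: mirror I/O states to every healthy CPU, then name the controlling CPU."""
        for c in self.controllers:
            if c.healthy:
                c.io_image.update(inputs)
        active = self.handshake()
        return active.name if active else None
```

Calling `scan()` once per cycle keeps the standby CPU's I/O image current, so when the active CPU is marked unhealthy the next `scan()` returns the other CPU's name and control continues from up-to-date data. Because `handshake()` only changes the active CPU on a failure, the two CPUs never compete for control.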

Hot swapping

Hot swapping, similar to redundancy, allows the user to unplug one hard drive while on-line, allowing another drive to take over. Some software can synchronize a state by passing the data space from one machine to another. In case of failure, control switches to the backup processor.
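As a rough illustration of passing the data space and switching to the backup, consider the Python sketch below. The `Processor` class and the heartbeat timeout used to detect failure are assumptions made for the example; the article does not say how failure is detected.

```python
import copy
from typing import Any, Dict


class Processor:
    """Hypothetical control processor holding a data space (its working variables)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data_space: Dict[str, Any] = {}
        self.last_heartbeat = 0.0

    def heartbeat(self, now: float) -> None:
        """Record the time at which this processor last reported in."""
        self.last_heartbeat = now


def synchronize(primary: Processor, backup: Processor) -> None:
    """Pass a snapshot of the primary's data space to the backup."""
    backup.data_space = copy.deepcopy(primary.data_space)


def select_controller(primary: Processor, backup: Processor,
                      now: float, timeout: float = 0.5) -> Processor:
    """Return the backup if the primary's heartbeat is older than `timeout` seconds.

    The heartbeat test is an assumed failure detector, chosen for illustration.
    """
    if now - primary.last_heartbeat > timeout:
        return backup
    return primary
```

The deep copy matters: the backup holds its own snapshot, so later changes on the primary do not leak across until the next `synchronize()` call. How often to synchronize is a trade-off between network load and how stale the backup's data may be at changeover.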

With computer-based systems, redundancy is used for data storage and file servers. To secure data in file systems, computer systems may use a redundant array of inexpensive disk drives, or techniques like disk mirroring, which protects data by duplicating it on more than one disk drive. Industrial PCs use a paired drive or power supply that can be hot swapped or exchanged during operation. Networking uses redundancy to tolerate failures, to increase likelihood of meeting tight time constraints, and to ration (based on task priorities) limited system bandwidth.

For such time-critical systems, redundancy is employed to secure the required bandwidth and fault tolerance. Supervisory control and data acquisition (SCADA) and distributed control systems use redundancy for sharing information and data. If either server fails, the redundant system still gathers information. Should a SCADA node become unavailable, some software channels data requests to a backup SCADA node, so that plant activity can still be viewed from a human-machine interface, or view node.
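The view-node behavior just described, where data requests are channelled to a backup SCADA node, might look like the following Python sketch. The names `make_node` and `read_tag`, the tag name in the usage note, and the use of `ConnectionError` to signal an unavailable node are all assumptions for illustration; real SCADA packages expose their own interfaces.

```python
from typing import Callable, Dict, Sequence

# In this sketch a "node" is any callable that returns a tag value
# or raises ConnectionError when the node is unavailable.
ScadaNode = Callable[[str], float]


def make_node(values: Dict[str, float], available: bool = True) -> ScadaNode:
    """Build a hypothetical SCADA node backed by an in-memory tag table."""
    def node(tag: str) -> float:
        if not available:
            raise ConnectionError("SCADA node unavailable")
        return values[tag]
    return node


def read_tag(tag: str, nodes: Sequence[ScadaNode]) -> float:
    """Try each SCADA node in priority order; fall through to the next on failure."""
    for node in nodes:
        try:
            return node(tag)
        except ConnectionError:
            continue  # this node is down, try the backup
    raise ConnectionError(f"no SCADA node could serve {tag!r}")
```

A view node would call something like `read_tag("line1.speed", [primary, backup])` for each display update. The operator sees the same value whichever node answers, which only works if both nodes gather the same plant data, as the paragraph above assumes.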

Programmable logic controllers (PLCs) also incorporate redundancy. At one level, the system employs multiple PLCs controlling a single I/O bus so that if either CPU fails, the backup takes over.

PLCs can also be redundant when one I/O point connects two identical systems of a PLC, CPU, or I/O interface card. Care must be taken to avoid confusion when reading or writing inputs and outputs. PLCs also use redundancy with multiple CPUs and power supplies. Redundancy may require communicating among multiple CPUs and distributing output data back to servers. I/O-based hardware can also provide redundancy.

While hardware can provide redundancy, we find that software provides fail-safe redundancy for mission-critical operations.

Author Information

—Gint Burokas, senior software engineer, Intellution Inc.’s Wizdom Controls, Naperville, Ill.

TERMS

Bandwidth: range (usually Hertz) over which a system operates.

Bumpless: ability to change processors controlling a process (changeover) without affecting the process.

CPU: central processing unit.

Data space: where data reside.

Disk mirroring: data protection by duplication on disk drives.

Fault tolerance: design which allows continued system operation with some level of malfunction.

Hand shaking: contact among or between CPUs for identification.

Hot swap: exchange of components during operation.

RAID: redundant array of inexpensive disk drives.

Redundancy: duplication to enhance reliability.

Synchronize a state: ensuring frequencies of two systems are equal.