The JOURNAL

The Matrikon Alarm Manager integrates with the FactoryTalk® software suite to provide context-sensitive access to any available alarm information so users can turn data into actionable information.

Web Exclusive

The What, Why, Where, Who and How of Alarm Management

Learn how to evaluate and optimize your alarm system to help avoid production losses, equipment damage and injuries during critical incidents.

By Vikas Grover, Product Manager, Matrikon Inc.

Today's process control systems have made it possible to create alarms more easily and at a lower cost. Although software alarms are convenient, the ease with which they can be created has removed the incentive to limit alarms. As a result, operators now are faced with more alarms than they can effectively monitor. Alarm management helps you to identify unnecessary alarms and alarms set at the wrong value, and to see where the current procedures for dealing with alarms can be improved.

What is Alarm Management, and Why Do I Need It?

Alarm Management comprises a collection of techniques, tools, standards and procedures that improve the operations of process plants by improving the effectiveness of alarm systems. The two areas that have historically impeded gains in alarm management are:

As a result, more alarms could be configured for lower cost. This meant safer operations, but it also relaxed the engineering controls on the creation of alarms. Since there was no cost to implement an alarm, no incentive existed to limit the number of alarms. Naturally, some of the alarms that were configured were unnecessary, or were set at the wrong value. This leads to nuisance alarms — alarms that don't tell the operator anything he didn't already know, or which require no action.

Since the 1970s, more and more process units have been centralized under the control of fewer panel operators, so each operator has become responsible for more alarms. In addition, the number of alarms that can be configured on a single measurement has ballooned. Alarms now are commonly set on Low, High, Low-Low, High-High, Deviation, bad value and sometimes other values.

An operator now is confronted with tens of thousands of configured alarms within his area of the plant. During upsets, hundreds or thousands of these alarms can occur in a very short period.

In addition, organizations such as the Occupational Safety and Health Administration (OSHA) and U.S. Environmental Protection Agency (EPA) or voluntary programs such as Responsible Care, ISO-9000 and ISO-14000 require periodic process hazard analyses or process assessments. These all result in the creation of additional alarms.

When too many alarms go off during an upset, they distract the operator and conceal the actual nature of secondary problems, instead of alerting the operator to real problems.

So, in the absence of an alarm management program, manufacturing plants become less safe rather than safer, incidents become worse rather than better, and production losses increase.

What Does Alarm Management Include?

Alarm Management consists of the set of procedures, practices, tools and systems that jointly ensure that the plant's alarm system is as effective as possible. These can include:

Who Needs Alarm Management?

To determine if your existing alarm management program is sufficient, ask yourself the following questions:

If the answer to more than one of these questions is "I don't know" or "yes", then your alarm management system needs at least an assessment, and probably improvement.

What Will Alarm Management Do For Me?

Without effective alarm management, you can't be certain your operators will effectively respond when an upset occurs. Alarms should define the boundary between normal operation and abnormal operation. The figure shows this relationship:

A properly applied alarm management program ensures that the alarms are:

How Do I Do It?

You generally can take two routes to improved alarm management:

  1. Continuous improvement of normal operations.
  2. A new alarm management program.

Both approaches start with an assessment. The Alarm Management Assessment (AMA) is an evaluation of your current operation by trained, experienced personnel who examine all aspects of your existing alarm management systems. The AMA evaluates your exposure to risk, and provides you with a grade in each of the different areas of alarm management, as well as an overall grade for your plant. Depending on the grade, the AMA recommends one of two different courses in each area.

The areas of alarm management evaluation are:

If the existing practice in an area receives a grade of A or B, then no urgent need for improvement exists. You may need some management systems to ensure that performance is maintained or continuously improves over time; however, large amounts of effort are unnecessary in that area.

If the existing practice receives a grade of C or lower, then an organizational need to improve in that area exists, as described in the following sections.

How Do You Improve Alarm Quality?

You can use two approaches to alarm improvement. The appropriate approach depends on the grade from the AMA. If the existing definition, change management and operator readiness processes are effective, then the alarm system can be easily improved with continuous monitoring and continuous improvement. If the support processes aren't in place, a more thorough approach is necessary.

Continuous Improvement. If all is well, or if you have other organizational priorities, you can apply a continuous improvement philosophy. You can greatly reduce nuisance alarms during normal operation by using this method. However, this approach is only effective at improving and maintaining an already well-managed alarm system.

  1. Collect data. Collect alarm history for at least one month, including all alarms for all consoles along with all operator actions. This data will provide both an original benchmark and the basis for alarm analysis.
  2. Analyze. Most nuisance alarm occurrences are caused by a surprisingly small number of actual configured alarms. The initial alarm analysis will quickly identify alarms that need reconfiguring. It also will provide insight into the severity of the problem.
  3. Benchmark. Analyze the alarm history to determine whether the rate of occurrence is within the EEMUA guidelines for standing alarms and alarm occurrences per shift, and for alarm bursts. Analyze alarms area by area because some operating units or areas are more prone to alarms than others. Using the alarm history, measure the original performance as a benchmark for comparison to show improvement in nuisance alarms over time.
  4. Spend one or two days per month doing alarm management. As more alarm history is collected, a senior operator and an engineer should spend a day or two each month to find the worst nuisance alarms and reconfigure them in the HMI.
  5. Measure. Confirm that improvements are being maintained. The monthly alarm occurrence statistics will show the frequency of alarms and the number of standing alarms decreasing over time. This can be powerful evidence of an effective alarm management program.
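
The analysis and benchmarking steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the alarm history layout, the tag names, the 10-minute window and the burst limit are all hypothetical, and the limit is a placeholder rather than a value taken from the EEMUA guidelines. The sketch ranks configured alarms by how often they occur, then counts occurrences per time window to flag possible alarm bursts.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical alarm history: one (timestamp, tag, alarm type) tuple per
# occurrence. A real historian export will have its own format.
START = datetime(2024, 1, 1, 8, 0)


def at(minutes):
    """Timestamp the given number of minutes after START."""
    return START + timedelta(minutes=minutes)


HISTORY = (
    [(at(m), "LIC-200", "HI") for m in (1, 2, 3, 4, 5, 6, 7)]  # chattering alarm
    + [(at(m), "TIC-310", "LO") for m in (3, 25)]
    + [(at(45), "PIC-120", "HIHI")]
)


def worst_offenders(history, top_n=3):
    """Rank configured alarms by occurrence count and share of all occurrences."""
    counts = Counter((tag, kind) for _, tag, kind in history)
    total = sum(counts.values())
    return [(alarm, n, n / total) for alarm, n in counts.most_common(top_n)]


def alarms_per_window(history, start, window):
    """Count occurrences in consecutive windows of equal length from start."""
    counts = Counter((ts - start) // window for ts, _, _ in history)
    last = max(counts, default=-1)
    return [counts.get(i, 0) for i in range(last + 1)]


def burst_windows(per_window, limit):
    """Indexes of windows whose alarm count exceeds the chosen limit."""
    return [i for i, n in enumerate(per_window) if n > limit]


# BURST_LIMIT is a placeholder, not a figure taken from the EEMUA guidelines.
BURST_LIMIT = 5
per_window = alarms_per_window(HISTORY, START, timedelta(minutes=10))

for (tag, kind), n, share in worst_offenders(HISTORY):
    print(f"{tag} {kind}: {n} occurrences ({share:.0%} of total)")
print("alarms per 10-minute window:", per_window)
print("burst windows:", burst_windows(per_window, BURST_LIMIT))
```

Running the same functions on each month's history, area by area, gives the kind of before-and-after comparison described in the Measure step.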

An Alarm Management Program for the Future. This approach is similar to the Six Sigma program: First the problem and objective are defined, then the current state is measured and analyzed, action is taken to improve the situation, and finally the situation is automatically monitored to control or sustain improvement.

This approach helps to produce a self-sustaining alarm management competency within the organization, and to bring the initial benefits of alarm management as quickly as possible:

  1. Create an alarm philosophy. Define the purpose of each type of alarm or message that is issued by an automated system. Specify the criteria for each priority, for hardware alarms and for when an alarm is needed at all. Best practices in this area include integrated risk measures and addressing human factors and cognitive limitations of human operators.

    Define other alarm management policies, such as business rules governing alarm changes, inhibiting or suppressing of alarms, expectations of operators, and so on. This step produces a document that clearly defines the objectives of the alarm system and ground rules for its implementation.
  2. Benchmark alarm performance. Assess the actual current state. This provides useful data for the following analyses, and also provides a benchmark of the original state of the organization. Capture HMI alarms and events, track changes, and compare the original engineered values with what is configured on the HMI/PLC, using a master alarm database such as the one provided by the Matrikon Alarm Manager from Matrikon Inc., a participating Encompass™ Product Partner in the Rockwell Automation PartnerNetwork™.
  3. Decide which areas require rationalization. Alarm rationalization, or alarm objectives analysis (AOA), is a procedure in which each alarm is examined to ensure it conforms to the alarm philosophy. During this process, many alarms are eliminated, others have their priorities changed, and still others have their trip points altered. The process can be time-consuming, so it's best to start in those parts of the plant that have the worst problems with alarms.
  4. Implement an alarm configuration system. After choosing a starting point, implement a database to collect, retain, control and present the information collected during the AOA process. This database will form both the change management system and the online operator assistance.
  5. Commit resources. Form a team whose members will work together for several weeks, and free them of most of their other responsibilities for the duration of the analysis. The team should consist of:
    • A facilitator, familiar with the AOA process and with alarm analysis.
    • At least one area operator, preferably two.
    • The area process engineer.
    • The area production supervisor.
    • An area instrumentation technician.
  6. Conduct AOA meetings. Use analysis reports as evidence of problems or situations. Review the configured alarms and answer several questions about actual alarm occurrence and the purpose for the alarm. For example:
    • Possible causes for the alarm, whether legitimate (providing information), spurious (misleading or nuisance) or redundant (telling the operator something he already knows).
    • The procedure to identify the cause and thus validate the alarm.
    • The procedure to mitigate the different causes.
    • The time frame in which the operator has to respond, before an undesirable consequence occurs.
    • The follow-up action required to verify that the procedure was effective.
    • The historical frequency at which the alarm occurred, and how many occurrences were spurious or redundant.
    • The final alarm trip point and priority.
    During the AOA process, also consider alarms that don't exist, but should.
  7. Reconfigure HMI. Implement the changes from an AOA carefully; don't do it piecemeal. An AOA results in an alarm system that is designed to alert, inform and guide the operator. The alarm configurations work together to define the boundaries of normal operation. If AOA configurations are made piecemeal, some alarms may be removed before the alarms intended to replace them have been configured. In that case, the plant will be less safe.

    It's also important to introduce continuous monitoring applications identified during the AOA. Continuous monitoring applications, such as multivariate trends, provide the operator with improved visibility into the process without having to rely on alarms as the operator's "eyes and ears."

    Much of the barrage of alarms during upsets is a result of the misuse of alarms to alert the operator that an expected event has occurred. Continuous monitoring restores the alarms to their original function of identifying unexpected events that require a response.
  8. Address human factors. Examine factors ranging from the design of the operator interface in the HMI to the number of independent systems that emit alarms, the audible alarm itself and when (and whether) it can be silenced.
  9. Continue to measure and sustain alarm system effectiveness.
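
As a rough illustration of the benchmarking and configuration-system steps above, the following Python sketch compares a master alarm database with the settings actually configured on the HMI. It is not how the Matrikon Alarm Manager or any other product works; the `AlarmSetting` fields, the tags and all values are hypothetical. The idea is simply that every difference between engineered and live settings, including an alarm that has been removed without its replacement being configured, is surfaced for review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlarmSetting:
    """Engineered or live configuration of one alarm (fields are illustrative)."""
    trip_point: float
    priority: str


def config_drift(master, live):
    """Report alarms whose live settings differ from the master database.

    Both arguments map (tag, alarm type) to an AlarmSetting.
    """
    drift = {}
    for key in sorted(master.keys() | live.keys()):
        engineered, actual = master.get(key), live.get(key)
        if engineered is None:
            drift[key] = "configured but not in master database"
        elif actual is None:
            drift[key] = "in master database but not configured"
        elif engineered != actual:
            drift[key] = "settings differ from engineered values"
    return drift


# Hypothetical data; tags, trip points and priorities are made up.
MASTER = {
    ("TIC-310", "HI"): AlarmSetting(150.0, "high"),
    ("LIC-200", "LO"): AlarmSetting(10.0, "low"),
    ("PIC-120", "HIHI"): AlarmSetting(8.5, "emergency"),
}
LIVE = {
    ("TIC-310", "HI"): AlarmSetting(165.0, "high"),  # trip point was changed
    ("LIC-200", "LO"): AlarmSetting(10.0, "low"),    # matches the master
    ("FIC-101", "LO"): AlarmSetting(2.0, "low"),     # never rationalized
}

for (tag, kind), finding in config_drift(MASTER, LIVE).items():
    print(f"{tag} {kind}: {finding}")
```

A report like this could be run after each batch of AOA changes to check that the live configuration matches what the team agreed on.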

Just Good Business

Alarm management is just good process plant management. Advances in technology and increases in government regulation have jointly made alarm management more important and more difficult. Poor alarm systems contribute to production losses, equipment damage and injuries during critical incidents. You often can improve your alarm management without major costs.

For more information about alarm management and Matrikon Inc. and its Rockwell Automation FactoryTalk-compatible products, visit www.rockwellautomation.com/go/p-matrikon.

Management Responsibilities for Alarm Management

OSHA provides guidelines on management responsibilities in its publication, "Appendix C to 1910.119 - Compliance Guidelines and Recommendations for Process Safety Management."

In addition, various engineering societies issue technical reports that affect process design. For example, the American Institute of Chemical Engineers has published technical reports on topics such as two-phase flow for venting devices. This type of technically recognized report would constitute good engineering practice.

Operating procedures addressing operating parameters will contain operating instructions about pressure limits, temperature ranges, flow rates, what to do when an upset condition occurs, what alarms and instruments are pertinent if an upset condition occurs, and other subjects.

Computerized process control systems add complexity to operating instructions. These operating instructions need to describe the logic of the software as well as the relationship between the equipment and the control system; otherwise, it may not be apparent to the operator.

Training in how to handle upset conditions must be accomplished as well as what operating personnel are to do in emergencies such as when a pump seal fails or a pipeline ruptures.

Our interpretation: Organizations such as the Engineering Equipment and Materials Users Association (EEMUA) have defined recommended practices for alarm management, and ISA is coming out with its standard for alarm management. OSHA will soon compare your plants to these practices.

The National Safety Council offers the publication, "Do You Know What Is Really Critical?" by Dennis C. Hendershot. In general, a more reliable plant is a safer plant. Unplanned shutdowns, with equipment in modes of operation not anticipated by the designer, can create significant risks. And, data shows that for most continuous chemical plants, the highest risk phases of operation are startup and shutdown.

Our interpretation: An effective alarm system is important for production, for safety and for preventing equipment damage. Alarms need to occur early enough, but not too early.


This illustration shows how alarms should define the boundary between normal operation and abnormal operation.

The "Responsible Care Manufacturing Code of Practice," Section 4 – Operations, says that each company shall have written operating, engineering and maintenance procedures which specify conditions for the responsible operation of any facility during normal or abnormal circumstances. The company shall:

  • Perform and document a regular hazard analysis and risk assessment of the operating facility, and take action to minimize identified risk.
  • Have written and up-to-date procedures that cover all phases of operation, including start-up and shutdown.
  • Have written and up-to-date procedures that protect personnel during facility maintenance.
  • Take action to prevent injury, damage and harm to people and the environment.
  • Have a management system to control and record changes and modifications to equipment, processes, materials and associated computer hardware and software.
  • Institute security procedures and systems that protect the facilities and address possible security threats.
  • Maintain systems and procedures to minimize risks to safety, health and the environment during the handling and storage of all materials used and produced.
  • Audit and update these procedures on a regular basis.

Our interpretation: You need to conduct process hazard analyses, and have comprehensive procedures to address how the operator must respond to alarms.