Joel B. Christian, P.E., Ph.D., Professional Engineer, Towanda, Bradford County, PA, USA
joelbchristian.com

Chemical Environmental Materials Engineering

What is the probability of human error?

The probability of human error is a little-discussed topic in industrial safety circles. First, most safety professionals seem to be focused on immediate problems, such as OSHA compliance, rather than long-term loss prevention. Interestingly, in my experience, military analysts are more likely to have this type of intellectual discussion.

So here is the deep, dark secret: the probability of human error is roughly 1/1000. This means the average person will make one mistake for every 1000 times they do something routine, like make a decision or perform an operation.

OK, I've seen numbers as low as 1/1500 for this, perhaps 1/2000; but one in a thousand is roughly the rate for routine human errors. Honestly, did you know that? Not good odds!

Now, I'm not talking about an "epic fail", I'm just talking about a simple error. Flip the wrong switch. Flip the switch the wrong way. Open the wrong valve. Turn the alarm clock off instead of snooze. Hit "OK" instead of "Cancel", misjudge a color, enter the wrong number on a checksheet. So the textbook says the probability is 10⁻³. That seems very improbable. Hold it - isn't that math code for one in a thousand?

So this discussion seems boring and trivial to most of the exciting people I talk to, but just add "nuclear" to the discussion and interest will perk up. However, the task could just as easily be "de-energize the equipment" or "nitrogen-purge the explosive gas".


OK, so you tell me "I'll remind everyone to be careful", and that will fix it, right? No, actually, the 1/1000 rate IS being careful. This is the normal error rate for routine tasks, whether you're thinking about being careful, the number of safe hours on your shirt, the poster on the wall, or what's for lunch. This rate is a real number - almost a universal number. When designing for safety, this number must be in the design.

How can we make a step safer? There are two basic strategies: increase training, or add complexity to the control system, such as interlocks or two-step confirmations. For example, Amazon.com lets you buy with "one-click" or add to cart and then purchase. Turning off one-click lets you check the items in the cart, which card is getting charged, and where the order is shipping. Ever buy with one-click and have one of those things not be the way you wanted?

Training, drills, running simulators, and other activities can improve the odds to 1/5000, perhaps as high as 1/20,000. But this requires a LOT of work. A second strategy is to require that two different people approve a task. In theory this takes two 1/1000 events, for 1/1000 × 1/1000 = one in a million; but in reality the two checks are still coupled, making the odds of an error much higher than that.
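The two-person arithmetic above can be sketched in a few lines. The 10% "coupling fraction" below is a made-up illustration, not a figure from this article - it stands in for errors with a common cause (same training gap, same misleading label) that defeat both checkers at once.

```python
# Two-person approval arithmetic. BETA, the coupling fraction, is an
# illustrative assumption: the share of errors both checkers share.

P_ERROR = 1.0 / 1000  # routine human error rate per task

# Fully independent double-check: both people must err.
p_independent = P_ERROR * P_ERROR  # one in a million

# Coupled double-check: common-cause errors defeat both checkers,
# so only the remaining fraction gets the full squared benefit.
BETA = 0.10  # assumed coupling fraction (illustrative)
p_coupled = BETA * P_ERROR + (1 - BETA) * P_ERROR * P_ERROR

print(f"independent: {p_independent:.2e}")
print(f"coupled:     {p_coupled:.2e}")
```

Even 10% coupling drags the combined rate from one in a million back to roughly one in ten thousand, which is why coupled checks make the real-world odds much worse than the naive product.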

The larger problem can be that an error goes undetected. I'm a musician. In music, it is very easy to hear a wrong note - everyone hears it. But the wrong rhythm on the right note is harder to detect. Everything seems right, but it can still be wrong, and undetected. The same goes for a larger multi-part task where each step requires accuracy: you may not see the effect of an error until further down the line.
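The compounding in a multi-step task can be sketched with simple arithmetic (an illustration, not data from this article): with a 1/1000 per-step error rate, the chance of at least one error somewhere in the procedure grows quickly as the procedure lengthens.

```python
# Illustration only: how a 1/1000 per-step error rate compounds over
# a multi-step procedure when every step must be done correctly.

P_STEP = 1.0 / 1000  # routine human error rate per step

def p_at_least_one_error(n_steps: int, p_step: float = P_STEP) -> float:
    """Probability that a procedure of n_steps independent steps
    contains at least one error: 1 - (1 - p)^n."""
    return 1.0 - (1.0 - p_step) ** n_steps

for n in (1, 10, 100):
    print(f"{n:>3} steps: {p_at_least_one_error(n):.4f}")
```

At one step the chance of an error is 0.1%; over a 100-step procedure it is already about 9.5% - and if no step reveals the mistake, it rides along to the end.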

There is no easy prevention for human error. All these scenarios need to be considered in setting up processes that require human interaction. And please - don't think computers are any better! That's a whole 'nother topic!

So how can the effects of this natural human error rate be designed into a system? Generally it requires a multi-step program. Wait, isn't the mathematical analysis part a waste of time? Absolutely not. First, you can't demonstrate to an examiner that you even thought about what could go wrong without an analysis of the process and its potential failure points. What good is training if you haven't identified the few most dangerous potential errors?

Your HAZOP or other Process Safety Management tool may have identified the steps that could go wrong and made recommendations, but the HAZOP still uses qualitative measures. Using quantitative measures such as Bayesian probabilities adds a DESIGN element to the safe operation of a process. Think about it: even if you use the wrong probability for human error, you have still arrived at an overall number for the probability of safe operation. Your mathematical model can be improved over time. Maybe decisions at one point are 1/1000 and in another area tend to run 3/1000. Isn't a fundamental basis of engineering to reduce a problem to mathematical first principles and to build a universal solution from them?

Figure 1 (1)

Mathematical analysis is generally done with input and outcome trees or networks. These are equivalent systems; some prefer logic trees and some prefer an open grid/network framework. A tree or network of all possible scenarios is constructed, including all decision points and hardware failure modes. Then the probability of each tree branch or network node occurring is computed. Network methods are better suited for software implementation, but trees are easier for newcomers to understand. Once your system is understood, it helps with the next step, which often is the layers of protection analysis (LOPA) methodology. LOPA helps you understand, and add, independent layers of protection for a process or event scenario.
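A minimal fault-tree computation can be sketched as below. The event names and probabilities are hypothetical illustrations, not taken from the cited paper: OR-gates combine initiating events, AND-gates combine independent failures that must all occur.

```python
# Minimal fault-tree gate arithmetic (hypothetical example events).

def gate_and(*probs: float) -> float:
    """All inputs must occur (independent events): multiply."""
    result = 1.0
    for p in probs:
        result *= p
    return result

def gate_or(*probs: float) -> float:
    """Any input may occur: 1 minus the product of complements."""
    none_occur = 1.0
    for p in probs:
        none_occur *= (1.0 - p)
    return 1.0 - none_occur

# Hypothetical top event: a release happens if the operator makes
# either routine error (wrong valve or wrong switch) AND the
# interlock fails on demand.
p_wrong_valve = 1e-3      # routine human error rate
p_wrong_switch = 1e-3     # routine human error rate
p_interlock_fails = 1e-2  # hardware failure on demand (illustrative)

p_top = gate_and(gate_or(p_wrong_valve, p_wrong_switch),
                 p_interlock_fails)
print(f"top event probability: {p_top:.2e}")  # roughly 2e-5 per demand
```

Each independent protection layer multiplies the top-event probability down, which is exactly the effect a LOPA review tries to quantify and strengthen.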

Process and safety engineers are often asked to do the minimum required for compliance with codes or standards. Often the engineer must evaluate the current process and its analysis, and ask for more than the minimum - ultimately to protect their client, employer, workers, and the public. These can be difficult requests to make, but knowing the tools involved helps explain the level of review that is appropriate for both existing and new processes.



See my published paper on combining fault and event trees for process safety evaluation.

(1) Christian, Joel B., "Combine Fault and Event Trees for Safety Analysis," Chemical Engineering Progress, 93(4) (April 1997), pp. 72-75.

The 1/1000 probability appears in many sources; one is the collected data reported in Tables 4 and 5, p. 849, of Evaluation of Human Work, 3rd Edition, eds. John R. Wilson and Nigel Corlett, CRC Press (2005).





Contact Info


Joel B. Christian, P.E., Ph.D.
Towanda, Pennsylvania, USA 18848
email: contact@joelbchristian.com

phone: contact phone number


Back to homepage: joelbchristian.com