FMEA – Failure Mode and Effects Analysis – A tool for predicting, analyzing and controlling Business Risks

What is FMEA?

An FMEA is an engineering analysis done by a cross-functional team of subject matter experts that thoroughly analyzes product designs or manufacturing processes early in the product development process. It is one of the Six Sigma tools.


A cross-functional team is a team whose members belong to the same organization but work in different functional areas.

An important point to note here is that FMEA is used in the Service Industry as well.

What are the various types of FMEA?

The three most common types of FMEAs are:

  • System FMEA
  • Design FMEA
  • Process FMEA

System FMEA

  • Analysis is at the highest level of an entire system, made up of various subsystems.
  • The focus is on system-related deficiencies, including system safety and system integration.
  • Interfaces between subsystems or with other systems
  • Interactions between subsystems or with the surrounding environment
  • Single-point failures (where a single component failure can result in complete failure of the entire system)
  • Example: the ATM system of a bank, or a computer system

Design FMEA

  • Analysis is at the subsystem level (made up of various components) or component level.
  • The focus is on product design-related deficiencies, with emphasis on
    • improving the design
    • ensuring product operation is safe and reliable during the useful life of the equipment.
    • interfaces between adjacent components.
  • Design FMEA usually assumes the product will be manufactured according to specifications.

Process FMEA

  • Analysis is at the manufacturing/assembly process level.
  • The focus is on manufacturing-related deficiencies, with emphasis on
    • improving the manufacturing process
    • ensuring the product is built to design requirements in a safe manner, with minimal downtime, scrap and rework
    • manufacturing and assembly operations, shipping, incoming parts, transporting of materials, storage, conveyors, tool maintenance, and labeling
  • Process FMEAs usually assume the design is sound.

Why conduct an FMEA?

Failure Mode and Effects Analysis (FMEA) is a method designed to:

  • Identify and fully understand potential failure modes and their causes, and the effects of failure on the system or end users, for a given product or process
  • Assess the risk associated with the identified failure modes, effects and causes, and prioritize issues for corrective action
  • Identify and carry out corrective actions to address the most serious concerns


  • The primary objective of an FMEA is to improve the design
    • For System FMEAs, the objective is to improve the design of the system
    • For Design FMEAs, the objective is to improve the design of the subsystem or component
    • For Process FMEAs, the objective is to improve the design of the manufacturing process

There are many other objectives for doing FMEAs, such as:

  • identify and prevent safety hazards
  • minimize loss of product performance or performance degradation
  • improve test and verification plans (in the case of System or Design FMEAs)
  • improve Process Control Plans (in the case of Process FMEAs)
  • consider changes to the product design or manufacturing process
  • develop preventive maintenance plans for in-service machinery and equipment
  • develop online diagnostic techniques
  • prioritize actions

Key elements of FMEA

Failure Mode 

  • The term “failure mode” combines two words that both have unique meanings.
    • The Concise Oxford English Dictionary defines the word “failure” as the act of ceasing to function or the state of not functioning.
    • “Mode” is defined as a way in which something occurs


A “failure mode” is the manner in which the item or operation potentially fails to meet or deliver the intended function and associated requirements. It may include:

  • failure to perform a function within defined limits
  • inadequate or poor performance of the function
  • intermittent performance of a function
  • performing an unintended or undesired function

Example: The monitor of a computer system does not power on.


An “effect” is the consequence of the failure on the system or end user.

  • This can be a single description of the effect on the top-level system and/or end user, or three levels of effects (local, next-higher level, and end effect)
  • For Process FMEAs, consider the effect at the manufacturing or assembly level, as well as at the system or end user.
  • There can be more than one effect for each failure mode. However, typically the FMEA team will use the most serious of the end effects for the analysis.

Example: The bicycle wheel does not slow down when the brake lever is pulled, potentially resulting in an accident.


  • “Severity” is a ranking number associated with the most serious effect for a given failure mode
    • based on the criteria from a severity scale.
    • a relative ranking within the scope of the specific FMEA
    • determined without regard to the likelihood of occurrence or detection.
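
As a small illustration of the rule that severity follows the most serious effect, the sketch below picks the severity for one failure mode from its list of effects. The effect descriptions and ranking values are invented for illustration and assume a 1 to 10 severity scale; a real FMEA would take them from the severity scale agreed by the team.

```python
# Hypothetical effects of one failure mode, each ranked on an assumed 1-10 severity scale.
effects = {
    "Rider must brake earlier than usual": 5,
    "Bicycle does not slow down, possible accident": 10,
    "Brake lever feels loose": 3,
}


def severity_of_failure_mode(effect_rankings):
    """Return the ranking of the most serious effect.

    Occurrence and detection are deliberately not inputs here,
    since severity is determined without regard to either.
    """
    if not effect_rankings:
        raise ValueError("a failure mode needs at least one effect")
    return max(effect_rankings.values())


severity = severity_of_failure_mode(effects)
print(severity)
```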


A “cause” is the specific reason for the failure, preferably found by asking “why” until the root cause is determined.

  • For Design FMEAs, the cause is the design deficiency that results in the failure mode.
  • For Process FMEAs, the cause is the manufacturing or assembly deficiency that results in the failure mode.
  • At the component level, the cause should be taken to the level of the failure mechanism.
  • If a cause occurs, the corresponding failure mode occurs.
  • There can be many causes for each failure mode.

Example: The power cable is torn.


“Occurrence” is a ranking number associated with the likelihood that the failure mode and its associated cause will be present in the item being analyzed.

  • For System and Design FMEAs, consider the likelihood of occurrence during the design life of the product.
  • For Process FMEAs consider the likelihood of occurrence during production.
  • The ranking is based on the criteria from the corresponding occurrence scale.
  • It has a relative meaning rather than an absolute value, and is determined without regard to the severity or likelihood of detection.


“Controls” are the methods or actions that are currently planned, or already in place, to reduce or eliminate the risk associated with each potential cause.

  • Controls can be the methods to prevent or detect the cause during product development, or actions to detect a problem during service before it becomes catastrophic.
  • There can be many controls for each cause.

Preventive Controls

  • For System or Design FMEAs, prevention-type design controls describe how a cause, failure mode, or effect in the product design is prevented, based on current or planned actions.
    • They are intended to reduce the likelihood that the problem will occur, and are used as input to the occurrence ranking.
    • Example: Cable material selection based on ANSI #ABC.

Detection-type Controls

  • For System or Design FMEAs, detection-type design controls describe how a failure mode or cause in the product design is detected, based on current or planned actions, before the product design is released to production. They are used as input to the detection ranking.

  • They are intended to increase the likelihood that the problem will be detected before it reaches the end user.

Example: Bicycle system durability test # 719

  • “Detection” is a ranking number associated with the best control from the list of detection-type controls, based on the criteria from the detection scale.
    • considers the likelihood of detection of the failure mode/cause, according to defined criteria.
    • a relative ranking within the scope of the specific FMEA
    • determined without regard to the severity or likelihood of occurrence.

Risk Priority Number 

  • “RPN” is a numerical ranking of the risk of each potential failure mode/cause, made up of the arithmetic product (S × O × D) of the three elements:
    • severity of the effect (S)
    • likelihood of occurrence of the cause (O)
    • likelihood of detection of the cause (D)

Each cause has its own RPN.
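
The RPN arithmetic can be sketched in a few lines of Python. The names below (`Cause`, `rpn`) and all ranking values are hypothetical, and 1 to 10 scales are assumed. The sketch also assumes that a cause with no detection-type control gets a detection ranking of 10, and that the "best" control is the one with the lowest ranking.

```python
from dataclasses import dataclass


@dataclass
class Cause:
    description: str
    occurrence: int           # O, on an assumed 1-10 scale
    detection_rankings: list  # one ranking per detection-type control, lower is better


def rpn(severity, cause):
    """Return S x O x D for one cause of a failure mode."""
    # D comes from the best (lowest-ranked) detection control.
    # Assumption: with no detection control listed, use the worst case of 10.
    detection = min(cause.detection_rankings, default=10)
    return severity * cause.occurrence * detection


severity = 8  # hypothetical severity for "monitor does not power on"
causes = [
    Cause("Power cable is torn", occurrence=3, detection_rankings=[4, 6]),
    Cause("Loose power connector", occurrence=5, detection_rankings=[]),
]

# One RPN per cause, all sharing the severity of the failure mode.
rpns = {c.description: rpn(severity, c) for c in causes}
print(rpns)
```

Both causes share one severity because severity belongs to the failure mode, while occurrence and detection are ranked per cause.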

Figure: FMEA elements. This illustration is from the book Effective FMEAs, by Carl S. Carlson, published by John Wiley & Sons, © 2012.

Key points about FMEA:

  • Severity is ranked for the most severe effect of the failure mode.
  • Occurrence is related to the cause of the failure mode, not to the failure mode itself.
  • Detection is associated with the detection controls for the causes.
  • If a preventive control is present, the ranking of the detection control is one (1).
  • Severity of a failure mode can be reduced by changing the design or eliminating the failure mode.
  • The ultimate goal of FMEA is to find potential business risks and prioritize actions for reducing them. This is done using the RPN approach.
  • Each failure mode will have one severity, but it can have many causes.
  • Controls are to be put on the causes.
  • Causes form the heart of FMEA.
  • To find failure modes, use a structured approach rather than brainstorming. A failure mode is derived from a function in one of these ways:
    • No function
    • Degradation: the function fails over time
    • Intermittent: sometimes works and sometimes does not
    • Partial: does not work fully
    • Unintended: acts in a surprising manner
    • Over-function: does more than intended

    There can be 6 or more failure modes for each function.

  • The automotive industry has to use the AIAG guidelines for conducting FMEA.
  • Having a threshold value for RPN (beyond which you take action) is not recommended, because people tend to keep everything below the threshold and no real improvement happens. AIAG recommends keeping the top 10 items for RPN reduction in the case of Process FMEA. However, a severity of 9 or 10 has to be actioned first, irrespective of the RPN.
  • When you work on a PFMEA, it is assumed that the equipment will work as per its design intent. Design-related issues are not considered while conducting a PFMEA.
  • When conducting a DFMEA, assume that the process will function as designed. Don’t bring up process-related issues in a DFMEA.
  • An FMEA must be done by a core team.
  • The rating scale for severity and occurrence is usually from 1 to 10; some people also use a scale from 1 to 5.
  • On a scale of 1 to 10, a severity of 1 indicates zero or minimal negative effect, and 9 and 10 are the most severe.
  • An occurrence of 1 means very low or no chance of occurrence; 9 or 10 means very high occurrence.
  • The detection rating works in reverse: a preventive control or poka-yoke implies a rating of 1, and 10 means there is no chance of detection (the better the detection, the lower the number).
  • The rule of thumb for action is based on the RPN and the severity. If severity is 9 or 10, take action irrespective of the RPN. Next, look for RPN reduction by employing some detection technique, followed by action to eliminate some of the causes.
  • Always remember that severity is the most difficult to reduce (it can be done by design change only). The first bet is to have some detection system in place, followed by eliminating causes. This leads to the RPN or risk reduction that FMEA aims for.
  • There are occasions when not much can be done about a high severity ranking of 9 or 10, and we have to live with it. This may be due to technological or financial constraints.
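
The action rule described in these key points (severity 9 or 10 first, then the highest RPNs) could be sketched as follows. This is only one possible reading of the rule: the function name `prioritize`, the dictionary keys, and the sample items are hypothetical, and applying the top-N cut only to the non-critical items is an assumption.

```python
def prioritize(items, top_n=10):
    """Order FMEA line items for action.

    items: list of dicts with keys "cause", "S", "O", "D" (assumed 1-10 scales).
    Items with severity 9 or 10 come first regardless of RPN; the remaining
    items are sorted by RPN (S x O x D), highest first, and cut to top_n.
    """
    def rpn(item):
        return item["S"] * item["O"] * item["D"]

    critical = sorted((i for i in items if i["S"] >= 9), key=rpn, reverse=True)
    others = sorted((i for i in items if i["S"] < 9), key=rpn, reverse=True)
    return critical + others[:top_n]


items = [
    {"cause": "Worn brake pad", "S": 10, "O": 2, "D": 3},            # RPN 60, severity 10
    {"cause": "Label misprinted", "S": 4, "O": 6, "D": 7},           # RPN 168
    {"cause": "Paint scratch in transit", "S": 3, "O": 5, "D": 4},   # RPN 60
]

ordered = [i["cause"] for i in prioritize(items, top_n=1)]
print(ordered)
```

Note that the worn brake pad is listed first even though its RPN is the lowest, which is the point of treating severity 9 or 10 separately from the RPN ranking.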

  (Inputs on the key points have been contributed by many Business Excellence professionals; it is difficult to name them all, but I am thankful to all of them. These points emerged out of an online discussion, and I am privileged to summarize them.)
