Happy’s Essential Skills: Failure Modes and Effects Analysis (FMEA)
What is FMEA?
Failure modes and effects analysis (FMEA) is a systematic process to evaluate failure modes and causes associated with the design and manufacturing processes of a new product. It is somewhat similar to the potential problem analysis (PPA) phase of the Kepner-Tregoe program. Here is a list of activities for a FMEA:
1. Determine potential failure modes of each component or subassembly and causes associated with the designing and manufacturing of a product.
2. Identify actions which could eliminate or reduce the chance of a potential failure occurring.
3. Document the process and give each mode a numeric rating for frequency of occurrence, criticality, and probability of detection.
4. Multiply these three numbers together to obtain the risk priority number (RPN), which is used to guide the design effort to the most critical problems first.
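The rating and multiplication in the last two activities can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the `FailureMode` class, its field names, the sample ratings, and the 1-to-10 scale are all assumptions made for the example. The actual scales come from the rating tables the team agrees to use.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    """One row of a hypothetical FMEA worksheet (names are illustrative)."""
    description: str
    occurrence: int   # frequency-of-occurrence rating
    criticality: int  # criticality rating
    detection: int    # detection rating (assumed: higher = harder to detect)

    def rpn(self) -> int:
        # Risk priority number: the product of the three ratings.
        for rating in (self.occurrence, self.criticality, self.detection):
            if not 1 <= rating <= 10:  # assumed 1-to-10 scale
                raise ValueError(f"rating {rating} is outside the assumed 1-10 scale")
        return self.occurrence * self.criticality * self.detection


# Hypothetical entries; the ratings are invented for illustration only.
modes = [
    FailureMode("resistor open", occurrence=3, criticality=7, detection=4),
    FailureMode("bearing seizure", occurrence=2, criticality=9, detection=6),
]

for mode in modes:
    print(f"{mode.description}: RPN = {mode.rpn()}")
```

With these invented ratings, the bearing seizure (RPN 108) would be addressed before the open resistor (RPN 84), even though it is rated as occurring less often.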
Two aspects of FMEA are particularly important: a team approach and timeliness. The team approach is vital because the broader the expertise that is brought to bear on making and assigning values to the failure mode list, the more effective the FMEA will be.
Timeliness is important because FMEA is primarily a preventive tool, which can help steer design decisions between alternatives before failure modes are designed-in, rather than redesigning after the failure occurs. FMEA is equally applicable to hardware or software, to components or systems.
Comparison to FTA
Another similar process is fault tree analysis (FTA). While FMEA is a bottom-up approach, FTA is top-down. FTA starts with the assumption of a system failure mode, and then works down through the system block diagram to look for possible causes of that mode.
Thus, FTA requires fairly complete, detailed information about the system, and is most effective after the system is well-defined. (FTA could be performed, in a limited way, on alternative system concepts; this could be used to help decide the best of several alternatives.) A separate FTA must be performed for each system failure mode.
FTA and FMEA are complementary. Whenever possible, both should be used. For practical reasons, FTA should be limited to the really serious system-level failure modes, such as those involving safety or permanent system damage. FMEA can be used at the component, subassembly, and module level, to help optimize those modules. There are excellent discussions and examples of FTA in References 2 and 4, and it will not be discussed further in this column.
Benefits of FMEA
The RPN calculated by FMEA allows prioritization of the failure mode list, guiding design effort to the most critical areas first. It also provides a documentary record of the failure prevention efforts of the design team, which is helpful to management in gauging the quality and extent of the effort, to production in solving problems which occur despite these efforts, and to future projects which can benefit from all the work and thinking that went into the failure mode and cause lists.
Eliminating potential failure modes has both short-term and long-term benefits. The short-term benefit is most often recognized because it represents savings in the costs of repair, retest, and downtime, which are objectively accountable. The long-term benefit is much more difficult to measure, since it relates to the customer's satisfaction or dissatisfaction with the product and the perception of its quality.
FMEA supports the design process by:
- Aiding in the objective evaluation of alternatives during design
- Increasing the probability that potential failure modes and their effects on system operation have been considered during design
- Providing additional information to aid in the planning of thorough and efficient test programs
- Developing a list of potential failure modes ranked according to their probable effect on the customer, thus establishing a priority system for design and test
- Providing an open, documented format for recommending and tracking risk-reducing actions
- Identifying known and potential failure modes which might otherwise be overlooked
- Exposing and documenting the ways a system can fail, and the effects of such failures
- Detecting primary but often minor failures which may cause serious secondary failures or consequent damage
- Detecting areas where "fail safe" or "fail soft" features are needed
- Providing a fresh viewpoint in understanding a system's functions
The uses of a FMEA report include:
- A formal record of the safety and reliability analysis and planning, to satisfy customers or regulatory agencies
- Evidence in litigation involving safety or reliability
- Design of diagnostic routines or built-in tests
- A basis for creating trouble-shooting procedures
- A means to consider and prevent manufacturing defects
- Problem follow-up and corrective action tracking
- A future reference to aid in analyzing field failures, evaluating design changes, or developing improved designs
In simple terms, the FMEA process attempts to list every failure mode of each component of a system, and to predict its effect on system operation. Failure effects can be considered at more than one level (e.g., effects at the subsystem or overall system levels).
FMEA can be accomplished using either a component or functional approach. In the component approach, actual failure modes are listed (e.g., resistor open, bearing seizure). The functional approach is used when the details of the design are not yet fully defined. In this approach, function failures are considered (e.g., no feedback, memory lost). FMEA can also be performed using a combination of component and functional approaches. The failure mode is the symptom of the failure, as distinct from the cause of the failure, which consists of the proved reasons for the existence of the symptoms. Reliability aspects of the components must be considered in this process. FMEA requires inputs from hardware, software, systems, customer service, and manufacturing in assessing component failures, effects at higher levels of the system, fault detection, and in evaluating failure compensating provisions in the design.
FMEA should be performed by a team of people having broad knowledge of the system's design and application. All available information on the design should be obtained: external and internal reference specifications, schematics, computer-aided design (CAD) data, stress analysis, reliability prediction data, test results, etc.
A system functional block diagram and reliability block diagram (Figure 1) should be prepared, as these are important for preparing the FMEA, and for understanding the completed analysis. All possible operating modes of the system and their functional relationships should be considered in the analysis. If the system has redundancy, it should also be considered by evaluating the effects of failure modes assuming the redundant system may or may not function.
Figure 1: Reliability block diagram examples.
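A reliability block diagram can also be evaluated numerically. The sketch below assumes independent block failures and uses the standard series and parallel (redundant) formulas. The function names and the reliability values are hypothetical, chosen only to show the difference between assuming the redundant unit functions and assuming it does not.

```python
from math import prod


def series(reliabilities):
    """All blocks must work: R = R1 * R2 * ... (independence assumed)."""
    return prod(reliabilities)


def parallel(reliabilities):
    """At least one block must work: R = 1 - (1 - R1)(1 - R2)..."""
    return 1 - prod(1 - r for r in reliabilities)


# Hypothetical system: a power supply in series with a controller,
# where the controller may be backed up by an identical redundant unit.
power_supply = 0.99
controller = 0.95

with_redundancy = series([power_supply, parallel([controller, controller])])
without_redundancy = series([power_supply, controller])

print(f"redundant unit functions:     {with_redundancy:.4f}")
print(f"redundant unit not available: {without_redundancy:.4f}")
```

Evaluating both cases side by side mirrors the advice above: the effect of a failure mode is assessed once with the redundant path working and once without it.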
FMEA can be performed from different points of view, such as failure detectability, safety, repair cost, availability, etc. The customer's point of view should always be first, in order to improve customer satisfaction with the final product. The viewpoint being considered should be consistent throughout a particular analysis, in order to assign proper criticality values. A hierarchical approach works well with systems, using a component approach at the lowest assembly level, and a functional approach to combine the effects of various subsystems. This also allows the FMEA to be performed even when some subsystems are not yet completely designed.
If CAD has been used on parts of the system, the FMEA can utilize that capability to simulate the effects of various failure modes. Such an analysis can greatly enhance the accuracy and objectivity of the FMEA process.
FMEA is an ongoing process which should be started as soon as initial design information is available. It should be updated and enhanced as the design evolves, so that the analysis can be used to improve the design. All possible design alternatives should be analyzed separately, so that the effect on system performance and reliability can be considered in deciding which option to implement. Test results should be used later to update the analysis.
Requirements to Perform FMEA
- A team of people with a commitment to improve the ability of the design to meet the customer's needs
- Schematics and block diagrams of each level of the system, from subassemblies to complete system
- Component specifications, parts lists and design data
- Functional specifications of modules, subassemblies, etc.
- Design manuals
- Manufacturing requirements and details of the processes to be used
- FMEA forms and a list of any special considerations, such as safety or regulatory, that are applicable to this product
Figure 2: FMEA flow chart.
Steps in Performing FMEA
1. Discuss and define system functional requirements (scope), including all modes of operation (list in order of decreasing importance). Is it for concept, system, design, process, product or service and customer needs?
2. Develop a functional block diagram and a reliability block diagram (Figure 1) for each subassembly being analyzed.
3. Define parameters and functions of each functional block required for successful operation of the system.
4. Using the FMEA forms to document the further steps, identify potential failure modes for each of the functional blocks.
5. Analyze system or subassembly functions affected by factors such as those in the list of FMEA considerations.
6. Identify all possible causes for each failure mode of the functional block being analyzed. The causes must be detailed to the component level wherever possible. These are potential failure modes. If necessary, go back and rewrite the function with more detail to be sure the failure modes show a loss of that function.
7. Identify all possible ways the failure modes could affect the functions of the higher level assemblies.
8. Assign the frequency, criticality, and detection values for each failure mode (Tables 1–3).
9. Obtain the RPN by multiplying the three values assigned in step 8. This priority number will allow us to focus on the most important failure modes first.
10. Determine all the possible root causes and corrective actions for each failure mode, and update the design status as it progresses.
11. Summarize the failure modes and corrective actions in order of decreasing RPN.
12. Focus on eliminating at least the 50% of the failure modes with the highest RPN.
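The last few steps above (computing the RPN, summarizing in decreasing order, and focusing on the top half) amount to a sort and a cut-off. A minimal sketch, assuming each failure mode is stored as a tuple of a description and its three ratings. The sample data and the `top_half_by_rpn` helper are hypothetical, and rounding up for an odd-length list is a choice made here, not something the procedure specifies.

```python
import math

# (description, occurrence, criticality, detection): hypothetical ratings
failure_modes = [
    ("no feedback", 4, 8, 3),
    ("memory lost", 2, 9, 5),
    ("resistor open", 3, 7, 4),
    ("bearing seizure", 2, 6, 2),
    ("connector intermittent", 6, 5, 7),
]


def rpn(mode):
    """Risk priority number: product of the three ratings in the tuple."""
    _, occurrence, criticality, detection = mode
    return occurrence * criticality * detection


def top_half_by_rpn(modes):
    """Return the highest-RPN half of the list, in decreasing RPN order."""
    ranked = sorted(modes, key=rpn, reverse=True)
    keep = math.ceil(len(ranked) / 2)  # round up so at least 50% is covered
    return ranked[:keep]


for mode in top_half_by_rpn(failure_modes):
    print(f"{rpn(mode):4d}  {mode[0]}")
```

With these invented ratings, three of the five modes make the cut, led by the intermittent connector.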
An example of a FMEA analysis is shown in Figure 3.
When analyzing system or subassembly functions affected by factors, consider this list:
When the failure modes have been rank ordered by RPN, corrective action should be directed first at the highest ranked concerns and critical items. A recommended action might be a design of experiments (Plackett-Burman or Taguchi method). The intent of any recommended action is to reduce the occurrence, severity and/or detection ranking.
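The effect of a recommended action can be checked by re-rating the failure mode and recomputing its RPN. The figures below are hypothetical. They assume a 1-to-10 scale and an action (for example, an added in-process test) that improves only the detection rating.

```python
def rpn(occurrence, criticality, detection):
    """Risk priority number as the product of the three ratings."""
    return occurrence * criticality * detection


# Hypothetical ratings before and after a corrective action.
before = {"occurrence": 6, "criticality": 5, "detection": 7}
after = dict(before, detection=3)  # the action improves detection only

rpn_before = rpn(**before)
rpn_after = rpn(**after)
reduction = 100 * (rpn_before - rpn_after) / rpn_before

print(f"RPN before: {rpn_before}, after: {rpn_after} ({reduction:.0f}% lower)")
```

Recording both values keeps the comparison visible on the worksheet, so the team can judge whether the action moved the failure mode far enough down the ranking.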
What is Process FMEA (P-FMEA)?
A process potential FMEA (P-FMEA) is an analytical technique utilized by manufacturing/process engineers as a means to assure that, to the extent possible, potential concerns have been considered and addressed. In its most rigorous form, a P-FMEA is a summary of the engineer’s thoughts (including an analysis of items that could go wrong based on experience and past concerns) as a process is developed. This systematic approach parallels and formalizes the mental discipline that an engineer normally goes through in any manufacturing planning process.
The P-FMEA identifies potential product-related process failure modes, assesses the potential downstream effects of the failures, identifies the potential manufacturing or assembly process causes, and ranks the failure modes according to their effect on the customer, thus establishing a priority system for corrective action considerations. The P-FMEA also documents the results of the manufacturing or assembly process.
Here is a list of activities for a P-FMEA:
1. Determine potential concerns that might cause failure modes for each process or subassembly, and the causes associated with the designing and manufacturing of a product.
2. Identify actions which could eliminate or reduce the chance of a potential process failure occurring.
3. Document the process and give each mode a numeric rating for frequency of occurrence, criticality, and probability of detection. Finally, multiply these three numbers together to obtain the RPN, which is used to guide the design effort to the most critical problems first.
Two aspects of P-FMEA are particularly important: a team approach, and timeliness. The team approach is vital because the broader the expertise that is brought to bear on making and assigning values to the failure mode list, the more effective the P-FMEA will be.
Timeliness is important because P-FMEA is primarily a preventive tool, which can help steer manufacturing development decisions between alternatives before failure modes are built-in, rather than reworking after the failure occurs. P-FMEA is equally applicable to hardware or software.
Steps in Performing P-FMEA
1. Discuss and define manufacturing and process functional requirements, including all modes of operation (list in order of decreasing importance).
2. Develop a functional block diagram and a reliability block diagram (Figure 1) for each manufacturing process being analyzed.
3. Define parameters and functions of each functional block required for successful operation of the process.
4. Using the FMEA forms to document the further steps, identify potential failure modes for each of the functional blocks.
5. Analyze process or subassembly functions affected by factors such as those in the list of FMEA considerations.
6. Identify all possible causes for each failure mode of the functional block being analyzed. The causes must be detailed to the component level wherever possible.
7. Identify all possible ways the failure modes could affect the functions of the higher level manufacturing steps or processes or assemblies.
8. Assign the frequency, criticality, and detection values for each failure mode. (Tables 1–3).
9. Obtain the RPN by multiplying the three values assigned in step 8. This priority number will allow us to focus on the most important failure modes first.
10. Determine corrective actions for each failure mode, and update the manufacturing documentation as it progresses.
11. Summarize the failure modes and corrective actions in order of decreasing RPN.
12. Focus on eliminating at least the 50% of the failure modes with the highest RPN.
FMEA and P-FMEA are continuing processes that should be initiated at investigation/launch time of the design cycle and then be regularly updated as changes occur throughout the phases of product development. FMEA must be completed before the design is frozen, for its purpose is to affect the design and harden it against failures due to causes that can be anticipated. Potential manufacturing or assembly concerns known by the design engineer should be conveyed to the production engineers, using means such as the P-FMEA team meetings.
References
1. Procedures for Performing FMECA, MIL-STD-162, available from National Technical Information Service, Springfield VA 22161.
2. O'Connor, P.D.T., Practical Reliability Engineering, 2nd Edition, John Wiley & Sons, 1985. ISBN 0-471-90551-8.
3. Potential Failure Modes and Effects Analysis, Ford Motor Co., Sept. 1988. Available from FORD, Electronics Division, P.O. Box 6010, Dearborn, MI 48121-6010. Attn: Supplier Quality Manager.
4. Ireson & Coombs, Handbook of Reliability Engineering and Management, McGraw-Hill, 1988. ISBN 0-07-032039-X.
- Reliability in Product Design and Testing, American Supplier Institute, 6 Parkland Blvd. Suite 411, Dearborn MI 48126. Contact: (313) 336-8877.
- Applied Reliability Engineering and Product Assurance for Engineers and Managers, Univ. of Maryland, Center for Professional Development, University Blvd. at Adelphi Rd., College Park MD 20742-1668. Contact: (301) 985-7157.
- Reliability Training Program (videotape seminar), Technicomp Inc., 1111 Chester Avenue, 300 Park Plaza, Cleveland OH 44114. Contact: (216) 687-1122.
- PC FMECA (IBM PC Program), Management Sciences Inc., 6022 Constitution Ave. NE, Albuquerque, NM 87110. Contact: (505) 255-8611.
- TreeMaster (IBM PC Program), Management Sciences Inc., 6022 Constitution Ave. NE, Albuquerque, NM 87110. Contact: (505) 255-8611.
- Results (FTA Program), Management Sciences Inc., 6022 Constitution Ave. NE, Albuquerque, NM 87110. Contact: (505) 255-861.
Happy Holden has worked in printed circuit technology since 1970, with Hewlett-Packard, NanYa/Westwood, Merix, Foxconn and Gentex. He is the co-editor, with Clyde Coombs, of the Printed Circuit Handbook, 7th Ed.