FMEA (Failure Modes and Effects Analysis)
Many companies use FMEA as a central pillar of their design process. FMEA provides a structured approach to analysing the root causes of failure, estimating their severity or impact, and assessing the effectiveness of prevention strategies. The ultimate output is a set of action plans to prevent, detect or reduce the impact of potential modes of failure. In a nutshell, it encourages the design team to consider:
- What could go wrong
- How badly it might go wrong
- What needs to be done to prevent or mitigate the problem
FMEA emerged from the US Military in the late 1940s as a tool to improve the evaluation of reliability of equipment. Its benefits quickly became apparent and it was adopted by aerospace industries and NASA during the Apollo programme in the 1960s. It was later taken up by many of the larger automotive companies, including Ford in the 1970s. It has since become a core tool in product development in many organisations and is recommended as a part of an organisation's quality management system.
The basic logic can be applied at a number of levels, including organisational issues, strategy issues, product design issues, production processes and individual components. Typically, it is used to analyse either a product design or production process:
Product or Design FMEA
What could go wrong with a product while in service as a result of a weakness in design?
- Carried out during the early stages of a design project
- Tends to assume that the product will be produced to the required design specifications
- Aims to reduce reliance on process controls and inspection to overcome limitations in the basic design, and thus needs to consider the technical and physical limitations of the manufacturing and assembly processes
Process FMEA
What could go wrong with a product during manufacture or while in service as a result of non-compliance to specification or design?
Typically, the information is collated and presented in a tabular format, with the following header fields and columns:
1. Level of analysis
The analysis can be carried out at a project, product, system, subsystem or component level. It is important to be clear about the level at which the current analysis is taking place. A hierarchical organisation of analysis enables the design team to drill down to detail where appropriate.
2. Date & prepared by
To record who was involved and when the analysis took place.
3. FMEA number & reference information
Clear numbering is important, to enable the team to trace an analysis from system to component level. It may also be important to reference any important test results, documents or drawings here.
4. System / component / function
The specific name / number of the element or issue under study.
5. Potential Failure Modes
The manner in which a component, subsystem or system could possibly fail while being used. Here the design team must be creative in seeking ideas for all potential modes of failure. Ask open and general questions: How can it fail? Under what conditions? What types of use? etc.
6. Potential Effects of Failure
For each mode of failure, what will the likely effect be? How would the failure affect different stakeholders? What will be the likely outcomes if the system or component fails? Provide as detailed a description as is necessary of the potential impact of failure. An individual failure mode may have many possible effects.
7. Severity rating
Each failure effect can be judged for its potential seriousness. Typically, this is done by scoring the effect on a 1 to 5 (or 10) scale. This value should be discussed and negotiated by all members of the team. A team may wish to define for itself the severity that goes with each score. A suggested scheme is shown below:
5 (9-10) With potential safety risk or legal problems - potential loss of life or major dissatisfaction
4 (7-8) High potential customer dissatisfaction - serious injury or significant mission disruption
3 (5-6) Medium potential customer dissatisfaction - potential small injury, mission inconvenience / delay
2 (3-4) The customer may notice the potential failure and may be a little dissatisfied - annoyance
1 (1-2) The customer will probably not detect the failure - undetectable
8. Classification
A column is provided to enable the rapid identification of potentially critical failures which must be addressed (e.g. safety issues, sales issues etc.).
9. Potential Cause / Mechanisms of Failure
Each failure mode will have an underlying root cause. Thus, it is important to spend time establishing the potential root causes or mechanisms of failure, by asking 'what is the likely cause of the failure mode?' Possible causes could include: wrong tolerances, poor alignment, operator error, component missing, fatigue, defective components, maintenance required, environment etc.
10. Occurrence Ranking
It is also necessary to consider the likelihood of the potential failure occurring. Here, a 'probability' assessment is made by the team and scored on a 1 to 5 (or 10) scale. Possible occurrence ratings (you can define them in other ways) are shown below:
5 (9-10) Very high probability of occurrence
4 (7-8) High probability of occurrence
3 (5-6) Moderate probability of occurrence
2 (3-4) Low probability of occurrence
1 (1-2) Remote probability of occurrence
This section is critical in the FMEA procedure and each of the responses categorised as very high or high should be considered and addressed.
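As a minimal sketch of this screening step, the Python snippet below picks out every entry whose occurrence rating falls in the high or very high band of the 5-point scale above. The worksheet rows and the helper name needs_attention are hypothetical and used only for illustration.

```python
# Hypothetical worksheet rows: (failure mode, occurrence rating on the 1 to 5 scale).
entries = [
    ("Seal leaks under pressure", 4),
    ("Label fades in sunlight", 2),
    ("Connector works loose", 5),
    ("Housing cracks when dropped", 3),
]

HIGH_OCCURRENCE = 4  # 4 = high, 5 = very high on the suggested 5-point scale


def needs_attention(rows, threshold=HIGH_OCCURRENCE):
    """Return the failure modes rated at or above the threshold."""
    return [mode for mode, occurrence in rows if occurrence >= threshold]


flagged = needs_attention(entries)
print(flagged)
```

A team using the 10-point scale could pass threshold=7 instead, which matches the start of the high band in the table above.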
11. Current design controls
Are there any design controls which aim to reduce or eliminate the potential failure? These could include labels, barriers, instructions or total redesigns. Other controls could include prototyping, evaluation or possibly market surveys.
12. Detection rating
The final rating aims to establish how 'detectable' the potential fault will be. Will it be instantly noticeable, or will it not be apparent? In addition, how likely is it that the controls listed will enable the detection of the potential failure? Suggested ratings on a scale of 1 to 5 (or 10):
5 (9 or 10) Zero probability of detecting the potential failure cause
4 (7 or 8) Close to zero probability of detecting potential failure cause
3 (4, 5 or 6) Not likely to detect potential failure cause
2 (2 or 3) Good chance of detecting potential failure cause
1 (1) Almost certain to identify potential failure cause
If the FMEA is being carried out at a 'project' level, then it can be beneficial to consider this value as 'reactability'. Will it be possible to react to the failure rapidly enough to reduce its impact sufficiently?
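The detection bands above are not evenly spaced, so converting between the 10-point and 5-point scales is easiest with a lookup. The sketch below encodes the suggested bands exactly as listed; the names DETECTION_BANDS and to_five_point are illustrative assumptions, and a team that defines its own bands would change the table accordingly.

```python
# Bands taken from the suggested detection scale above:
# 5-point score -> the 10-point scores it covers.
DETECTION_BANDS = {
    1: (1,),
    2: (2, 3),
    3: (4, 5, 6),
    4: (7, 8),
    5: (9, 10),
}


def to_five_point(score_out_of_ten):
    """Map a 1 to 10 detection score to the matching 1 to 5 score."""
    for five_point, ten_point_scores in DETECTION_BANDS.items():
        if score_out_of_ten in ten_point_scores:
            return five_point
    raise ValueError(
        f"detection score must be an integer from 1 to 10, got {score_out_of_ten!r}"
    )


print([to_five_point(s) for s in range(1, 11)])
```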
13. Risk Priority Number (RPN)
It is likely that the team will have identified many possible failure modes and effects. Each one needs to be assigned a 'Risk Priority Number' to enable the prioritisation of mitigating action. The RPN is simply the product of the severity, occurrence and detection ratings:
RPN = Severity rating x Occurrence rating x Detection rating
- perhaps more easily remembered as:
RPN = S x O x D
The RPN value gives an indicator of the design risk and generally, the items with the highest RPN and severity ratings should be given first consideration.
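To make the arithmetic concrete, the sketch below computes the RPN for a few invented failure modes on the 5-point scales and ranks them. The class name FailureMode, its field names and the example rows are assumptions made for illustration. Sorting by RPN first and using severity to break ties is only one possible reading of the guidance above.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    """One worksheet row; field names are illustrative only."""
    description: str
    severity: int    # 1 to 5
    occurrence: int  # 1 to 5
    detection: int   # 1 to 5

    @property
    def rpn(self) -> int:
        # RPN = Severity x Occurrence x Detection
        return self.severity * self.occurrence * self.detection


rows = [
    FailureMode("Seal leaks under pressure", severity=4, occurrence=3, detection=2),
    FailureMode("Connector works loose", severity=3, occurrence=4, detection=2),
    FailureMode("Label fades in sunlight", severity=1, occurrence=2, detection=1),
    FailureMode("Battery overheats", severity=5, occurrence=1, detection=4),
]

# Highest RPN first; ties are broken in favour of the more severe effect.
ranked = sorted(rows, key=lambda r: (r.rpn, r.severity), reverse=True)

for row in ranked:
    print(f"{row.rpn:>3}  S={row.severity}  {row.description}")
```

In this example the battery failure has the highest severity but only the third-highest RPN, which illustrates why a team may choose to review the highest severity items separately rather than rely on the RPN alone.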
14. Recommended actions
Follow-up is essential, as are actions to reduce the impact or likelihood of failure. These actions should be specific and preferably measurable. Attention should be given to actions that address the root cause and not the symptoms.
Finally, all actions should be clearly allocated (to an individual, department and/or organisation) and a clear deadline given.
16. Additional columns if wanted:
Some FMEA users add columns to record the actual actions taken or to track the status of actions. It can also be a good idea to revise the RPN value following the corrective action. This enables full traceability between potential problems and the outcomes of actions.
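A revised RPN can be recorded alongside the original, as in the minimal sketch below. The record layout, the field names and the ratings are hypothetical. The example assumes the corrective action lowers the occurrence rating only.

```python
def rpn(severity, occurrence, detection):
    """Risk Priority Number: the product of the three ratings."""
    return severity * occurrence * detection


# Hypothetical record for one failure mode, before and after corrective action.
record = {
    "failure_mode": "Connector works loose",
    "action_taken": "Added locking clip",
    "status": "complete",
    "before": {"severity": 3, "occurrence": 4, "detection": 2},
    "after": {"severity": 3, "occurrence": 1, "detection": 2},
}

rpn_before = rpn(**record["before"])
rpn_after = rpn(**record["after"])

print(
    f"{record['failure_mode']}: RPN {rpn_before} -> {rpn_after} "
    f"({record['action_taken']})"
)
```

Keeping both sets of ratings, rather than overwriting the original, preserves the link between the potential problem and the outcome of the action.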