As mentioned in the introduction to the AIAG/VDA aligned standard (“Vol. V: Alignment”), the new FMEA Handbook is a significant expansion of its predecessors. A substantial portion of this expansion is the introduction of a new FMEA type – the Supplemental FMEA for Monitoring and System Response (FMEA-MSR).
Modern vehicles contain a plethora of onboard diagnostic tools and driver aids. The FMEA-MSR is conducted to evaluate these tools for their ability to prevent or mitigate Effects of Failure during vehicle operation.
Discussion of FMEA-MSR is devoid of comparisons to classical FMEA, as it has no correlate in that method. In this installment of the “FMEA” series, the new analysis will be presented in similar fashion to the previous aligned FMEA types. Understanding the aligned Design FMEA method is critical to successful implementation of FMEA-MSR; this presentation assumes the reader has attained sufficient competency in DFMEA. Even so, review of aligned DFMEA (Vol. VI) is highly recommended prior to pursuing FMEA-MSR.
Following the precedent of DFMEA and PFMEA, the “Standard FMEA-MSR Form Sheet” (“Form H” in Appendix A of the AIAG/VDA FMEA Handbook) is color-coded, but it otherwise deviates significantly from those forms. As shown in Exhibit 1, the FMEA-MSR form contains information from DFMEA (Steps 1 – 6, in the upper portion of the stacked format). The information needed for FMEA-MSR may not be readily available in the original DFMEA, however, as the control systems may have been deemed out of scope. Therefore, the structure tree and original analysis may need to be expanded prior to initiating the new method of analysis. The lower portion of the stacked format, containing information from the Supplemental FMEA (Steps 5 – 6), represents this new method. This format reinforces that FMEA-MSR is a supplemental analysis based on DFMEA.
Like previous installments in this series, portions of the form are shown in close-up, with reference bubbles correlating to discussion in the text. The goal is to facilitate learning this new analysis method by maintaining the links to the 7-step approach, DFMEA, form sections, and individual columns where information is recorded.
Conducting a Supplemental FMEA for Monitoring and System Response
Supplemental Failure Modes and Effects Analysis for Monitoring and System Response is conducted in the same three stages and seven steps followed to complete a DFMEA or PFMEA. A pictorial representation of the Seven Step Approach is reproduced in Exhibit 2. Though there are few hints exclusive to FMEA-MSR, this summary diagram should be referenced as one proceeds through this presentation and while conducting an FMEA-MSR to ensure effective analysis.
1st Step – Planning & Preparation
Developing a project plan, based on the Five Ts (see Vol. V), is once again recommended to provide structure for the analysis. The context limitations (new or modified design or application) remain, as do the levels of analysis. Supplemental analysis differs from DFMEA in that FMEA-MSR is focused on vehicle systems that serve diagnostic, notification, and intervention functions during vehicle operation. These systems are often referenced in the Risk Analysis (Step 5) section of the DFMEA form.
To assist in scope definition, the Handbook proffers three basic questions; affirmative responses indicate the subject failure chain is appropriately in the scope of analysis.
1) “After completing a DFMEA on an Electrical/Electronic/Programmable Electronic System, are there effects that may be harmful to persons or involve regulatory noncompliance?” Such Effects are typically assigned the highest Severity scores (S = 9 – 10).
2) “Did the DFMEA indicate that all of the causes which lead to harm or noncompliance can be detected by direct sensing and/or plausibility algorithms?” This suggests the ideal Detection score (D = 1).
3) “Did the DFMEA indicate that the intended system response to any and all of the detected causes is to switch to a degraded operational state (including disabling the vehicle), inform the driver and/or write a Diagnostic Trouble Code (DTC) into the control unit for service purposes?” The answer to this question is typically found in the Current Controls portion of the DFMEA, as mentioned above.
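One way to read the three scoping questions is as a simple conjunction: a failure chain is treated as in scope only when all three are answered affirmatively. The minimal Python sketch below illustrates that reading. The class, field, and function names are hypothetical, chosen for illustration only; they do not come from the Handbook, and the "all three" rule is an assumption about how the answers are combined.

```python
from dataclasses import dataclass


@dataclass
class ScopeAnswers:
    """Answers to the three scoping questions (hypothetical field names)."""
    harmful_or_noncompliant_effect: bool     # Question 1
    all_causes_detectable: bool              # Question 2
    response_is_degrade_inform_or_dtc: bool  # Question 3


def in_msr_scope(answers: ScopeAnswers) -> bool:
    """Return True only when all three questions are answered 'yes'.

    Assumption: the three answers are combined with a logical AND.
    """
    return (
        answers.harmful_or_noncompliant_effect
        and answers.all_causes_detectable
        and answers.response_is_degrade_inform_or_dtc
    )
```

In this sketch, a failure chain with a harmful Effect whose Causes cannot all be detected would be reported as out of scope, which may prompt the team to revisit the DFMEA before proceeding.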
Deriving the Supplemental analysis from the DFMEA suggests that the core team remains the same; in particular, Design Responsibility should remain with the same individual. The extended team members require expertise in diagnostic systems and the vehicle functions they are used to monitor.
Refer to “Vol. VI: ‘Aligned’ DFMEA” and “Vol. III: ‘Classical’ DFMEA” for more information on the “Planning and Preparation (Step 1)” section of the FMEA-MSR form.
The “History/Change Authorization” column is carried over from DFMEA. The manner of use in prior analyses should be continued per established organizational norms.
2nd Step – Structure Analysis
In the Structure Analysis step, the system or element named in Step 1 is defined using a structure tree or block diagram. Only those elements, identified in the DFMEA, with Causes of Failure that show potential for harm to persons or regulatory noncompliance (see question 1, Step 1, above) are in scope. Analysis can be performed at vehicle level (OEM) or at a subsystem level (supplier).
System interfaces may also be in scope. When a control unit is in scope, but a sensor from which it receives a signal, or actuator to which it transmits a signal, is not, the connector that allows the device to communicate with the control unit (i.e. interface) should be included in the analysis. The signal status (intermittent, lost, incorrect, etc.) can then be considered in analysis of the control unit and the connected device via this interface. An example of this type of interface analysis is depicted in Exhibit 3, where an input/output connector links two structure trees.
Contrary to the “assume all inputs are acceptable” philosophy of previous FMEAs, erroneous signals from elements outside the scope of responsibility of the design team should be included in the analysis. The objective is to ensure safety and compliance, even when system inputs are corrupted.
The Structure Analysis (Step 2) section of the FMEA-MSR form is a subset of the completed DFMEA. Information can be transferred from the DFMEA and structure tree previously created or expanded to include diagnostic elements.
3rd Step – Function Analysis
Once again, the structure tree and DFMEA serve as the sources from which the Function Analysis (Step 3) section of the FMEA-MSR form is completed, as shown in Exhibit 4. A P-Diagram could also be used to organize and visualize the diagnostic system elements. Functions of interest include those associated with failure detection and response. Elements may include hardware, software, and communication signals; interface elements may be included when the connected device is out of scope.
Failure Analysis and Risk Mitigation
4th Step – Failure Analysis
For purposes of analysis, the diagnostic systems are assumed to function as intended. That is, FMEA-MSR is conducted to evaluate the sufficiency of designed system responses, not the system’s reliability. Causes of Failure of the diagnostic systems themselves (undetected faults, false alarms, etc.) can be included in the DFMEA, however.
Supplemental FMEA-MSR requires consideration of three failure scenarios that differ in the occurrence of a hazardous event, detection of a failure, and effectiveness of failure response. The first scenario involves a fault condition that causes malfunctioning behavior (Failure Mode) resulting in Effects of Failure, but no hazardous event. No detection or response occurs, though a noncompliant condition may be created. This scenario is depicted in Exhibit 5.
The second scenario differs from the first in that a hazardous event occurs, as depicted in Exhibit 6. Again, no detection or response occurs; this may be due to a slow or inoperable monitoring or intervention mechanism. The time available for a system response to prevent a hazardous event is the Fault Handling Time Interval – the time elapsed between occurrence of a fault and a resultant hazardous event. The minimum Fault Handling Time Interval is the Fault Tolerant Time Interval – the time elapsed between occurrence of a fault and Effect of Failure. In Scenario 2, the monitoring system in place, if any, is insufficient to prevent the hazardous event.
The third scenario incorporates fault detection and system response to mitigate Effects of Failure and prevent occurrence of a hazardous event. In this scenario, depicted in Exhibit 7, the user experiences a partial loss or degradation of function (mitigated Effect of Failure) in lieu of the hazardous event. System response time must be less than the Fault Tolerant Time Interval to ensure consistent performance in service.
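The distinctions among the three failure scenarios can be summarized in a few lines of code. The following Python sketch is a simplification under stated assumptions: the names are hypothetical, times are expressed in seconds, and a response is treated as effective only when it completes in strictly less time than the Fault Tolerant Time Interval. A failure chain that cannot produce a hazardous event is classified as Scenario 1 whether or not monitoring exists.

```python
from enum import Enum
from typing import Optional


class FailureScenario(Enum):
    NON_HAZARDOUS = 1  # Scenario 1: Effects of Failure, but no hazardous event
    HAZARDOUS = 2      # Scenario 2: hazardous event; response absent or too slow
    MITIGATED = 3      # Scenario 3: fault detected; mitigated Effect of Failure


def classify_scenario(
    can_cause_hazard: bool,
    response_time_s: Optional[float],
    fault_tolerant_time_s: float,
) -> FailureScenario:
    """Classify a failure chain into one of the three scenarios.

    response_time_s is None when no monitoring or response is in place.
    """
    if not can_cause_hazard:
        return FailureScenario.NON_HAZARDOUS
    # The response must finish before the hazardous event can occur.
    if response_time_s is not None and response_time_s < fault_tolerant_time_s:
        return FailureScenario.MITIGATED
    return FailureScenario.HAZARDOUS
```

For example, a hazardous failure chain with a 0.2 s response and a 0.5 s Fault Tolerant Time Interval (illustrative values only) is classified as MITIGATED; with no monitoring in place, the same chain is classified as HAZARDOUS.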
The final failure scenario can also be represented by the “hybrid failure chain” depicted in Exhibit 8. The hybrid failure chain adds a chain parallel to the DFMEA failure chain; it includes monitoring for Causes of Failure, the response activated when a fault is detected, and the substitute, or mitigated, Effects of Failure. In the Monitoring and System Response (MSR) scenario, analysis is shifted from the DFMEA failure chain to the MSR failure chain (lower portion of the stacked-format FMEA-MSR form). An example hybrid failure chain, in structure tree format, is shown in Exhibit 9. This includes a description of intended behavior (Failure Mode) and mitigated Effects.
An Effect of Failure in FMEA-MSR can be a malfunctioning behavior, as in DFMEA; it can also be an intended behavior or designed response to detection of a Cause of Failure. If the designed response (“intervention”) is effective, a “safe state” will result, albeit with some loss of function. In Scenarios 1 and 2, the Failure Mode remains that in the DFMEA. This is the failure chain recorded in the Failure Analysis (Step 4) section of the FMEA-MSR form, as shown in Exhibit 10.
To complete the Failure Analysis section, assign a Severity (S) score to the Effect of Failure. The same Severity criteria table is used in DFMEA and FMEA-MSR; it is reproduced in Exhibit 11. Refer to Vol. VI for further guidance on assigning Severity scores and use of the criteria table.
5th Step – Risk Analysis
Risk Analysis is the first step for which the lower (“new”) section of the FMEA-MSR is used. Additional information is needed in the DFMEA (upper) section, however, creating two branches of analysis. In DFMEA, identify the Prevention Controls that allow the diagnostic system to reliably detect Causes of Failure and initiate the system response within the Fault Tolerant Time Interval. Also, identify the Detection Controls in use to verify the diagnostic system’s effectiveness; simulation and other development tests are often cited.
Thus far, the discussion has focused on conducting DFMEA in a slightly different way than previously presented. However, the remainder of the DFMEA section is completed as described in “Vol. VI: ‘Aligned’ DFMEA.” The remainder of this presentation, therefore, can focus on the unique aspects of the Supplemental FMEA for Monitoring and System Response and completing the FMEA-MSR form.
The Frequency (F) rating is comparable to the Occurrence score in DFMEA and PFMEA. It is an estimate of the frequency of occurrence of the Cause of Failure in applicable operating conditions during the vehicle’s planned service life. In the Frequency criteria table, shown in Exhibit 13, select the statement that best describes the likelihood of the Cause of Failure occurring; record the corresponding F rating in column B of the FMEA-MSR form (Exhibit 12).
Note that the qualifier emphasized above – in applicable operating conditions – may be invoked to adjust the F rating. The conditions in which this can be done are outlined in the note below the F table in Exhibit 13. Stated generally, the Frequency rating may be reduced if the conditions in which the Cause of Failure leads to the associated Effect of Failure exist only during a small portion of the time the vehicle is in operation.
If the F rating is reduced for this reason, the relevant operating conditions are recorded in column A of the form – “Rationale for Frequency.” Also identified here are any other information sources used to support the Frequency rating assigned. Examples include Design and Process FMEAs (e.g. Prevention Controls), test results, field data, and warranty claims history.
In column C, “Current Diagnostic Monitoring,” identify all elements of the diagnostic system that contribute to detection of the Cause of Failure, Failure Mode, or Effect of Failure by the system or the operator of the vehicle. In the “Current System Response” column (D), describe the system’s reaction to a fault condition. Include any functions that are partially or fully disabled, fault codes recorded, messages to driver, etc. If no monitoring is in place, describe the hazardous condition created by the failure.
To assess the effectiveness of the controls in detecting a fault condition and activating an appropriate and sufficient response to prevent a hazardous event, a Monitoring (M) rating is assigned. To do this, consult the Monitoring criteria table, shown in Exhibit 14. In each of the “Diagnostic Monitoring” and “System Response” columns, select the statement that best describes the system’s performance. Record the higher of the corresponding M ratings in column E of the FMEA-MSR form (Exhibit 12).
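The "higher of the two" rule for the M rating is simple enough to express directly. The Python sketch below assumes that both candidate ratings have already been read from the criteria table and that ratings fall on a 1 to 10 scale; the function and parameter names are hypothetical.

```python
def monitoring_rating(diagnostic_monitoring: int, system_response: int) -> int:
    """Return the M rating to record in column E.

    Each argument is the rating read from the corresponding column of the
    Monitoring criteria table; the higher (worse) of the two is recorded.
    """
    for name, value in (
        ("diagnostic_monitoring", diagnostic_monitoring),
        ("system_response", system_response),
    ):
        if not 1 <= value <= 10:  # assumed 1-10 rating scale
            raise ValueError(f"{name} must be between 1 and 10, got {value}")
    return max(diagnostic_monitoring, system_response)
```

Taking the higher value reflects that the weaker of the two capabilities, detection or response, limits the overall effectiveness of the control.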
There are three possible fault-monitoring scenarios, but only one requires significant depth of analysis. The first scenario is one in which the diagnostic system is ineffective or none has been implemented. In this case, M = 10 and no adjustment is made.
The second scenario is the opposite of the first – the diagnostic system is reliable, consistently producing a mitigated Effect while preventing a hazardous event. In this scenario, M = 1 and no adjustment is made.
The third fault-monitoring scenario is “in between” or a hybrid of the first two. This scenario involves a diagnostic system that is partially effective; that is, the fault condition may be, but will not always be, detected. The amount of adjustment in the M rating depends on the proportion of occurrences that the system is expected to detect; “diagnostic coverage” estimates are given in the “Diagnostic Monitoring” criteria descriptions (see Exhibit 14). A demonstration of this scenario’s rating method is shown pictorially in Exhibit 15.
In column F, describe the mitigated Effect of Failure resulting from the diagnostic system’s intervention after detecting the fault condition. Record the Severity score associated with the mitigated Effect in column G. In column H, copy the Severity score, from Step 4, corresponding to the original, unmitigated Effect of Failure.
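For teams that maintain FMEA data in software rather than on a spreadsheet, columns A through H of the Risk Analysis section could be captured in a single record. The Python sketch below is one possible layout, not a format prescribed by the Handbook; the class and field names are hypothetical, and a 1 to 10 scale is assumed for every rating.

```python
from dataclasses import dataclass


def _check_rating(name: str, value: int) -> None:
    """Reject ratings outside the assumed 1-10 scale."""
    if not 1 <= value <= 10:
        raise ValueError(f"{name} must be between 1 and 10, got {value}")


@dataclass
class MsrRiskAnalysisRow:
    """One failure chain in the FMEA-MSR Risk Analysis (Step 5) section."""
    rationale_for_frequency: str        # column A
    frequency: int                      # column B (F rating)
    current_diagnostic_monitoring: str  # column C
    current_system_response: str        # column D
    monitoring: int                     # column E (M rating)
    mitigated_effect: str               # column F
    mitigated_severity: int             # column G (S of mitigated Effect)
    original_severity: int              # column H (S from Step 4)

    def __post_init__(self) -> None:
        # Validate every numeric rating when the row is created.
        _check_rating("frequency", self.frequency)
        _check_rating("monitoring", self.monitoring)
        _check_rating("mitigated_severity", self.mitigated_severity)
        _check_rating("original_severity", self.original_severity)
```

Keeping both Severity scores in the same record mirrors the form, where the original and mitigated values sit side by side for use in assigning the Action Priority.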
Using the lookup table in Exhibit 16, assign an Action Priority (AP) to the failure chain. The three possible priorities have similar definitions to those in previously presented FMEAs:
To determine the correct AP, the following guidelines have been established (see FMEA-MSR AP Table end notes):
6th Step – Optimization
In the Optimization step, improvement activities are identified to reduce risk, assigned to individuals for implementation, monitored for progress, and evaluated for effectiveness. Relevant information is recorded in the Optimization section of the FMEA-MSR form, shown in Exhibit 17.
The Handbook asserts that the most effective sequence for Optimization is as follows:
Column S of the FMEA-MSR form (Exhibit 17) is used to identify the status of each action plan. The notation used in DFMEA should also be used in FMEA-MSR; see Vol. VI for suggestions.
The grouped “columns T” are used to record predicted values of F, M, and AP. The original Severity score (from Step 4) is also repeated here because it may be needed to determine the proper Action Priority to assign. The AP is determined according to the same guidelines set forth in Risk Analysis (Step 5).
7th Step – Results Documentation
The recommendations for preparation of an FMEA report following FMEA-MSR are identical to those presented in the discussion of the aligned Design FMEA. Therefore, the reader is referred to “Vol. VI: DFMEA” and the AIAG/VDA Handbook for details.
A Supplemental FMEA for Monitoring and System Response is a worthwhile extension of the FMEA Handbook and of product development and analysis. It reflects a recognition of the critical role that driver aids, communication, advanced safety systems, and other current and emerging technologies play in modern vehicle development. Regulatory compliance, passenger safety, operator convenience, and many other aspects of customer experience rely on these technologies to meet evolving expectations. Predictable performance and minimized risk are increasingly important as vehicles continue to become more sophisticated and customers more discerning. As this trend continues, Monitoring and System Response may be the greatest source of competitive advantage currently available to automakers.
For a directory of “FMEA” volumes on “The Third Degree,” see Vol. I: Introduction to Failure Modes and Effects Analysis.
“FMEA Handbook.” Automotive Industry Action Group and VDA QMC, 2019.
Jody W. Phelps, MSc, PMP®, MBA
JayWink Solutions, LLC
© JayWink Solutions, LLC