JayWink Solutions - Blog

FMEA – Vol. VIII:  Monitoring and System Response
     As mentioned in the introduction to the AIAG/VDA aligned standard (“Vol. V:  Alignment”), the new FMEA Handbook is a significant expansion of its predecessors.  A substantial portion of this expansion is the introduction of a new FMEA type – the Supplemental FMEA for Monitoring and System Response (FMEA-MSR).
     Modern vehicles contain a plethora of onboard diagnostic tools and driver aids.  The FMEA-MSR is conducted to evaluate these tools for their ability to prevent or mitigate Effects of Failure during vehicle operation.
     Discussion of FMEA-MSR is devoid of comparisons to classical FMEA, as it has no correlate in that method.  In this installment of the “FMEA” series, the new analysis will be presented in similar fashion to the previous aligned FMEA types.  Understanding the aligned Design FMEA method is critical to successful implementation of FMEA-MSR; this presentation assumes the reader has attained sufficient competency in DFMEA.  Even so, review of aligned DFMEA (Vol. VI) is highly recommended prior to pursuing FMEA-MSR.
     Following the precedent of DFMEA and PFMEA, the “Standard FMEA-MSR Form Sheet” (“Form H” in Appendix A of the AIAG/VDA FMEA Handbook) is color-coded, but otherwise deviates significantly from its predecessors.  As shown in Exhibit 1, the FMEA-MSR form contains information from DFMEA (Steps 1 – 6, in the upper portion of the stacked format).  The information needed for FMEA-MSR may not be readily available in the original DFMEA, however, as the control systems may have been deemed out of scope.  Therefore, the structure tree and original analysis may need to be expanded prior to initiating the new method of analysis.  The lower portion of the stacked format, containing information from the Supplemental FMEA (Steps 5 – 6), represents this new method.  This format reinforces that FMEA-MSR is a supplemental analysis based on DFMEA.
     As in previous installments of this series, portions of the form are shown in close-up, with reference bubbles correlating to discussion in the text.  The goal is to facilitate learning this new analysis method by maintaining the links to the 7-step approach, DFMEA, form sections, and individual columns where information is recorded.
 
Conducting a Supplemental FMEA for Monitoring and System Response
     Supplemental Failure Modes and Effects Analysis for Monitoring and System Response is conducted in the same three stages and seven steps followed to complete a DFMEA or PFMEA.  A pictorial representation of the Seven Step Approach is reproduced in Exhibit 2.  Though there are few hints exclusive to FMEA-MSR, this summary diagram should be referenced as one proceeds through this presentation and while conducting an FMEA-MSR to ensure effective analysis.
System Analysis
1st Step – Planning & Preparation
     Developing a project plan, based on the Five Ts (see Vol. V), is once again recommended to provide structure for the analysis.  The context limitations (new or modified design or application) remain, as do the levels of analysis.  Supplemental analysis differs from DFMEA in that FMEA-MSR focuses on vehicle systems that serve diagnostic, notification, and intervention functions during vehicle operation.  These systems are often referenced in the Risk Analysis (Step 5) section of the DFMEA form.
     To assist in scope definition, the Handbook proffers three basic questions; affirmative responses indicate the subject failure chain is appropriately in the scope of analysis.
1) “After completing a DFMEA on an Electrical/Electronic/Programmable Electronic System, are there effects that may be harmful to persons or involve regulatory noncompliance?”  Such Effects are typically assigned the highest Severity scores (S = 9 – 10).
2) “Did the DFMEA indicate that all of the causes which lead to harm or noncompliance can be detected by direct sensing and/or plausibility algorithms?”  This suggests the ideal Detection score (D = 1).
3) “Did the DFMEA indicate that the intended system response to any and all of the detected causes is to switch to a degraded operational state (including disabling the vehicle), inform the driver and/or write a Diagnostic Trouble Code (DTC) into the control unit for service purposes?”  The answer to this question is typically found in the Current Controls portion of the DFMEA, as mentioned above.
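     To make the scoping decision concrete, the minimal sketch below (in Python) treats affirmative answers to all three questions as the condition for including a failure chain in the FMEA-MSR scope.  The function name and boolean arguments are illustrative assumptions, not Handbook terminology.

def msr_in_scope(harmful_or_noncompliant_effects: bool,
                 causes_detectable_in_operation: bool,
                 degraded_state_or_driver_warning: bool) -> bool:
    """Return True if the failure chain belongs in the FMEA-MSR scope.

    Each argument corresponds to one of the Handbook's three screening
    questions; affirmative answers to all of them place the failure chain
    in scope for Supplemental analysis.
    """
    return (harmful_or_noncompliant_effects
            and causes_detectable_in_operation
            and degraded_state_or_driver_warning)

# Example: S = 9-10 Effects, detectable Causes, and a designed degraded-mode
# response indicate the failure chain should be analyzed with FMEA-MSR.
print(msr_in_scope(True, True, True))  # True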
     Deriving the Supplemental analysis from the DFMEA suggests that the core team remains the same; in particular, Design Responsibility should remain with the same individual.  The extended team members require expertise in diagnostic systems and the vehicle functions they are used to monitor.
     Refer to “Vol. VI:  ‘Aligned’ DFMEA” and “Vol. III: ‘Classical’ DFMEA” for more information on the “Planning and Preparation (Step 1)” section of the FMEA-MSR form.
 
Continuous Improvement
     The “History/Change Authorization” column is carried over from DFMEA.  The manner of use in prior analyses should be continued per established organizational norms.
 
2nd Step – Structure Analysis
     In the Structure Analysis step, the system or element named in Step 1 is defined using a structure tree or block diagram.  Only those elements, identified in the DFMEA, with Causes of Failure that show potential for harm to persons or regulatory noncompliance (see question 1, Step 1, above) are in scope.  Analysis can be performed at vehicle level (OEM) or at a subsystem level (supplier).
     System interfaces may also be in scope.  When a control unit is in scope, but a sensor from which it receives a signal, or actuator to which it transmits a signal, is not, the connector that allows the device to communicate with the control unit (i.e. interface) should be included in the analysis.  The signal status (intermittent, lost, incorrect, etc.) can then be considered in analysis of the control unit and the connected device via this interface.  An example of this type of interface analysis is depicted in Exhibit 3, where an input/output connector links two structure trees.
     Contrary to the “assume all inputs are acceptable” philosophy of previous FMEAs, erroneous signals from elements outside the scope of responsibility of the design team should be included in the analysis.  The objective is to ensure safety and compliance, even when system inputs are corrupted.
     The Structure Analysis (Step 2) section of the FMEA-MSR form is a subset of the completed DFMEA.  Information can be transferred from the DFMEA and structure tree previously created or expanded to include diagnostic elements.
 
3rd Step – Function Analysis
     Once again drawing from the structure tree and DFMEA, the Function Analysis (Step 3) section of the FMEA-MSR form can be completed from these sources, as shown in Exhibit 4.  A P-Diagram could also be used to organize and visualize the diagnostic system elements.  Functions of interest include those associated with failure detection and response.  Elements may include hardware, software, and communication signals; interface elements may be included when the connected device is out of scope.
Failure Analysis and Risk Mitigation
4th Step – Failure Analysis
     For purposes of analysis, the diagnostic systems are assumed to function as intended.  That is, FMEA-MSR is conducted to evaluate the sufficiency of designed system responses, not the system’s reliability.  Causes of Failure of diagnostic systems (undetected faults, false alarms, etc.) can be included in the DFMEA, however.
     Supplemental FMEA-MSR requires consideration of three failure scenarios that differ in the occurrence of a hazardous event, detection of a failure, and effectiveness of failure response.  The first scenario involves a fault condition that causes malfunctioning behavior (Failure Mode) resulting in Effects of Failure, but no hazardous event.  No detection or response occurs, though a noncompliant condition may be created.  This scenario is depicted in Exhibit 5.
     The second scenario differs from the first in that a hazardous event occurs, as depicted in Exhibit 6.  Again, no detection or response occurs; this may be due to a slow or inoperable monitoring or intervention mechanism.  The time available for a system response to prevent a hazardous event is the Fault Handling Time Interval – the time elapsed between occurrence of a fault and a resultant hazardous event.  The minimum Fault Handling Time Interval is the Fault Tolerant Time Interval – the time elapsed between occurrence of a fault and Effect of Failure.  In Scenario 2, the monitoring system in place, if any, is insufficient to prevent the hazardous event.
     The third scenario incorporates fault detection and system response to mitigate Effects of Failure and prevent occurrence of a hazardous event.  In this scenario, depicted in Exhibit 7, the user experiences a partial loss or degradation of function (mitigated Effect of Failure) in lieu of the hazardous event.  System response time must be less than the Fault Tolerant Time Interval to ensure consistent performance in service.
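     The timing requirement above can be expressed as a simple comparison.  The sketch below assumes the system response time is the sum of a fault detection time and a fault reaction time; the variable names and millisecond units are illustrative, not prescribed by the Handbook.

def response_fast_enough(fault_detection_ms: float,
                         fault_reaction_ms: float,
                         fault_tolerant_interval_ms: float) -> bool:
    """True if the monitoring and response chain completes before a hazardous event can occur."""
    system_response_ms = fault_detection_ms + fault_reaction_ms
    return system_response_ms < fault_tolerant_interval_ms

# Example: 50 ms to detect the fault plus 120 ms to reach a degraded (safe)
# state, evaluated against a 500 ms Fault Tolerant Time Interval.
print(response_fast_enough(50, 120, 500))  # True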
     The final failure scenario can also be represented by the “hybrid failure chain” depicted in Exhibit 8.  The hybrid failure chain adds a chain parallel to the DFMEA failure chain; it includes monitoring for Causes of Failure, the response activated when a fault is detected, and the substitute, or mitigated, Effects of Failure.  In the Monitoring and System Response (MSR) scenario, analysis is shifted from the DFMEA failure chain to the MSR failure chain (lower portion of the stacked-format FMEA-MSR form).  An example hybrid failure chain, in structure tree format, is shown in Exhibit 9.  This includes a description of intended behavior (Failure Mode) and mitigated Effects.
     An Effect of Failure in FMEA-MSR can be a malfunctioning behavior, as in DFMEA; it can also be an intended behavior or designed response to detection of a Cause of Failure.  If the designed response (“intervention”) is effective, a “safe state” will result, albeit with some loss of function.  In Scenarios 1 and 2, the Failure Mode remains that in the DFMEA.  This is the failure chain recorded in the Failure Analysis (Step 4) section of the FMEA-MSR form, as shown in Exhibit 10.
     To complete the Failure Analysis section, assign a Severity (S) score to the Effect of Failure.  The same Severity criteria table is used in DFMEA and FMEA-MSR; it is reproduced in Exhibit 11.  Refer to Vol. VI for further guidance on assigning Severity scores and use of the criteria table.
5th Step – Risk Analysis
     Risk Analysis is the first step for which the lower (“new”) section of the FMEA-MSR is used.  Additional information is needed in the DFMEA (upper) section, however, creating two branches of analysis.  In DFMEA, identify the Prevention Controls that allow the diagnostic system to reliably detect Causes of Failure and initiate the system response within the Fault Tolerant Time Interval.  Also, identify the Detection Controls in use to verify the diagnostic system’s effectiveness; simulation and other development tests are often cited.
     Thus far, the discussion has focused on conducting DFMEA in a slightly different way than previously presented.  However, the remainder of the DFMEA section is completed as described in “Vol. VI: ‘Aligned’ DFMEA.”  The remainder of this presentation, therefore, can focus on the unique aspects of the Supplemental FMEA for Monitoring and System Response and completing the FMEA-MSR form.
     The Frequency (F) rating is comparable to the Occurrence score in DFMEA and PFMEA.  It is an estimate of the frequency of occurrence of the Cause of Failure in applicable operating conditions during the vehicle’s planned service life.  In the Frequency criteria table, shown in Exhibit 13, select the statement that best describes the likelihood of the Cause of Failure occurring; record the corresponding F rating in column B of the FMEA-MSR form (Exhibit 12).
     Note that the qualifier emphasized above – in applicable operating conditions – may be invoked to adjust the F rating.  The conditions in which this can be done are outlined in the note below the F table in Exhibit 13.  Stated generally, the Frequency rating may be reduced if the conditions in which the Cause of Failure leads to the associated Effect of Failure exist only during a small portion of the time the vehicle is in operation.
     If the F rating is reduced for this reason, the relevant operating conditions are recorded in column A of the form – “Rationale for Frequency.”  Also identified here are any other information sources used to support the Frequency rating assigned.  Examples include Design and Process FMEAs (e.g. Prevention Controls), test results, field data, and warranty claims history.
     In column C, “Current Diagnostic Monitoring,” identify all elements of the diagnostic system that contribute to detection of the Cause of Failure, Failure Mode, or Effect of Failure by the system or the operator of the vehicle.  In the “Current System Response” column (D), describe the system’s reaction to a fault condition.  Include any functions that are partially or fully disabled, fault codes recorded, messages to driver, etc.  If no monitoring is in place, describe the hazardous condition created by the failure.
     To assess the effectiveness of the controls in detecting a fault condition and activating an appropriate and sufficient response to prevent a hazardous event, a Monitoring (M) rating is assigned.  To do this, consult the Monitoring criteria table, shown in Exhibit 14.  In each of the “Diagnostic Monitoring” and “System Response” columns, select the statement that best describes the system’s performance.  Record the higher of the corresponding M ratings in column E of the FMEA-MSR form (Exhibit 12).
     There are three possible fault-monitoring scenarios, but only one requires significant depth of analysis.  The first scenario is one in which the diagnostic system is ineffective or none has been implemented.  In this case, M = 10 and no adjustment is made.
     The second scenario is the opposite of the first – the diagnostic system is reliable, consistently producing a mitigated Effect while preventing a hazardous event.  In this scenario, M = 1 and no adjustment is made.
     The third fault-monitoring scenario is “in between” or a hybrid of the first two.  This scenario involves a diagnostic system that is partially effective; that is, the fault condition may be, but will not always be detected.  The amount of adjustment in the M rating depends on the proportion of occurrences that the system is expected to detect; “diagnostic coverage” estimates are given in the “Diagnostic Monitoring” criteria descriptions (see Exhibit 14).  A demonstration of this scenario’s rating method is shown pictorially in Exhibit 15.
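     The selection rule for the Monitoring rating can be summarized in a short sketch: rate the “Diagnostic Monitoring” and “System Response” columns of Exhibit 14 separately, then record the higher (less favorable) of the two.  The function below is illustrative only; the criteria table itself is not reproduced.

def monitoring_rating(m_diagnostic: int, m_response: int) -> int:
    """Return the M rating to record in column E (the higher of the two ratings)."""
    for m in (m_diagnostic, m_response):
        if not 1 <= m <= 10:
            raise ValueError("M ratings must be between 1 and 10")
    return max(m_diagnostic, m_response)

# Example: reliable detection (2) paired with a partially effective response (4)
# is recorded as M = 4.
print(monitoring_rating(2, 4))  # 4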
     In column F, describe the mitigated Effect of Failure resulting from the diagnostic system’s intervention after detecting the fault condition.  Record the Severity score associated with the mitigated Effect in column G.  In column H, copy the Severity score, from Step 4, corresponding to the original, unmitigated Effect of Failure.
 
     Using the lookup table in Exhibit 16, assign an Action Priority (AP) to the failure chain.  The three possible priorities have similar definitions to those in previously presented FMEAs:
  • H:  High Priority – actions must be taken to lower frequency and/or improve monitoring, or acceptance of current controls must be justified and approved.
  • M:  Medium Priority – actions should be taken to lower frequency and/or improve monitoring, or acceptance of current controls should be justified and approved.
  • L:  Low Priority – actions could be taken to lower frequency and/or improve monitoring, but justification and approval are not required to accept current controls.
     To determine the correct AP, the following guidelines have been established (see FMEA-MSR AP Table end notes):
  • If M = 1, use S after MSR (column G) to determine AP in FMEA-MSR and DFMEA.
  • If M ≠ 1, use original S (column H) to determine AP in FMEA-MSR and DFMEA.
Record the AP assigned in column J of the FMEA-MSR form (Exhibit 12).  See presentations in Vol. VI (DFMEA) and Vol. VII (PFMEA) for guidance on the use of the “Filter Code” column (K).
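     The Severity-selection guideline above lends itself to a one-line rule.  The sketch below is illustrative; it returns the Severity score to carry into the AP lookup but does not reproduce the AP Table of Exhibit 16.

def severity_for_ap(m_rating: int, s_mitigated: int, s_original: int) -> int:
    """Return the Severity score used to determine Action Priority.

    If M = 1 (reliable monitoring and response), use the Severity of the
    mitigated Effect (column G); otherwise, use the original Severity (column H).
    """
    return s_mitigated if m_rating == 1 else s_original

# Example: an S = 9 hazard mitigated to an S = 4 loss of function is
# prioritized on S = 4 only when monitoring is fully reliable (M = 1).
print(severity_for_ap(1, 4, 9))  # 4
print(severity_for_ap(3, 4, 9))  # 9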
 
6th Step – Optimization
     In the Optimization step, improvement activities are identified to reduce risk, assigned to individuals for implementation, monitored for progress, and evaluated for effectiveness.  Relevant information is recorded in the Optimization section of the FMEA-MSR form, shown in Exhibit 17.
     The Handbook asserts that the most effective sequence for Optimization is as follows:
  • Reduce the occurrence of Causes of Failure via design modification [lower Frequency (F), MSR Preventive Action].
  • Improve the ability to detect Causes of Failure or Failure Modes [lower Monitoring (M), Diagnostic Monitoring Action].
     Record the planned Optimization actions in columns L and M, respectively, of the FMEA-MSR form.  In column N, describe the way in which the system will react to a fault condition after the planned action is complete (see column D, Step 5).  In column P, describe the mitigated Effect of Failure produced after the planned action is complete (see column F, Step 5).  Evaluate the Severity of the new mitigated Effect and record its score in column R.
     Column S of the FMEA-MSR form (Exhibit 17) is used to identify the status of each action plan.  The notation used in DFMEA should also be used in FMEA-MSR; see Vol. VI for suggestions.
     The grouped “columns T” are used to record predicted values of F, M, and AP.  The original Severity score (from Step 4) is also repeated here because it may be needed to determine the proper Action Priority to assign.  The AP is determined according to the same guidelines set forth in Risk Analysis (Step 5).
 
Risk Communication
7th Step – Results Documentation
     The recommendations for preparation of an FMEA report following FMEA-MSR are identical to those presented in the discussion of the aligned Design FMEA.  Therefore, the reader is referred to “Vol. VI:  DFMEA” and the AIAG/VDA Handbook for details.
 
 
     A Supplemental FMEA for Monitoring and System Response is a worthwhile extension of the FMEA Handbook and of product development and analysis.  It reflects a recognition of the critical role that driver aids, communication, advanced safety systems, and other current and emerging technologies play in modern vehicle development.  Regulatory compliance, passenger safety, operator convenience, and many other aspects of customer experience rely on these technologies to meet evolving expectations.  Predictable performance and minimized risk are increasingly important as vehicles continue to become more sophisticated and customers more discerning.  As this trend continues, Monitoring and System Response may be the greatest source of competitive advantage currently available to automakers.
 
     For additional guidance or assistance with Operations challenges, feel free to leave a comment, contact JayWink Solutions, or schedule an appointment.
 
     For a directory of “FMEA” volumes on “The Third Degree,” see Vol. I:  Introduction to Failure Modes and Effects Analysis.
 
References
[Link] “FMEA Handbook.”  Automotive Industry Action Group and VDA QMC, 2019.


Jody W. Phelps, MSc, PMP®, MBA
Principal Consultant
JayWink Solutions, LLC
jody@jaywink.com
FMEA – Vol. VII:  “Aligned” Process Failure Modes and Effects Analysis
     To conduct a Process FMEA according to AIAG/VDA alignment, the seven-step approach presented in Vol. VI (Aligned DFMEA) is used.  The seven steps are repeated with a new focus of inquiry.  Like the DFMEA, several system-, subsystem-, and component-level analyses may be required to fully understand a process.
     Paralleling previous entries in the “FMEA” series, this installment presents the 7-step aligned approach applied to process analysis and the “Standard PFMEA Form Sheet.”  Review of classical FMEA and aligned DFMEA is recommended prior to pursuing aligned PFMEA; familiarity with the seven steps, terminology used, and documentation formats will make aligned PFMEA more comprehensible.
     Like the DFMEA form presented in Vol. VI, the “Standard PFMEA Form Sheet” (“Form C” in Appendix A of the AIAG/VDA FMEA Handbook) correlates information between steps using numbered and color-coded column headings.  The aligned PFMEA form is reproduced, in stacked format, in Exhibit 1.  In use, information related to a single issue should be recorded in one row.
     As has been done in previous “FMEA” series entries, portions of the PFMEA form will be shown in close-up, with reference bubbles correlating to discussion in the text.  Not every entry in the form is discussed in detail; some are straightforward and self-explanatory.  Consult previous installments of the “FMEA” series to close any perceived gaps in the information presented.  If further guidance is needed, several options for contacting the author are available at the end of this article.
 
Conducting an ‘Aligned’ Process FMEA
     Failure Modes and Effects Analysis is conducted in three “stages” – System Analysis, Failure Analysis and Risk Mitigation, and Risk Communication.  These three stages comprise the seven-step process mentioned previously.  A graphical representation of the relationships between the three stages and seven steps is shown in Exhibit 2.  For each step, brief reminders are provided of key information and activities required or suggested.  Readers should reference this summary diagram as each step is discussed in this presentation and while conducting an FMEA.
System Analysis
1st Step – Planning & Preparation
     Classical FMEA preparations have often been viewed as separate from analysis.  Recognizing the criticality of effective planning and preparation, the aligned process from AIAG and VDA formally incorporates them in the 7-step approach to FMEA.  This is merely a change in “status,” if you will, for preparatory activities.  Thus, the discussion in “Vol. II:  Preparing for Analysis” remains valid, though introduction of the tools is dispersed among steps in the aligned approach.
     Specifically, the realm of appropriate context for FMEA remains the three use cases defined, briefly described as (1) new design, (2) modified design, or (3) new application.  Within the applicable context, the FMEA team must understand the level of analysis required – system, subsystem, or component.
     The core and extended analysis team members are chosen in the same manner as before.  Defining customers, from subsequent workstations to end user, remains integral to thorough analysis. Doing so in advance facilitates efficient analysis, with all foreseeable Effects of Failure identified.

     To state it generally, the inputs needed to conduct an effective FMEA have not changed.  However, the aligned process provides a framework for organizing the accumulated information into a coherent project plan.  Using the Five Ts structure introduced in “FMEA – Vol. V:  Alignment,” project information is presented in a consistent manner.  The scope of analysis, schedule, team members, documentation requirements, and more are recorded in a standardized format that facilitates FMEA development, reporting, and maintenance.
     Key project information is recorded on the PFMEA form in the header section labeled “Planning & Preparation (Step 1),” shown in Exhibit 3.  The labels are largely self-explanatory, though some minor differences exist between the aligned and classical forms.  One such difference is the definition of the “Subject” (1) of analysis.  On this line, identify the process analyzed by name, location (e.g. line number, department, facility), and any commonly used “nickname” needed to differentiate it from similar processes.
     Key Date on the classical form has been replaced by “Start Date” (2) on the aligned form; both refer to the design freeze date.  “Confidentiality Level” (3) has been added to the form.  Two levels are suggested in the Handbook – “Proprietary” and “Confidential” – but no further guidance on their application is provided.  Discussions should take place within an organization and with customers to ensure mutually agreeable use of these designations and information security.
     Refer to the “FMEA Form Header” section of “Vol. IV:  ‘Classical’ Process Failure Modes and Effects Analysis” for additional guidance on completing this section of the form.
 
Continuous Improvement
     The first two columns in the body of the PFMEA form, shown in Exhibit 4, are not included in the discussion of the seven steps.  The first column, “Issue #” (A), is used to simply number entries for easy reference.  The purpose of column B, “History/Change Authorization,” is left to users’ interpretation.  Presumably, it is used to record management approval of process changes or acceptance of existing risk, though the Handbook offers no guidance on this topic.  In whatever fashion an organization chooses to use this portion of the form, it should be documented in training to ensure consistency and minimize confusion.
2nd Step – Structure Analysis
     Structure Analysis may be less intuitive in a process context than it is for a physical object.  The structure hierarchy, however, remains the same:  Effects of Failure, Failure Modes, Causes of Failure.
     The Structure Analysis section of the PFMEA form is shown in Exhibit 5.  In the first column (C), identify the Process Item, “the highest level of integration within the scope of analysis.”  That is the result achieved by successful completion of all Process Steps, where Effects of Failure would be noticed.  The Process Step (D) is the focus of the analysis – a process operation or workstation where a Failure Mode occurs.
     Each Process Step may be associated with several Process Work Elements (E), where the Causes of Failure lie.  The Handbook highlights four main types (“4M Categories”) of Process Work Element:  Machine, Man, Material, and EnvironMent.  Each category is analyzed separately for each Process Step.
     To organize information needed for Structure Analysis, creating a process flow diagram (PFD) is a great place to start (see Commercial Cartography – Vol. II:  Flow Charts).  A single PFD may contain the scope of several PFMEAs, once the complexity of the system and subsystems is considered.  A complete PFD supports rational decisions (see Making Decisions – Vol. I:  Introduction and Terminology) to define a manageable scope for each analysis.
     The flow of information from process flow diagram to structure tree to PFMEA is shown in Exhibit 6.  Enter information in the PFMEA form proceeding across (left to right), then down, to ensure that each Process Work Element has been thoroughly considered before proceeding to the next Process Step.
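     The hierarchy described above, and the “across, then down” fill order, can be visualized with a small data-structure sketch.  The process content below is hypothetical and serves only to illustrate the Process Item / Process Step / Process Work Element relationships.

process_item = {
    "name": "Electric motor assembly",            # column C: Process Item
    "steps": [                                     # column D: Process Steps
        {
            "name": "OP 30: Press bearing onto shaft",
            "work_elements": {                     # column E: Process Work Elements (4M)
                "Machine": "Hydraulic press",
                "Man": "Press operator",
                "Material": "Bearing; motor shaft",
                "EnvironMent": "Ambient conditions at press station",
            },
        },
        # ... additional Process Steps ...
    ],
}

# Fill the form across (each Work Element of a Step), then down (next Step).
for step in process_item["steps"]:
    for category, element in step["work_elements"].items():
        print(process_item["name"], "|", step["name"], "|", f"{category}: {element}")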
3rd Step – Function Analysis
     Collection and organization of information to support Function Analysis can be done with a parameter diagram.  As shown in Exhibit 7, the basic layout of the process parameter diagram (P-Diagram) is the same as that of the product P-Diagram, with changes made to accommodate differing informational needs and terminology.
     The Function Analysis section of the PFMEA form, shown in Exhibit 8, is numbered and color-coded to maintain links to information in Step 2.  In the first column (F), describe the function of the Process Item at the level of integration (i.e. system, subsystem) chosen for the analysis.  Multiple functions or multiple customers may be served by a Process Item; be sure that all are included.
     In column G, describe the function of the Process Step and the Product Characteristic it is intended to spawn.  Quantitative specifications can be included, but there is a caveat.  Values defined elsewhere, such as a product drawing or process instruction, are subject to revision in the source document.  This could lead to obsolete information references in the PFMEA or frequent revisions to prevent it.
     Column H contains descriptions of Process Work Element functions and Process Characteristics they are intended to achieve.  These will correspond to the 4M Categories or other types of Process Work Elements defined in Step 2.
     Expanding the structure tree used in Step 2 to include descriptions of the function of each element allows direct transfer of information to the PFMEA form.  This information transfer is demonstrated pictorially in Exhibit 9.
     The Handbook establishes guidelines for function descriptions.  To ensure consistency and clarity, function descriptions should conform to the following:
  • a verb followed by a noun (“Do this to This.”)
  • use the verb’s base form
  • use present tense
  • use positive words
  • answers the question “What does it do?” when working from right to left
  • answers the question “How does it do it?” when working from left to right
     Product Characteristics are the requirements or specifications of the product design.  They may originate in legal or regulatory requirements, customer specifications, industry standards, or corporate governance (internal standards).  Product Characteristics are evaluated after the product is complete.
     Process Characteristics can be evaluated during process execution.  These are the controls that ensure that a process results in the Product Characteristics required.  They may originate from process specifications, work instructions, or other process verification documentation.
     Effort expended in Function Analysis is rewarded in the next step.  Clear, concise function descriptions expedite Failure Analysis by providing phraseology that can be duplicated while maintaining accuracy of meaning.  Diligence in this step results in a more efficient analysis overall.
 
Failure Analysis and Risk Mitigation
4th Step – Failure Analysis
     Failure Analysis is the heart of an FMEA, where a system’s failure network is established.  The links between Failure Modes, Effects of Failure, and Causes of Failure, in multiple levels of analysis, are made clear in this step.
     The definitions of Failure Mode, Effect of Failure, and Cause of Failure, in practice, remain the same as in classical FMEA.  Briefly, these are:
  • Failure Mode:  the manner in which the process could cause a Product Characteristic or intended function to be unrealized, described in technical terms.
  • Effect of Failure:  the result of a failure, as perceived by an internal or external customer.
  • Cause of Failure:  the reason a Failure Mode occurs, described in technical terms.
The relationship between a Failure Mode, its cause, and the resultant effect creates a failure chain, as shown in Exhibit 10.  Each link in the failure chain is the consequence of the existence of the one to its right in the diagram.  That is, the Effect of Failure is the consequence, or result, of the occurrence of the Failure Mode, which is, in turn, the consequence of the existence of the Cause of Failure.
     Each element is not limited to a single failure chain.  The occurrence of a Failure Mode may be perceived in several ways (Effects of Failure) or have multiple potential causes.  A Cause of Failure may also result in several potential Failure Modes, and so on.  Proper identification of each is critical to effective analysis.
     Each failure chain is documented in a single row of the PFMEA form, in the Failure Analysis section, shown in Exhibit 11.  The form maintains the relationships within the failure chain, as shown in Exhibit 10 and discussed above.  It also maintains the numbering and color-coding convention that links Failure Analysis to Function Analysis (Step 3) and Structure Analysis (Step 2).
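     A failure chain can also be represented as a simple record, with one instance per row of the form.  The sketch below uses hypothetical content and field names; it is not a prescribed format.

from dataclasses import dataclass

@dataclass
class FailureChain:
    cause: str         # why the Failure Mode occurs (technical terms)
    failure_mode: str  # how the process fails (technical terms)
    effect: str        # what the customer perceives
    severity: int = 0  # S score assigned to the Effect later in Step 4

chain = FailureChain(
    cause="Torque controller set below specification",
    failure_mode="Fastener under-torqued",
    effect="Component loosens in service (end user)",
)
print(f"{chain.cause} -> {chain.failure_mode} -> {chain.effect}")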
 
     The Failure Mode is the Focus Element of the process failure chain and, typically, the focus of process owners.  Deviations are discovered in process-monitoring data and inspection reports before customers are affected by errors.  This is, at least, the goal – to contain any failures that could not be prevented.
     Process Failure Modes vary widely, as there are many types of processes that may be the subject of analysis.  Machined components may have a feature of incorrect size or location.  Printed circuit boards could have weak solder joints.  Parts could be assembled in the wrong orientation, or missing altogether.  Threaded fasteners could be tightened to an incorrect torque.  The possibilities are seemingly endless.
     Fortunately, only a finite subset of the endless potential Failure Modes are applicable to a given process.  When one is identified, its opposite, or “mirror-image” condition (e.g. high/low, long/short, etc.), if one exists, should also be considered.  Record all Failure Modes of the Process Step in column L of Exhibit 11.  These are the negatives of the Functions of the Process Step and Process Characteristics defined in Step 3.
     Aspects of a design that increase the frequency or likelihood of process failure should be discussed with the Design Responsible individual.  For example, a part feature may be difficult to machine correctly, or an assembly may lack sufficient access space to install a required component consistently and without damage.  Collaboration between Design and Process Responsible teams, such as Design for Manufacturability and Assembly (DFMA) studies, should take place “early and often” during product development to minimize these issues and their associated cost and delays.
 
     Effects of Failure must be considered for all customers, including:
  • Subsequent internal processes
  • Corporate management
  • OEM and supplier tiers
  • Regulatory agencies
  • End users (i.e. vehicle operators)
  • NGOs (Nongovernmental organizations), such as environmental or other advocacy groups.
An Effect of Failure can be expressed as the negative, or inverse, of the Function of the Process Item defined in Step 3 (i.e. “Doesn’t Do this to This.”), but need not be.  Examples include:
  • unable to assemble [e.g. missing part feature] (subsequent internal process)
  • vehicle will not start [e.g. battery discharged] (end user)
  • uncontrolled emissions [e.g. faulty filter] (regulatory agencies, NGOs)
  • reduced throughput/productivity [e.g. insufficient space to access components] (corporate management)
  • unable to connect [e.g. interface or connector mismatch] (OEM or next-tier supplier)
     If a process failure can create the same Effect as identified in the product’s DFMEA, the same wording and Severity score should be used in the PFMEA.  The diagram in Exhibit 12 demonstrates how this link between a product’s DFMEA and the analysis of a process required to produce it can manifest.
     If downstream impacts are not known, due to strict customer confidentiality, multiple-tier separation from users, or any other reason, Effects of Failure should be defined in terms of what is known.  That is, part drawings, process specifications, or other provided information should be used to describe the Effects.
     Record descriptions of Effects of Failure in column J of the PFMEA form (Exhibit 11).  Include all known Effects, identified by customer (OEM, Tier 1, end user, etc.), not only the “worst.”
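     The DFMEA linkage illustrated in Exhibit 12 can be handled programmatically when Effects are catalogued.  The sketch below assumes a simple dictionary of DFMEA Effects and Severity scores; the content and values shown are hypothetical.

dfmea_effects = {
    "vehicle will not start": 8,    # Severity assigned in the product DFMEA (hypothetical)
    "uncontrolled emissions": 10,
}

def pfmea_severity(effect: str, locally_assigned: int) -> int:
    """Reuse the DFMEA wording and Severity when the Effect matches; otherwise score it in the PFMEA."""
    return dfmea_effects.get(effect, locally_assigned)

print(pfmea_severity("vehicle will not start", locally_assigned=6))  # 8 (carried over from DFMEA)
print(pfmea_severity("unable to assemble", locally_assigned=6))      # 6 (scored within the PFMEA)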
 
     Causes of Failure are defined in terms of the 4M Categories or other Process Work Element types first identified in Structure Analysis (Step 2).  Deviations in the physical environment, machine settings, material usage, operator technique, or other aspects of the system could cause a process failure.  Identify those aspects that reside in the subject failure chain in column M (Exhibit 11).
 
     As is the case for the previous two steps, information can be transferred directly from the structure tree, if it was created in advance, to the PFMEA form.  The Failure Analysis information transfer is shown in Exhibit 13.
     To complete the Failure Analysis step, each Effect of Failure must be evaluated and assigned a Severity (S) score.  To do this, consult the Severity criteria table, shown in Exhibit 14.  Select the column corresponding to the customer affected, then the criteria in that column that best describes the impact to that customer.  The left-most column in the row containing this description contains the Severity score for the Effect of Failure; enter the number in the PFMEA form (column K of Exhibit 11).
     If there is uncertainty or disagreement about the appropriate Severity score to assign an Effect of Failure (e.g. “Is it a 3 or a 4?”), select the highest score being considered to ensure that the issue receives sufficient attention as development proceeds.  Confidence in evaluations and scores typically increases as a design matures; concordance often follows.
     The last column of the “PFMEA Severity Criteria Table” (Exhibit 14) is left blank in the Handbook because it is not universally applicable.  An organization can record its own examples of Effects of Failure to be used as comparative references when conducting a new FMEA or to aid in training.  Examples cited for various Severity scores can provide great insight into an organization’s understanding, or lack thereof, of its customers’ perspectives.
 
5th Step – Risk Analysis
     The process development team conducts Risk Analysis to identify the controls used to prevent or detect failures, evaluate their effectiveness, and prioritize improvement activities.  Relevant information is recorded in the Risk Analysis section of the PFMEA form, shown in Exhibit 15.
     In column N, record the Process Prevention Controls incorporated to preclude activation of the failure chain.  There are three types of controls that can be implemented (poka yoke, engineering controls, and management controls); these are explained in The War on Error – Vol. II:  Poka Yoke (What Is and Is Not).  Cite only those expected to prevent the specific Cause of Failure with which they are correlated (i.e. in the same row on the form).
     Though not an exhaustive list of Process Prevention Controls, some examples follow:
  • processing guidelines published by an equipment manufacturer, material supplier, industry group, or other recognized authority
  • re-use of previously validated process design or equipment with history of successful service
  • error-proofing (see “The War on Error – Vol. II”)
  • work instructions
  • equipment maintenance
     The effectiveness of Process Prevention Controls is reflected in the Occurrence (O) score.  To assign an Occurrence score to each Cause of Failure, consult the Occurrence criteria table shown in Exhibit 16.  Select the Occurrence score that corresponds to the criteria that best describes the maturity of the prevention controls.  Consider the hierarchy of controls (see “The War on Error – Vol. II”) when assigning Occurrence scores; O = 1 should be reserved for frequently verified poka yoke devices.  Engineering controls typically receive mid-range scores (e.g. O = 2 – 5), while management controls have high scores (e.g. O = 6 – 9).
     If sufficient data is available to make reasonable predictions, a quantitative method of scoring can be used.  For this, substitute one of the alternate PFMEA Occurrence criteria tables, shown in Exhibit 17, for the table of Exhibit 16.  Occurrence criteria based on production volume are shown in Exhibit 17A, while time-based criteria are shown in Exhibit 17B.  The detailed criteria descriptions are unchanged; the qualitative summary terms (high, low, etc.) are simply replaced by “incidents per 1000” estimates or time intervals.
     If the frequency of occurrence of a Cause of Failure “falls between” two Occurrence scores, or there is disagreement about the correct frequency, select the higher Occurrence score.  Additional consideration in subsequent reviews is likely to resolve the matter more efficiently than extensive debate in early stages.
     The standard and alternate PFMEA Occurrence tables have a column left blank for organization-specific examples to be recorded.  These can be used as comparative references to facilitate future PFMEA development or as training aids.
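     The guidance above can be summarized in a rough sketch.  The ranges reflect the text (poka yoke near O = 1, engineering controls roughly 2 – 5, management controls roughly 6 – 9); selecting a score within a range still requires the criteria table in Exhibit 16, so this is a starting point, not a replacement for the table.

TYPICAL_OCCURRENCE_RANGE = {
    "poka_yoke": (1, 1),            # frequently verified error-proofing devices
    "engineering_control": (2, 5),
    "management_control": (6, 9),
}

def resolve_occurrence(candidate_scores: list[int]) -> int:
    """When the team is torn between scores, record the higher (more conservative) one."""
    return max(candidate_scores)

print(TYPICAL_OCCURRENCE_RANGE["engineering_control"])  # (2, 5)
print(resolve_occurrence([3, 4]))                       # 4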
 
     In column P of the Risk Analysis section of the PFMEA form, identify the Process Detection Controls in place to warn of the existence of the Failure Mode or Cause of Failure before the part reaches a customer.  Examples include visual inspection and end-of-line testing.
     Assess the effectiveness of current detection controls according to the Detection criteria table, shown in Exhibit 18.  Select the Detection (D) score that corresponds to the Detection Method Maturity and Opportunity for Detection descriptions that most accurately reflect the state of the controls and enter it in the PFMEA form.
     If there is a discrepancy between the Detection Method Maturity and Opportunity for Detection, for example, select the higher Detection score.  As a process matures, its controls evolve.  Review in subsequent development cycles is more efficient than dwelling on the matter in early stages.
     Like the S and O tables, the D criteria table has a column left blank for organization-specific examples.  Future PFMEA development and training may benefit from the experience captured here.

     Determining the priority of improvement activities in aligned FMEA is done very differently than the classical method.  For this purpose, the AIAG/VDA Handbook introduces Action Priority (AP), where a lookup table replaces the RPN calculation of classical FMEA.  The same AP Table is used for DFMEA and PFMEA; it is reproduced in Exhibit 19.
     The AP Table is used to assign a priority to improvement activities for a failure chain.  To do this, first locate the row containing the Severity score in the S column.  Then, in the O column, find the sub-row containing the Occurrence score, then the sub-row containing the Detection score in the D column.  In this sub-row, in the Action Priority (AP) column, one of three priorities is assigned:
  • H:  High Priority – actions must be taken to reduce risk, or acceptance of the current process must be justified and approved.
  • M:  Medium Priority – risk should be reduced via improvement activities, or acceptance of the current process should be justified and approved.
  • L:  Low Priority – improvement actions can be identified, but justification and approval are not required to accept the current process.
Record the Action Priority assigned on the PFMEA form (Exhibit 15).  Development of improvements and assignment of responsibility, based on the priorities assigned, takes place in the next step.
     The final column of the AP Table, “Comments,” is left blank.  Organization-specific protocols, historical projects, acceptance authority, or other information can be cited here to assist PFMEA teams in completing analyses.
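     Teams that maintain FMEAs in software sometimes encode the AP Table as a lookup keyed on (S, O, D).  The sketch below shows one way to do so; the two table entries are placeholders for illustration only and must be replaced with the actual assignments from Exhibit 19.

def lookup_ap(table: dict, s: int, o: int, d: int) -> str:
    """Return 'H', 'M', or 'L' for a (Severity, Occurrence, Detection) combination."""
    for (s_range, o_range, d_range), ap in table.items():
        if s in s_range and o in o_range and d in d_range:
            return ap
    raise KeyError("Combination not found in the AP table")

# Two placeholder entries (NOT the Handbook's values), keyed by score ranges:
AP_TABLE = {
    (range(9, 11), range(6, 11), range(1, 11)): "H",
    (range(2, 4),  range(1, 2),  range(1, 11)): "L",
}

print(lookup_ap(AP_TABLE, s=9, o=7, d=3))  # 'H'
print(lookup_ap(AP_TABLE, s=2, o=1, d=5))  # 'L'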
 
     Column R (Exhibit 15) is reserved for Special Characteristics, though the text of the Handbook makes no reference to this portion of the Standard PFMEA form.  Presumably, it is to be used in the same manner as in classical FMEA, though the aligned Standard DFMEA form does not include it.  See previous entries in the “FMEA” series (Vol. III, Vol. IV) for a brief discussion of Special Characteristics.
     Column S in Exhibit 15 is an optional entry.  “Filter Code” could be used in various ways.  Examples include:
  • As a substitute for the “Classification” column when transitioning an existing design from classical to aligned format.  This may be done to remain consistent with the aligned DFMEA, if this practice was followed in that analysis.
  • To assign categories to aid in resource management.  These could correspond to engineering specialties, such as manufacturing, industrial, controls, and so on.  Subspecialties, such as PLC, HMI, etc., could also be coded, if desired.
  • Any creative use that assists the PFMEA owner in managing and completing the analysis or allows future analysis teams to easily search for similar issues in previous process developments.
 
6th Step – Optimization
     In the Optimization step, improvement activities are identified to reduce process risk, assigned to individuals for implementation, monitored for progress, and evaluated for effectiveness.  Relevant information is recorded in the Optimization section of the PFMEA form, shown in Exhibit 20.
     Preventive Actions (column T) are preferred to Detection Actions (column U); it is far better to eliminate an issue than to develop a better reaction to it.  The Handbook asserts that the most effective sequence of implementation is as follows:
  • Eliminate or mitigate Effects of Failure via process modification [lower Severity (S), Preventive Action]
  • Reduce the Occurrence (O) of Causes of Failure via process modification [Preventive Action]
  • Improve the ability to detect Causes of Failure or Failure Modes [lower Detection (D), Detection Action]
Prioritizing improvements in Severity over Occurrence and Detection is consistent with the classical FMEA approach.
     The Handbook suggests five possible statuses, to be recorded in column V, for the actions defined:
  • Open:  “No action defined.”
  • Decision pending:  action defined, awaiting approval.
  • Implementation pending:  action approved, awaiting implementation.
  • Completed:  implementation complete, effectiveness documented.
  • Not implemented:  action declined, current process accepted.
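     For teams tracking Optimization actions outside the form, the five statuses above map naturally to an enumeration.  The sketch below is merely one way to encode them; it is not part of the Handbook.

from enum import Enum

class ActionStatus(Enum):
    OPEN = "No action defined"
    DECISION_PENDING = "Action defined, awaiting approval"
    IMPLEMENTATION_PENDING = "Action approved, awaiting implementation"
    COMPLETED = "Implementation complete, effectiveness documented"
    NOT_IMPLEMENTED = "Action declined, current process accepted"

print(ActionStatus.DECISION_PENDING.value)  # "Action defined, awaiting approval"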
Column W (Exhibit 20) is reserved for anticipated changes in Special Characteristics resulting from implemented actions.  Refer to the Risk Analysis (Step 5) section for more information on the use of this portion of the form.
     The remaining columns in the Optimization section of the PFMEA form are, essentially, carryovers from classical FMEA, with predicted AP replacing predicted RPN.  Refer to Vol. IV for further discussion of these entries.
 
Risk Communication
7th Step – Results Documentation
     Much like Planning and Preparation, Results Documentation consists of tasks that have been performed for, but often considered separate from, classical FMEA.  The aligned approach formalizes Risk Communication as an essential component of analysis by incorporating it in the Seven-Step Approach.
     A significant portion of the content of an FMEA Report, as outlined in the Handbook, is contained in the PFMEA form.  For example, the scope of analysis, high-risk failures, action prioritization, status of actions, and planned implementation dates can be culled from the form.
     A comparison of the project plan, created in Step 1, with the execution and final status of the analysis may also be a useful component of the report.  Future analysis teams may be able to apply lessons learned from deviations revealed in this assessment.
     AIAG/VDA also suggest that a commitment to review and revise the PFMEA be included in the FMEA Report.  The discussion of classical PFMEA treated this as a separate endeavor, providing further evidence of the expanse of AIAG/VDA’s efforts to create a thorough and consistent analysis process.  For further discussion on the topic, see “Review and Maintenance of the PFMEA” in FMEA – Vol. IV:  “Classical” Process Failure Modes and Effects Analysis.
 
     A Process FMEA is a valuable development and communication tool.  It ensures that the impacts of a process on customers and other stakeholders are given proper consideration.  It ensures that legal and regulatory requirements are met.  It also creates a record of development activity that can be used to refine a process, develop a new process, or train others to develop successful processes.  Practitioners are encouraged to extract maximum value from FMEA and, in the effort, possibly discover a competitive advantage for their organization.
 
     For additional guidance or assistance with Operations challenges, feel free to leave a comment, contact JayWink Solutions, or schedule an appointment.
 
     For a directory of “FMEA” volumes on “The Third Degree,” see Vol. I:  Introduction to Failure Modes and Effects Analysis.
 
References
[Link] “Potential Failure Mode and Effects Analysis,” 4th ed.  Automotive Industry Action Group, 2008.
[Link] “FMEA Handbook.”  Automotive Industry Action Group and VDA QMC, 2019.

 
Jody W. Phelps, MSc, PMP®, MBA
Principal Consultant
JayWink Solutions, LLC
jody@jaywink.com
FMEA – Vol. VI:  “Aligned” Design Failure Modes and Effects Analysis
     To differentiate it from “classical” FMEA, the result of the collaboration between AIAG (Automotive Industry Action Group) and VDA (Verband der Automobilindustrie) is called the “aligned” Failure Modes and Effects Analysis process.  Using a seven-step approach, the aligned analysis incorporates significant work content that has typically been left on the periphery of FMEA training, though it is essential to effective analysis.
     In this installment of the “FMEA” series, development of a Design FMEA is presented following the seven-step aligned process.  Use of an aligned documentation format, the “Standard DFMEA Form Sheet,” is also demonstrated.  In similar fashion to the classical DFMEA presentation of Vol. III, the content of each column of the form will be discussed in succession.  Review of classical FMEA is recommended prior to attempting the aligned process to ensure a baseline understanding of FMEA terminology.  Also, comparisons made between classical and aligned approaches will be more meaningful and, therefore, more helpful.
     The format of the aligned FMEA form is significantly different from that in classical FMEA.  It guides the user along the seven-step path, with the information required in each step organized in multiple columns.  Color-coding is used to correlate related information in each step.  The aligned “Standard DFMEA Form Sheet” (“Form A” in Appendix A of the AIAG/VDA FMEA Handbook) is reproduced in Exhibit 1.  For ease of presentation, the form is shown in a stacked format.  In use, however, corresponding information should be recorded in a single row.
     As has been done in previous installments of the “FMEA” series, portions of the form will be shown in close-up, with reference bubbles correlating to discussion in the text.  The goal is to facilitate learning the aligned process by maintaining the links between the 7-step approach, form sections, and individual columns where information is recorded.
 
Conducting an ‘Aligned’ Design FMEA
     Failure Modes and Effects Analysis is conducted in three “stages” – System Analysis, Failure Analysis and Risk Mitigation, and Risk Communication.  These three stages comprise the seven-step process mentioned previously.  A graphical representation of the relationships between the three stages and seven steps is shown in Exhibit 2.  For each step, brief reminders are provided of key information and activities required or suggested.  Readers should reference this summary diagram as each step is discussed in this presentation and while conducting an FMEA.
System Analysis
1st Step – Planning & Preparation
     Classical FMEA preparations have often been viewed as separate from analysis.  Recognizing the criticality of effective planning and preparation, the aligned process from AIAG and VDA formally incorporates them in the 7-step approach to FMEA.  This is merely a change in “status,” if you will, for preparatory activities.  Thus, the discussion in “Vol. II:  Preparing for Analysis” remains valid, though introduction of the tools is dispersed among steps in the aligned approach.
     Specifically, the realm of appropriate context for FMEA remains the three use cases defined, briefly described as (1) new design, (2) modified design, or (3) new application.  Within the applicable context, the FMEA team must understand the level of analysis required – system, subsystem, or component.
     The core and extended analysis team members are chosen in the same manner as before.  Whereas classical FMEA defines four customers of concern, the aligned process focuses on two:  (1) assembly and manufacturing plants and (2) end users.  Defining customers in advance facilitates efficient, thorough analysis, with all foreseeable Effects of Failure identified.
     To state it generally, the inputs needed to conduct an effective FMEA have not changed.  However, the aligned process provides a framework for organizing the accumulated information into a coherent project plan.  Using the Five Ts structure introduced in “FMEA – Vol. V:  Alignment,” project information is presented in a consistent manner.  The scope of analysis, schedule, team members, documentation requirements, and more are recorded in a standardized format that facilitates FMEA development, reporting, and maintenance.
     Key project information is recorded on the DFMEA form in the header section labeled “Planning & Preparation (Step 1),” shown in Exhibit 3.  The labels are largely self-explanatory, though some minor differences exist between the aligned and classical forms.  One such difference is the definition of the “Subject” (1) of analysis.  On this line, identify the design analyzed by name, part number, and “nickname,” if commonly referred to by another identifier.  For example, an aircraft flight data recorder is commonly known as a “black box;” some automobiles have similar devices that may use the same moniker.
     Key Date on the classical form has been replaced by “Start Date” (2) on the aligned form; both refer to the design freeze date.  “Confidentiality Level” (3) has been added to the form.  Three levels are suggested in the Handbook – “Business Use,” “Proprietary,” and “Confidential” – but no further guidance on their application is provided.  Discussions should take place within an organization and with customers to ensure mutually agreeable use of these designations and information security.
     Refer to the “FMEA Form Header” section of “Vol. III:  ‘Classical’ Design Failure Modes and Effects Analysis” for additional guidance on completing this section of the form.
 
Continuous Improvement
     The first two columns in the body of the DFMEA form, shown in Exhibit 4, are not included in the discussion of the seven steps.  The first column, “Issue #” (A), is used to simply number entries for easy reference.  The purpose of column B, “History/Change Authorization,” is left to users’ interpretation.  Presumably, it is used to record management approval of design changes or acceptance of existing risk, though the Handbook offers no guidance on this topic.  In whatever fashion an organization chooses to use this portion of the form, it should be documented in training to ensure consistency and minimize confusion.
 
2nd Step – Structure Analysis
     In the Structure Analysis step, the design described in Step 1 is defined further by decomposing it into systems, subsystems, and components.  To visualize the scope of analysis, block diagrams and structure trees are used.  Only those elements over which the analysis team or FMEA owner (“Design Responsible”) exerts design control are within the scope of analysis, as identified by the boundaries of the diagrams.
     Interfaces may be in or out of scope, depending on the system in question.  Design responsibility must be clearly defined to prevent oversight of important performance characteristics and controls.
     Five “primary” interface types are delineated in the Handbook:  “physical connection,” “material exchange,” “energy transfer,” “data exchange,” and “human-machine.”  A sixth type, “physical clearance,” is mentioned separately, though it is of equal importance to the five primary types.
     In addition to the type, interface analysis must also define the strength and nature (e.g. positive/negative, advantageous/detrimental, etc.) of each interface, whether internal or external to the system.  Proper analysis and development of adequate controls requires a full complement of information.
     Structure Analysis is recorded on the DFMEA form in the section displayed in Exhibit 5.  In column C, the “highest level of integration” is identified as the “Next Higher Level.”  This is the highest level system within the scope of analysis, where the Effects of Failure will be noticed.  The “Focus Element,” named in column D, is the item in the failure chain for which Failure Modes, the technical descriptions of failures, are identified.  In column E, name the “Next Lower Level” of the structure hierarchy, where Causes of Failure will be found.
     Creation of a structure tree as a component of preparation for Structure Analysis is particularly helpful in populating the DFMEA form.  As can be seen in the example in Exhibit 6, the organization of information lends itself to direct transfer to the relevant section of the standard form.  Use of this tool can accelerate analysis and assure a comprehensive assessment of components.
     Enter information in the DFMEA form proceeding across (left to right), then down, to accommodate multiple Failure Modes of a single Focus Element.
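     To illustrate the transfer described above, the brief Python sketch below flattens a hypothetical structure tree into (Next Higher Level, Focus Element, Next Lower Level) rows, mirroring columns C, D, and E of the form.  The element names are placeholders, not entries from the Handbook.

# Hypothetical structure tree: Next Higher Level -> Focus Elements -> Next Lower Level items.
structure_tree = {
    "Subsystem A": {
        "Component A1": ["Characteristic A1.1", "Characteristic A1.2"],
        "Component A2": ["Characteristic A2.1"],
    },
}

def structure_rows(tree):
    # Yield one (Next Higher Level, Focus Element, Next Lower Level) tuple per form row,
    # moving across the columns before moving down to the next row.
    for higher, elements in tree.items():
        for focus, lower_items in elements.items():
            for lower in lower_items:
                yield (higher, focus, lower)

for row in structure_rows(structure_tree):
    print(row)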
 
3rd Step – Function Analysis
     Function Analysis is the next level in the FMEA information hierarchy; it closely parallels Structure Analysis.  As seen in Exhibit 7, the Function Analysis columns in the DFMEA form are numbered, labeled, and color-coded to create visual links to those in the Structure Analysis section.  In this section, the functions and requirements of the Focus Element and adjacent levels are recorded in the same pattern used in Step 2.  A Focus Element may serve multiple functions and have multiple requirements; each should be considered separately, recorded in its own row on the form to facilitate thorough analysis.
     Like the structure tree, information can be transferred directly from the function tree to the DFMEA form.  The function tree is constructed in the same manner as the structure tree in Step 2, replacing visual representations of components with descriptions of their contributions to the operation of the system.  An example function tree and information transfer to DFMEA are shown in Exhibit 8.
     Developing a parameter diagram can help organize analysis information and visualize influences on the system in operation.  Inputs and outputs, noise and control factors, functions, and requirements can be concisely presented in this format.  An example parameter diagram (P-Diagram) is shown in “Vol. II:  Preparing for Analysis;” another, from the AIAG/VDA FMEA Handbook, is shown in Exhibit 9.
Failure Analysis and Risk Mitigation
4th Step – Failure Analysis
     Failure Analysis is the heart of an FMEA, where a system’s failure network is established.  The links between Failure Modes, Effects of Failure, and Causes of Failure, in multiple levels of analysis, are made clear in this step.
     The definitions of Failure Mode, Effect of Failure, and Cause of Failure, in practice, remain the same as in classical FMEA.  Briefly, these are:
  • Failure Mode:  the manner in which the Focus Element could fail to meet a requirement or perform a function, described in technical terms.
  • Effect of Failure:  the result of a failure, as perceived by a customer.
  • Cause of Failure:  the reason the Failure Mode occurred, described in technical terms.
     The relationship between a Failure Mode, its Cause, and the resultant Effect creates a failure chain, as shown in Exhibit 10.  Each link in the failure chain is the consequence of the existence of the one to its right in the diagram.  That is, the Effect of Failure is the consequence, or result, of the occurrence of the Failure Mode, which is, in turn, the consequence of the existence of the Cause of Failure.
     Each element is not limited to a single failure chain.  The occurrence of a Failure Mode may be perceived in several ways (Effects of Failure) or have multiple potential causes.  A Cause of Failure may also result in several potential Failure Modes, and so on.
     Considering failure chains at multiple levels of analysis within a system defines the system’s failure network.  A Failure Mode at the highest level of integration becomes an Effect of Failure when analyzing the next lower level.  Similarly, a Cause of Failure is the Failure Mode in the next lower level of analysis.  This transformation, presented in Exhibit 11, continues through all levels of subsystem analysis to the component-level FMEAs.
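     For readers who prefer a concrete illustration, the short Python sketch below models this transformation with hypothetical failure descriptions; it is not drawn from the Handbook, but simply visualizes how a higher-level Cause reappears as the Failure Mode one level down.

from dataclasses import dataclass

@dataclass
class FailureChain:
    effect: str        # perceived at the next higher level
    failure_mode: str  # technical description at the focus element
    cause: str         # found at the next lower level

# Hypothetical system-level failure chain.
system_level = FailureChain(
    effect="Vehicle function degraded",
    failure_mode="Subsystem delivers insufficient output",
    cause="Component X does not perform its function",
)

# One level down, the higher-level Failure Mode becomes the Effect,
# and the higher-level Cause becomes the Failure Mode.
component_level = FailureChain(
    effect=system_level.failure_mode,
    failure_mode=system_level.cause,
    cause="Characteristic Y out of specification",
)
print(component_level)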
     Each failure chain is documented in a single row of the DFMEA form, in the Failure Analysis section, shown in Exhibit 12.  The form maintains the relationships within the failure chain, as shown in Exhibit 10 and discussed above.  It also maintains the numbering and color-coding convention that links Failure Analysis to Function Analysis (Step 3) and Structure Analysis (Step 2), as shown in Exhibit 13.
     As is the case for the previous two steps, information can be transferred directly from the structure tree, if one was created in advance, to the DFMEA form.  The Failure Analysis information transfer is shown in Exhibit 14.
     To complete the Failure Analysis step, the Effect of Failure must be evaluated and assigned a Severity (S) score.  To do this, consult the Severity criteria table, shown in Exhibit 15, and select the Severity score that corresponds to the applicable effect and criteria description.  Enter this number on the DFMEA form in the column labeled “Severity (S) of FE.”
     If there is uncertainty or disagreement about the appropriate Severity score to assign an Effect of Failure (e.g. “Is it a 3 or a 4?”), select the highest score being considered to ensure that the issue receives sufficient attention as development proceeds.  Confidence in evaluations and scores typically increases as a design matures; concordance often follows.
     The last column of the “DFMEA Severity Criteria Table” (Exhibit 15) is left blank in the Handbook because it is not universally applicable.  An organization can record its own examples of Effects of Failure to be used as comparative references when conducting a new FMEA or to aid in training.  Examples cited for various Severity scores can provide great insight into an organization’s understanding, or lack thereof, of the customer perspective.
 
5th Step – Risk Analysis
     The design team conducts Risk Analysis to identify the controls used to prevent or detect failures, evaluate their effectiveness, and prioritize improvement activities.  Relevant information is recorded in the Risk Analysis section of the DFMEA form, shown in Exhibit 16.
     In column F, record the Design Prevention Controls incorporated to preclude activation of the failure chain.  A vast array of prevention controls is available; however, it is likely that a very limited selection is applicable to any single Cause of Failure.  Cite only those expected to prevent the specific Cause of Failure with which they are correlated (i.e. in the same row on the form).
     Though not an exhaustive list of Design Prevention Controls, some examples follow:
  • design standard published by a manufacturer of purchased components, industry group, or other recognized authority
  • material specifications (e.g. ASTM)
  • re-use of previously validated design with history of successful service
  • use of safety factors
  • error-proofing (see “The War on Error – Vol. II”)
  • redundancy
  • electromagnetic shielding, thermal insulation, etc.
     The effectiveness of Design Prevention Controls is reflected in the Occurrence (O) score, recorded in column G of the DFMEA form.  To assign an Occurrence score to each Cause of Failure, consult the Occurrence criteria table shown in Exhibit 17.  Select the Occurrence score that corresponds to the criteria that best describes the design and application.
     If sufficient data is available to make reasonable predictions, a quantitative method of scoring can be used.  For this, substitute the alternate DFMEA Occurrence criteria table, shown in Exhibit 18, for the table of Exhibit 17.  The detailed criteria descriptions are unchanged; the “incidents per 1000” estimates simply replace the qualitative summary terms (high, low, etc.).
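     As a rough illustration of quantitative scoring, the Python sketch below maps an estimated failure rate to an Occurrence score.  The band boundaries shown are placeholders only, not the values of Exhibit 18; a real analysis must use the published criteria.

# Placeholder bands: (minimum incidents per 1000, Occurrence score).
PLACEHOLDER_OCCURRENCE_BANDS = [
    (100.0, 10), (50.0, 9), (20.0, 8), (10.0, 7), (2.0, 6),
    (0.5, 5), (0.1, 4), (0.01, 3), (0.001, 2),
]

def occurrence_score(incidents_per_1000):
    # Return the score of the first band whose threshold the estimated rate meets or exceeds.
    for threshold, score in PLACEHOLDER_OCCURRENCE_BANDS:
        if incidents_per_1000 >= threshold:
            return score
    return 1

print(occurrence_score(3.2))  # falls in the placeholder band scored 6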
     If the frequency of occurrence of a Cause of Failure “falls between” two Occurrence scores, or there is disagreement about the correct frequency, select the higher Occurrence score.  Additional review in subsequent development cycles is likely to resolve the matter more efficiently than extensive debate in early stages.
     Both DFMEA Occurrence tables have a column left blank for organization-specific examples to be recorded.  These can be used as comparative references to facilitate future DFMEA development or as training aids.
 
     In column H of the Risk Analysis section of the DFMEA form, identify the Design Detection Controls in place to warn of the existence of the Failure Mode or Cause of Failure before the design is released to production.  Examples include endurance testing, interference analysis, designed experiments, proof testing, and electrical (e.g. “Hi-Pot”) testing.
     Assess the effectiveness of current detection controls according to the Detection criteria table, shown in Exhibit 19.  Select the Detection (D) score that corresponds to the Detection Method Maturity and Opportunity for Detection descriptions that most accurately reflect the state of the controls and enter it in column J.
     If, for example, the Detection Method Maturity and Opportunity for Detection descriptions suggest different scores, select the higher Detection score.  As a design matures, its controls evolve; review in subsequent development cycles is more efficient than dwelling on the matter in early stages.
     Like the S and O tables, the D criteria table has a column left blank for organization-specific examples.  Future DFMEA development and training may benefit from the experience captured here.
 
     Determining the priority of improvement activities in aligned FMEA differs significantly from the classical method.  For this purpose, the AIAG/VDA Handbook introduces Action Priority (AP), in which a lookup table replaces the RPN calculation of classical FMEA.  The AP table is shown in Exhibit 20.
     The AP Table is used to assign a priority to improvement activities for a failure chain.  To do this, first locate the row containing the Severity score in the S column.  Then, in the O column, find the sub-row containing the Occurrence score and, within it, the sub-row containing the Detection score in the D column.  In this sub-row, in the Action Priority (AP) column, one of three priorities is assigned:
  • H:  High Priority – actions must be taken to reduce risk, or acceptance of the current design must be justified and approved.
  • M:  Medium Priority – risk should be reduced via improvement activities, or acceptance of the current design justified and approved.
  • L:  Low Priority – improvement actions can be identified, but justification and approval are not required to accept the current design.
Record the Action Priority assigned in column K of the DFMEA form (Exhibit 16).  Development of improvements and assignment of responsibility, based on the priorities assigned, take place in the next step.
     The final column of the AP Table, “Comments,” is left blank.  Organization-specific protocols, historical projects, acceptance authority, or other information can be cited here to assist DFMEA teams in completing analyses.
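     Organizations that manage FMEA data in software may encode the AP table as a simple lookup.  The Python sketch below illustrates the idea with placeholder rules; the boundaries shown are not those of Exhibit 20 and must be replaced with the published table in any real implementation.

def action_priority(s, o, d):
    # Return "H", "M", or "L" from hypothetical S/O/D bands (placeholder logic only).
    if s >= 9 and o >= 2:
        return "H"
    if s >= 7 and o >= 4 and d >= 5:
        return "H"
    if s >= 4 and (o >= 6 or d >= 7):
        return "M"
    return "L"

print(action_priority(8, 5, 6))  # "H" under the placeholder rules above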
 
     Column L in Exhibit 16 is an optional entry.  “Filter Code” could be used in various ways.  Examples include:
  • As a substitute for the “Classification” column when transitioning an existing design from classical to aligned format.  The aligned DFMEA form does not have space reserved for special characteristic designations.
  • To assign categories to aid in resource management.  These could correspond to engineering specialties, such as structural, mechanical, electrical, human factors, and so on.  Subspecialties, such as lighting, haptics, etc., could also be coded, if desired.
  • Any creative use that assists the DFMEA owner in managing and completing the analysis or allows future analysis teams to easily search for similar issues in previous designs.
 
6th Step – Optimization
     In the Optimization step, improvement activities are identified to reduce risk, assigned to individuals for implementation, monitored for progress, and evaluated for effectiveness.  Relevant information is recorded in the Optimization section of the DFMEA form, shown in Exhibit 21.
     Preventive Actions (column M) are preferred to Detection Actions (column N); it is far better to eliminate an issue than to develop a better reaction to it.  The Handbook asserts that the most effective sequence of implementation is as follows:
  • Eliminate or mitigate Effects of Failure via design modification [lower Severity (S), Preventive Action]
  • Reduce the Occurrence (O) of Causes of Failure via design modification [Preventive Action]
  • Improve the ability to detect Causes of Failure or Failure Modes [lower Detection (D), Detection Action]
Prioritizing improvements in Severity over Occurrence and Detection is consistent with the classical FMEA approach.
     The Handbook suggests five possible statuses, to be recorded in column P, for the actions defined:
  • Open:  “No action defined.”
  • Decision pending:  action defined, awaiting approval.
  • Implementation pending:  action approved, awaiting implementation.
  • Completed:  implementation complete, effectiveness documented.
  • Not implemented:  action declined, current design accepted.
The example DFMEA form in the Handbook also shows a status of “planned,” a concise alternative to “implementation pending.”
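     Teams that track actions electronically may find it helpful to capture these statuses as a fixed set of values so that column P entries remain consistent.  A minimal Python sketch follows.

from enum import Enum

class ActionStatus(Enum):
    OPEN = "Open"                                      # no action defined
    DECISION_PENDING = "Decision pending"              # action defined, awaiting approval
    IMPLEMENTATION_PENDING = "Implementation pending"  # action approved, awaiting implementation
    COMPLETED = "Completed"                            # implementation complete, effectiveness documented
    NOT_IMPLEMENTED = "Not implemented"                # action declined, current design accepted

print(ActionStatus.COMPLETED.value)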
     The remaining columns in the Optimization section of the DFMEA form are, essentially, carryovers from classical FMEA, with predicted AP replacing predicted RPN.  Refer to Vol. III for further discussion of these entries.
 
Risk Communication
7th Step – Results Documentation
     Much like Planning and Preparation, Results Documentation consists of tasks that have been performed for, but often considered separate from, classical FMEA.  The aligned approach formalizes Risk Communication as an essential component of analysis by incorporating it in the Seven-Step Approach.
     A significant portion of the content of an FMEA Report, as outlined in the Handbook, is contained in the DFMEA form.  For example, the scope of analysis, high-risk failures, action prioritization, status of actions, and planned implementation dates can be culled from the form.
     A comparison of the project plan, created in Step 1, with the execution and final status of the analysis may also be a useful component of the report.  Future analysis teams may be able to apply lessons learned from deviations revealed in this assessment.
     AIAG/VDA also suggest that a commitment to review and revise the DFMEA be included in the FMEA Report.  The discussion of classical DFMEA treated this as if it were a separate endeavor, providing further evidence of the extent of AIAG/VDA’s efforts to create a thorough and consistent analysis process.  For further discussion on the topic, see “Review and Maintenance of the DFMEA” in FMEA – Vol. III:  “Classical” Design Failure Modes and Effects Analysis.
 
     A Design FMEA is a valuable development and communication tool.  It ensures that the impacts of a product design on customers and other stakeholders are given proper consideration.  It ensures that legal and regulatory requirements are met.  It also creates a record of development activity that can be used to refine a product, develop a derivative product, or train others to develop successful products.  Practitioners are encouraged to extract maximum value from FMEA and, in the effort, possibly discover a competitive advantage for their organization.
 
     For additional guidance or assistance with Operations challenges, feel free to leave a comment, contact JayWink Solutions, or schedule an appointment.
 
     For a directory of “FMEA” volumes on “The Third Degree,” see Vol. I:  Introduction to Failure Modes and Effects Analysis.
 
References
[Link] “Potential Failure Mode and Effects Analysis,” 4ed. Automotive Industry Action Group, 2008.
[Link] “FMEA Handbook.”  Automotive Industry Action Group and VDA QMC, 2019.

 
Jody W. Phelps, MSc, PMP®, MBA
Principal Consultant
JayWink Solutions, LLC
jody@jaywink.com
]]>
<![CDATA[FMEA – Vol. V:  Alignment]]>Wed, 18 May 2022 14:30:00 GMThttp://jaywinksolutions.com/thethirddegree/fmea-vol-v-alignment     Suppliers producing parts for automotive manufacturers around the world have always been subject to varying documentation requirements.  Each OEM (Original Equipment Manufacturer) customer defines its own requirements; these requirements are strongly influenced by the geographic location in which they reside.
     In an effort to alleviate confusion and the documentation burden of a global industry, AIAG (Automotive Industry Action Group) of North America and VDA (Verband der Automobilindustrie) of Germany jointly published the aligned “FMEA Handbook” in 2019.  Those experienced with “classical” FMEA (Vol. III, Vol. IV) will recognize its influence in the new “standard;” however, there are significant differences that require careful consideration to ensure a successful transition.
     A significant improvement to the typical presentation of classical FMEA is quickly revealed in the aligned standard.  Aspects of FMEA that were previously taken for granted, or understood through experience, are explicitly presented in the new Handbook.
     Discussion of the scope of analysis is expanded beyond defining a system, subsystem, or component FMEA to the type of risks to be considered.  As shown in Exhibit 1, FMEA is employed to investigate only technical risks.  Financial, time, and strategy risks are explicitly excluded from FMEA, though they remain important components of product and process development discussions.
     Limitations of FMEA that may require other analysis techniques to be employed are declared early in the presentation of the aligned standard.  Namely, FMEA is a qualitative single-point failure analysis technique.  Quantitative analysis and multi-point failures require other tools and techniques.  In the previous AIAG Handbook, this discussion was relegated to the appendices.
     The aligned standard restates the requirement for management commitment of resources – time and expertise – to ensure successful FMEA development.  It is more explicit in its declaration that management is responsible for accepting the risks and risk mitigation strategies identified.  It does not point out, however, that management is also responsible for all risks that have not been identified.  Doing so would underline the value of training technical personnel in FMEA development to ensure thorough analysis and an accurate risk profile.
 
     Basic attributes desired of FMEA are often assumed to be understood, undermining previous Handbooks’ utility as an introduction to the methodology.  The aligned standard remedies this by defining the following four attributes sought:
  • Clear:  “[P]otential Failure Modes are described in technically precise, specific terms…” [emphasis added].  Normative statements and those with multiple interpretations are avoided.
  • True:  Accurate descriptions of Effects of Failure are provided; embellishment (“overstatement”) and trivialization (“understatement”) are eschewed.
  • Realistic:  “Reasonable” Causes of Failure are identified; “extreme events,” such as natural disasters, are not considered.  Foreseeable misuse (unintentional) is considered, but sabotage or other deliberate action is not.
  • Complete:  All potential Failure Modes and Causes of Failure are divulged.  Concern for creating a negative impression of a product or process does not justify withholding information.
 
The “New” Methodology
     The FMEA methodology prescribed in the AIAG/VDA aligned standard, in most regards, is not a radical departure from its predecessors, though important differences exist.  The steps followed and forms used will be familiar to practitioners of classical FMEA, though some terminology and formatting have changed.
     The AIAG/VDA FMEA Handbook defines a seven-step process for conducting an FMEA.  The seven steps are divided among three task categories, as shown in Exhibit 2.
     AIAG/VDA prescribes the “Five Ts” to guide FMEA planning discussions.  These topics should be discussed at the initiation of a project (i.e. in the “1st Step”).  The Five Ts are summarized below:
  • InTent:  Ensure that all members of the FMEA team understand its purpose and can competently contribute to its objectives.
  • Timing:  FMEA provides the greatest benefit when conducted proactively; that is, prior to finalizing or launching a product or process.  Recommended timing, aligned with APQP (Advanced Product Quality Planning) or MLA (Maturity Level Assurance) phases, is shown in Exhibit 3.
  • Team:  Identify the core and extended team members needed to successfully complete the FMEA.  Selecting members for a cross-functional team was discussed in Vol. II; a suggested roster is provided in the Handbook.
  • Tasks:  The 7-Step FMEA Process, shown in Exhibit 2, provides a framework for ensuring all necessary tasks are performed.  Discussion can include detailed descriptions, assignment of responsibility, priority, etc.
  • Tools:  Identify the software package that will be used to document the FMEA and ensure that team members are proficient in its use.
 
     The first five steps of the FMEA process each provide the basis for the subsequent step.  That is, the information gathered in each step is used to complete the next step.  Each of the seven steps is discussed in greater detail in subsequent installments, each dedicated to one type of FMEA.
     The jointly-published FMEA Handbook is significantly expanded from its predecessors.  The information provided in this installment is only a preview and introduction to the approach presented by AIAG and VDA.  Upcoming installments of “The Third Degree” will distill the new Handbook into serviceable guides to aid experienced and novice FMEA practitioners alike.
 
     For additional guidance or assistance with Operations challenges, feel free to leave a comment, contact JayWink Solutions, or schedule an appointment.
 
     For a directory of “FMEA” volumes on “The Third Degree,” see Vol. I:  Introduction to Failure Modes and Effects Analysis.
 
References
[Link] “Potential Failure Mode and Effects Analysis,” 4ed. Automotive Industry Action Group, 2008.
[Link] “FMEA Handbook.”  Automotive Industry Action Group and VDA QMC, 2019.

 
Jody W. Phelps, MSc, PMP®, MBA
Principal Consultant
JayWink Solutions, LLC
jody@jaywink.com
]]>
<![CDATA[FMEA –Vol. IV:  “Classical” Process Failure Modes and Effects Analysis]]>Wed, 04 May 2022 14:30:00 GMThttp://jaywinksolutions.com/thethirddegree/fmea-vol-iv-classical-process-failure-modes-and-effects-analysis     Preparations for Process Failure Modes and Effects Analysis (Process FMEA) (see Vol. II) occur, in large part, while the Design FMEA undergoes revision to develop and assign Recommended Actions.  An earlier start, while ostensibly desirable, may result in duplicated effort.  As a design evolves, the processes required to support it also evolve; allowing a design to reach a sufficient level of maturity to minimize process redesign is an efficient approach to FMEA.
     In this installment of the “FMEA” series, how to conduct a “classical” Process FMEA (PFMEA) is presented as a close parallel to that of DFMEA (Vol. III).  Each is prepared as a standalone reference for those engaged in either activity, but reading both is recommended to maintain awareness of the interrelationship of analyses.
     The recommended format used to describe the classical PFMEA process is shown in Exhibit 1.  It is nearly identical to the DFMEA form; only minor terminology changes differentiate the two.  This is to facilitate sharing of information between analysis teams and streamline training efforts for both.  The form is shown in its entirety only to provide an overview and a preview of the analysis steps involved.  The form has been divided into titled sections for purposes of presentation.  These titles and divisions are not part of any industry standard; they are only used here to identify logical groupings of information contained in the FMEA and to aid learning.
     Discussion of each section will be accompanied by a close-up image of the relevant portion of the form.  Each column is identified by an encircled letter for easy reference to its description in the text.  The top of the form contains general information about the analysis; this is where the presentation of the “classical” Process FMEA begins.
 
FMEA Form Header
1) Check the box that best describes the scope of the PFMEA – does it cover an entire system, a subsystem, or a single component?  For example, analysis may be conducted on a Laundry system, a Wash subsystem, or a Spin Cycle component.
2) On the lines below the checkbox, identify the process system, subsystem, or component analyzed.  Include information that is unique to the process under scrutiny, including the specific technology used (e.g. resistance welding), product family, etc.
3) Process Responsibility:  Identify the lead engineer responsible for developing the process, and assuring and maintaining its performance.
4) Key Date:  Date of design freeze (testing complete, documented process approved by customer, etc.).
5) FMEA No.:  Provide a unique identifier to be used for FMEA documentation (this form and all accompanying documents and addenda).  It is recommended to use a coded system of identification to facilitate organization and retrieval of information.  For example, FMEA No. 2022 – W – RES – C – 6 could be interpreted as follows:
            2022 – year of process launch,
            W – process family (e.g. welding),
            RES – process code (e.g. resistance),
            C – Component-level FMEA,
            6 – sixth component FMEA for the process “system.”
This is an arbitrary example; organizations should develop their own meaningful identification systems.
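     If identifiers are generated or parsed by software, a small helper keeps the scheme consistent.  The Python sketch below implements the arbitrary example scheme above (with plain hyphens as separators); substitute the organization’s own convention.

def build_fmea_no(year, family, code, level, seq):
    # e.g. build_fmea_no(2022, "W", "RES", "C", 6) -> "2022 - W - RES - C - 6"
    return f"{year} - {family} - {code} - {level} - {seq}"

def parse_fmea_no(fmea_no):
    # Split the identifier back into its coded fields.
    year, family, code, level, seq = [part.strip() for part in fmea_no.split("-")]
    return {"year": int(year), "family": family, "code": code,
            "level": level, "sequence": int(seq)}

print(build_fmea_no(2022, "W", "RES", "C", 6))
print(parse_fmea_no("2022 - W - RES - C - 6"))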
6) Page x of y:  Track the entire document to prevent loss of content.
7) FMEA Date:  Date of first (“original”) FMEA release.
8) Rev.:  Date of release of current revision.

FMEA Form Columns
A – Process Step/Function
     List Process Steps by name or by description of purpose (Function).  Information recorded on a Process Flow Diagram (PFD) can be input directly in this column.  This list may consist of a combination of Process Step names and functional descriptions.  For example, a process name that is in wide use, but does not make its purpose obvious, should be accompanied by a functional description.  This way, those who are not familiar with the process name are not put at a disadvantage when reviewing the FMEA.
B – Requirements
     Describe what a Process Step is intended to achieve, or the parameters of the Function (e.g. boil water requires “heat water to 100° C”)
C – Potential Failure Modes
     Describe how a Requirement could fail to be met.  A Process Step/Function may have multiple Failure Modes; each undesired outcome is defined in technical terms.  Opposite, or “mirror-image” conditions (e.g. high/low, long/short, left/right) should always be considered.  Conditional process failures should also be included, where applicable, though demand failures (when activated) and standby failures (when not in use) are more likely to delay processing than to affect a process’s performance.  Operational failures (when in use) require the greatest effort in analysis.
D – Potential Effects of Failure
     Describe the undesired outcome(s) from the customer perspective.  Effects of Failure may include physical damage, reduced performance, intermittent function, unsatisfactory aesthetics, or other deviation from reasonable customer expectations.  All “customers” must be considered, from internal to end user; packaging, shipping, installation and service crews, and others could be affected.
E – Severity (S)
     Rank each Effect of Failure on a predefined scale.  Loosely defined, the scale ranges from 1 – insignificant to 10 – resulting in serious bodily injury or death.  The suggested Severity evaluation criteria and ranking scale from AIAG’s “FMEA Handbook,” shown in Exhibit 7, provide guidelines for evaluating the impact of failures on both internal and external customers.  To evaluate nonautomotive processes, the criteria descriptions can be modified to reflect the characteristics of the product, industry, and application under consideration; an example is shown in Exhibit 8.
F – Classification
     Identify high-priority Failure Modes and Causes of Failure – that is, those that require the most rigorous monitoring or strictest controls.  These may be defined by customer requirements, empirical data, the lack of technology currently available to improve process performance, or other relevant characteristic.  Special characteristics are typically identified by a symbol or abbreviation that may vary from one company to another.  Examples of special characteristic identifiers are shown in Exhibit 9.  Special characteristics identified on a DFMEA should be carried over to any associated PFMEAs to ensure sufficient controls are implemented.
G – Potential Causes of Failure
     Identify the potential causes of the Failure Mode.  Like Failure Modes, Causes of Failure must be defined in technical terms, rather than customer perceptions (i.e. state the cause of the Failure Mode, not the Effect).  A Failure Mode may have multiple potential Causes; identify and evaluate each of them individually.
     Incoming material is assumed to be correct (within specifications); therefore, it is not to be listed as a Cause of Failure.  Incorrect material inputs are evidence of failures of prior processes (production, packaging, measurement, etc.), but their effect on the process should not be included in the PFMEA.  Also, descriptions must be specific.  For example, the phrase “operator error” should never be used, but “incorrect sequence followed” provides information that is useful in improving the process.
H – Prevention Controls
     Identify process controls used to prevent the occurrence of each Failure Mode or undesired outcome.  Process guidelines from material or equipment suppliers, simulation, designed experiments, statistical process control (SPC), and error-proofing (see “The War on Error – Vol. II”) are examples of Process Prevention Controls.
I – Occurrence (O)
     Rank each Failure Mode according to its frequency of occurrence, a measure of the effectiveness of the Process Prevention Controls employed.  Occurrence rating tables typically present multiple methods of evaluation, such as qualitative descriptions of frequency and quantitative probabilities.  The example in Exhibit 11 is from the automotive industry, while the one in Exhibit 12 is generalized for use in any industry.  Note that the scales are significantly different; once a scale is chosen, or developed, that elicits sufficient and appropriate responses to rankings, it must be used consistently.  That is, rankings contained in each PFMEA must have the same meaning.
     The potential Effects of Failure are evaluated individually via the Severity ranking, but collectively in the Occurrence ranking.  This may seem inconsistent, or counterintuitive, at first; however, which of the potential Effects will be produced by a failure cannot be reliably predicted.  Therefore, Occurrence of the Failure Mode must be ranked.  For a single Failure Mode, capable of producing multiple Effects, ranking Occurrence of each Effect would understate the significance of the underlying Failure Mode and its Causes.
J – Detection Controls – Failure Mode
     Identify process controls used to detect each Failure Mode.  Examples include various sensors used to monitor process parameters, measurements, post-processing verifications, and error-proofing in subsequent operations.
K – Detection Controls – Cause
     Identify process controls used to detect each Cause of Failure.  Similar tools and methods are used to detect Causes and Failure Modes.
L – Detection (D)
     Rank the effectiveness of all current Process Detection Controls for each Failure Mode.  This includes Detection Controls for both the Failure Mode and potential Causes of Failure.  The Detection ranking used for RPN calculation is the lowest for the Failure Mode.  Including the D ranking in the description for each Detection Control makes identification of the most effective control a simple matter.  Detection ranking table examples are shown in Exhibit 13 and Exhibit 14.  Again, differences can be significant; select or develop an appropriate scale and apply it consistently.
M – Risk Priority Number (RPN)
     The Risk Priority Number is the product of the Severity, Occurrence, and Detection rankings:  RPN = S x O x D.  The RPN column provides a snapshot summary of the overall risk associated with the process.  On its own, however, it does not provide the most effective means to prioritize improvement activities.  Typically, the S, O, and D rankings are used, in that order, to prioritize activities.  For example, all high Severity rankings (e.g. S ≥ 7) require review and improvement.  If a satisfactory process redesign cannot be developed or justified, management approval is required to accept the current process.
     Similarly, all high Occurrence rankings (e.g. O ≥ 7) require additional controls to reduce the frequency of failure.  Those that cannot be improved require approval of management to allow the process to operate in its existing configuration.  Finally, controls with high Detection rankings (e.g. D ≥ 6) require improvement or justification and management approval.
     Continuous improvement efforts are prioritized by using a combination of RPN and its component rankings, S, O, and D.  Expertise and judgment must be applied to develop Recommended Actions and to determine which ones provide the greatest potential for risk reduction.
     Requiring review or improvement according to threshold values is an imperfect practice.  It incentivizes less-conscientious evaluators to manipulate rankings, to shirk responsibility for improvement activities, process performance and, ultimately, product quality.  This is particularly true of RPN, as small changes in component rankings that are relatively easy to justify result in large changes in RPN.  For this reason, review based on threshold values is only one improvement step and RPN is used for summary and comparison purposes only.
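     The calculation and the threshold-based screening described above are easily automated.  The Python sketch below uses the example thresholds cited (S ≥ 7, O ≥ 7, D ≥ 6); as noted, such flags are only a screening aid, not a substitute for judgment.

def rpn(s, o, d):
    # RPN = S x O x D
    return s * o * d

def review_flags(s, o, d):
    # Flag rankings that exceed the example review thresholds cited above.
    flags = []
    if s >= 7:
        flags.append("High Severity: improve or obtain management approval")
    if o >= 7:
        flags.append("High Occurrence: add controls or obtain management approval")
    if d >= 6:
        flags.append("Weak Detection: improve controls or justify and approve")
    return flags

print(rpn(8, 5, 6))          # 240
print(review_flags(8, 5, 6))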
N – Recommended Actions
     Describe improvements to be made to the process or controls to reduce risk.  Several options can be included; implementation will be prioritized according to evaluations of risk reduction.  Lowering Severity is the highest priority, followed by reducing frequency of Occurrence, and improving Detection of occurrences.
O – Responsibility
     Identify the individual who will be responsible for executing the Recommended Action, providing status reports, etc.  More than one individual can be named, but this should be rare; the first individual named is the leader of the effort and is ultimately responsible for execution.  Responsibility should never be assigned to a department, ad hoc team, or other amorphous group.
P – Target Completion Date
     Assign a due date for Recommended Actions to be completed.  Recording dates in a separate column facilitates sorting for purposes of status updates, etc.
Q – Actions Taken
     Describe the Actions Taken to improve the process or controls and lower the inherent risk.  These may differ from the Recommended Actions; initial ideas are not fully developed and may require adjustment and adaptation to successfully implement.  Due to limited space in the form, entering a reference to an external document that details the Actions Taken is acceptable.  The document should be identified with the FMEA No. and maintained as an addendum to the PFMEA.
R – Effective Date
     Document the date that the Actions Taken were complete, or fully integrated into the process.  Recording dates in a separate column facilitates FMEA maintenance, discussed in the next section.
S – Severity (S) – Predicted
     Rank each Effect of Failure as it is predicted to affect internal or external customers after the Recommended Actions are fully implemented.  Severity rankings rarely change; it is more common for a process change to eliminate a Failure Mode or Effect of Failure.  In such a case, the PFMEA is revised, excluding the Failure Mode or Effect from further analysis.  Alternatively, the entry is retained, with S lowered to 1, to maintain a historical record of process development.
T – Occurrence (O) – Predicted
     Estimate the frequency of each Failure Mode’s Occurrence after Recommended Actions are fully implemented.  Depending on the nature of the Action Taken, O may or may not change.
U – Detection (D) – Predicted
     Rank the predicted effectiveness of all Process Detection Controls after Recommended Actions are fully implemented.  Depending on the nature of the Action Taken, D may or may not change.
     * Predictions of S, O, and D must be derived from the same scales as the original rankings.
V – Risk Priority Number (RPN) – Predicted
     Calculate the predicted RPN after Recommended Actions are fully implemented.  This number can be used to evaluate the relative effectiveness of Recommended Actions.
 
Review and Maintenance of the PFMEA
     Review of the PFMEA is required throughout the lifespan of the process.  As Recommended Actions are implemented, the results must be evaluated and compared to predictions and expectations.
     Performance of each Process Step or Function affected by Actions Taken must be reevaluated and new S, O, and D rankings assigned as needed.  Simply assuming that the predicted values have been achieved is not sufficient; the effectiveness of the Actions Taken must be confirmed.
     Once the effectiveness of Actions Taken has been evaluated, information from the Action Results section can be transferred to the body of the PFMEA.  Descriptions in the relevant columns are updated to reflect process development activity; new rankings replace the old.  New Recommended Actions can also be added to further improve the process’s risk profile.  The next round of improvements is prioritized according to the updated rankings and Recommended Actions.
     Maintenance of the PFMEA is the practice of systematically reviewing, updating, and reprioritizing activities as described above.  The PFMEA will undergo many rapid maintenance cycles prior to product launch.  Once the process is approved for continuous operation, PFMEA maintenance activity tends to decline sharply; it should not, however, cease, as long as the process remains in operation.  Periodic reviews should be used to evaluate the potential of new technologies, incorporate process and field data collected, and apply any knowledge gained since the process was approved.  Any process change must also initiate a review of the PFMEA sections pertaining to Process Steps or Functions affected by the revision.
 
     The PFMEA and all related documentation – diagrams and other addenda – remain valid and, therefore, require maintenance for the life of the process.  It serves as input to subsequent analyses and Quality documentation.  The PFMEA is an important component in the documentation chain of a successful product.  The investment made in a thoughtful analysis is returned manyfold during the lifecycle of the product, its derivatives, and any similar or related products.
 
     For additional guidance or assistance with Operations challenges, feel free to leave a comment, contact JayWink Solutions, or schedule an appointment.
 
     For a directory of “FMEA” volumes on “The Third Degree,” see Vol. I:  Introduction to Failure Modes and Effects Analysis.
 
References
[Link] “Potential Failure Mode and Effects Analysis,” 4ed. Automotive Industry Action Group, 2008.
[Link] Creating Quality.  William J. Kolarik; McGraw-Hill, Inc., 1995. 
[Link] The Six Sigma Memory Jogger II.  Michael Brassard, Lynda Finn, Dana Ginn, Diane Ritter; GOAL/QPC, 2002.
[Link] “FMEA Handbook Version 4.2.”  Ford Motor Company, 2011.

 
Jody W. Phelps, MSc, PMP®, MBA
Principal Consultant
JayWink Solutions, LLC
jody@jaywink.com
]]>
<![CDATA[FMEA –Vol. III:  “Classical” Design Failure Modes and Effects Analysis]]>Wed, 20 Apr 2022 14:30:00 GMThttp://jaywinksolutions.com/thethirddegree/fmea-vol-iii-classical-design-failure-modes-and-effects-analysis     In the context of Failure Modes and Effects Analysis (FMEA), “classical” refers to the techniques and formats that have been in use for many years, such as those presented in AIAG’s “FMEA Handbook” and other sources.  Numerous variations of the document format are available for use.  In this discussion, a recommended format is presented; one that facilitates a thorough, organized analysis.
     Preparations for FMEA, discussed in Vol. II, are agnostic to the methodology and document format chosen; the inputs cited are applicable to any available.  In this installment of the “FMEA” series, how to conduct a “classical” Design FMEA (DFMEA) is presented by explaining each column of the recommended form.  Populating the form columns in the proper sequence is only an approximation of analysis, but it is a very useful one for gaining experience with the methodology.
     The recommended format used to describe the classical Design FMEA process is shown in Exhibit 1.  There is no need to squint; it is only to provide an overview and a preview of the steps involved.  The form has been divided into titled sections for purposes of presentation.  These titles and divisions are not part of any industry standard; they are only used here to identify logical groupings of information contained in the FMEA and to aid learning.
     Discussion of each section will be accompanied by a close-up image of the relevant portion of the form.  Each column is identified by an encircled letter for easy reference to its description in the text.  The top of the form contains general information about the analysis; this is where the presentation of the “classical” Design FMEA begins.

FMEA Form Header
1) Check the box that best describes the scope of the DFMEA – does it cover an entire system, a subsystem, or a single component?
2) On the lines below the checkbox, identify the system, subsystem, or component analyzed.  Include relevant information, such as applicable model, program, product family, etc.
3) Design Responsible:  Identify the lead engineer responsible for finalizing and releasing the design.
4) Key Date:  Date of design freeze (design is “final”).
5) FMEA No.:  Provide a unique identifier to be used for FMEA documentation (this form and all accompanying documents and addenda).  It is recommended to use a coded system of identification to facilitate organization and retrieval of information.  For example, FMEA No. 2022 – J – ITS – C – 6 could be interpreted as follows:
            2022 – year of product launch,
            J – product family,
            ITS – product code,
            C – Component-level FMEA,
            6 – sixth component FMEA for the product.
This is an arbitrary example; organizations should develop their own meaningful identification systems.
6) Page x of y:  Track the entire document to prevent loss of content.
7) FMEA Date:  Date of first (“original”) FMEA release.
8) Rev.:  Date of release of current revision.

FMEA Form Columns
A – Item/Function
     List (a) Items for hardware approach or (b) Functions for functional approach to conducting a Design FMEA.  A hardware approach can be useful during part-consolidation (i.e. design for manufacture or DFM) efforts; a functional approach facilitates taking a customer perspective.  A hardware approach is more common for single-function or “simple” designs with low part count, while a functional approach may be better suited to more complex designs with high part counts or numerous functions.  A hybrid approach, or a combination of hardware and functional approaches, is also possible.

B – Requirements
     Describe (a) an Item’s contribution to the design (e.g. screw is required to “secure component to frame”) or (b) a Function’s parameters (e.g. control temperature requires “45 ± 5° C”)
C – Potential Failure Modes
     Describe (a) the manner in which the Item could fail to meet a requirement or (b) the Function could fail to be properly executed.  An Item/Function may have multiple Failure Modes; each undesired outcome is defined in technical terms.  Opposite, or “mirror-image” conditions (e.g. high/low, long/short, left/right) should always be considered.  Conditional failures should also be included – demand failures (when activated), operational failures (when in use), and standby failures (when not in use) may have special causes that require particular attention.

D – Potential Effects of Failure
     Describe the undesired outcome(s) from the customer perspective.  Effects of Failure may include physical damage, reduced performance, intermittent function, unsatisfactory aesthetics, or other deviation from reasonable customer expectations.  All “customers” must be considered, from internal to end user; packaging, shipping, installation and service crews, and others could be affected.

E – Severity (S)
     Rank each Effect of Failure on a predefined scale.  Loosely defined, the scale ranges from 1 – insignificant to 10 – resulting in serious bodily injury or death.  The suggested Severity evaluation criteria and ranking scale from AIAG’s “FMEA Handbook” is shown in Exhibit 7.  To evaluate nonautomotive designs, the criteria descriptions can be modified to reflect the characteristics of the product, industry, and application under consideration; an example is shown in Exhibit 8.
F – Classification
     Identify high-priority Failure Modes and Causes of Failure – that is, those that require the most rigorous monitoring or strictest controls.  These may be defined by customer requirements, empirical data, the lack of technology currently available to improve design performance, or other relevant characteristic.  Special characteristics are typically identified by a symbol or abbreviation that may vary from one company to another.  Examples of special characteristic identifiers are shown in Exhibit 9.
G – Potential Causes of Failure
     Identify the potential causes of the Failure Mode.  Like Failure Modes, Causes of Failure must be defined in technical terms, rather than customer perceptions (i.e. state the cause of the Failure Mode, not the Effect).  A Failure Mode may have multiple potential Causes; identify and evaluate each of them individually.
H – Prevention Controls
     Identify design controls used to prevent the occurrence of each Failure Mode or undesired outcome.  Design standards (e.g. ASME Pressure Vessel Code), simulation, designed experiments, and error-proofing (see “The War on Error – Vol. II”) are examples of Design Prevention Controls.

I – Occurrence (O)
     Rank each Failure Mode according to its frequency of occurrence, a measure of the effectiveness of the Design Prevention Controls employed.  Occurrence rating tables typically present multiple methods of evaluation, such as qualitative descriptions of frequency and quantitative probabilities.  The example in Exhibit 11 is from the automotive industry, while the one in Exhibit 12 is generalized for use in any industry.  Note that the scales are significantly different; once a scale is chosen, or developed, that elicits sufficient and appropriate responses to rankings, it must be used consistently.  That is, rankings contained in each DFMEA must have the same meaning.
     The potential Effects of Failure are evaluated individually via the Severity ranking, but collectively in the Occurrence ranking.  This may seem inconsistent, or counterintuitive, at first; however, which of the potential Effects will be produced by a failure cannot be reliably predicted.  Therefore, Occurrence of the Failure Mode must be ranked.  For a single Failure Mode, capable of producing multiple Effects, ranking Occurrence of each Effect would understate the significance of the underlying Failure Mode and its Causes.

J – Detection Controls – Failure Mode
     Identify design controls used to detect each Failure Mode.  Examples include design reviews, simulation, mock-up and prototype testing.

K – Detection Controls – Cause
     Identify design controls used to detect each Cause of Failure.  Examples include validation and reliability testing and accelerated aging.

L – Detection (D)
     Rank the effectiveness of all current Design Detection Controls for each Failure Mode.  This includes Detection Controls for both the Failure Mode and potential Causes of Failure.  The Detection ranking used for RPN calculation is the lowest for the Failure Mode.  Including the D ranking in the description for each Detection Control makes identification of the most effective control a simple matter.  Detection ranking table examples are shown in Exhibit 13 and Exhibit 14.  Again, differences can be significant; select or develop an appropriate scale and apply it consistently.
M – Risk Priority Number (RPN)
     The Risk Priority Number is the product of the Severity, Occurrence, and Detection rankings:  RPN = S x O x D.  The RPN column provides a snapshot summary of the overall risk associated with the design.  On its own, however, it does not provide the most effective means to prioritize improvement activities.  Typically, the S, O, and D rankings are used, in that order, to prioritize activities.  For example, all high Severity rankings (e.g. S ≥ 7) require review and improvement.  If an appropriate redesign cannot be developed or justified, management approval is required to accept the current design.
     Similarly, all high Occurrence rankings (e.g. O ≥ 7) require additional controls to reduce the frequency of failure.  Those that cannot be improved require approval of management to allow the design to enter production.  Finally, controls with high Detection rankings (e.g. D ≥ 6) require improvement or justification and management approval.
     Continuous improvement efforts are prioritized by using a combination of RPN and its component rankings, S, O, and D.  Expertise and judgment must be applied to develop Recommended Actions and to determine which ones provide the greatest potential for risk reduction.
     Requiring review or improvement according to threshold values is an imperfect practice.  It incentivizes less-conscientious evaluators to manipulate rankings, to shirk responsibility for improvement activities and, ultimately, product performance.  This is particularly true of RPN, as small changes in component rankings that are relatively easy to justify result in large changes in RPN.  For this reason, review based on threshold values is only one improvement step and RPN is used for summary and comparison purposes only.
N – Recommended Actions
     Describe improvements to be made to the design or controls to reduce risk.  Several options can be included; implementation will be prioritized according to evaluations of risk reduction.  Lowering Severity is the highest priority, followed by reducing frequency of Occurrence, and improving Detection of occurrences.

O – Responsibility
     Identify the individual who will be responsible for executing the Recommended Action, providing status reports, etc.  More than one individual can be named, but this should be rare; the first individual named is the leader of the effort and is ultimately responsible for execution.  Responsibility should never be assigned to a department, ad hoc team, or other amorphous group.
 
P – Target Completion Date
     Assign a due date for Recommended Actions to be completed.  Recording dates in a separate column facilitates sorting for purposes of status updates, etc.
Q – Actions Taken
     Describe the Actions Taken to improve the design or controls and lower the design’s inherent risk.  These may differ from the Recommended Actions; initial ideas are not fully developed and may require adjustment and adaptation to successfully implement.  Due to limited space in the form, entering a reference to an external document that details the Actions Taken is acceptable.  The document should be identified with the FMEA No. and maintained as an addendum to the DFMEA.

R – Effective Date
     Document the date that the Actions Taken were complete, or fully integrated into the design.  Recording dates in a separate column facilitates FMEA maintenance, discussed in the next section.

S – Severity (S) – Predicted
     Rank each Effect of Failure as it is predicted to affect the customer after the Recommended Actions are fully implemented.  Severity rankings rarely change; it is more common for a design change to eliminate a Failure Mode or Effect of Failure.  In such a case, the DFMEA is revised, excluding the Failure Mode or Effect from further analysis.  Alternatively, the entry is retained, with S lowered to 1, to maintain a historical record of development.

T – Occurrence (O) – Predicted
     Estimate the frequency of each Failure Mode’s Occurrence after Recommended Actions are fully implemented.  Depending on the nature of the Action Taken, O may or may not change.

U – Detection (D) – Predicted
     Rank the predicted effectiveness of all Design Detection Controls after Recommended Actions are fully implemented.  Depending on the nature of the Action Taken, D may or may not change.
     * Predictions of S, O, and D must be derived from the same scales as the original rankings.

V – Risk Priority Number (RPN) – Predicted
     Calculate the predicted RPN after Recommended Actions are fully implemented.  This number can be used to evaluate the relative effectiveness of Recommended Actions.
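     A simple comparison of predicted RPNs, sketched below in Python with hypothetical rankings, can help gauge the relative risk reduction offered by competing Recommended Actions.

current = {"S": 8, "O": 6, "D": 5}  # current rankings for a hypothetical failure chain

candidates = {
    "Recommended Action 1": {"S": 8, "O": 3, "D": 5},  # predicted rankings (hypothetical)
    "Recommended Action 2": {"S": 8, "O": 6, "D": 3},
}

def rpn(r):
    # RPN = S x O x D
    return r["S"] * r["O"] * r["D"]

baseline = rpn(current)
for action, predicted in candidates.items():
    print(f"{action}: predicted RPN {rpn(predicted)} (reduction {baseline - rpn(predicted)})")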
 
Review and Maintenance of the DFMEA
     Review of the DFMEA is required throughout the product development process.  As Recommended Actions are implemented, the results must be evaluated and compared to predictions and expectations.
     Performance of each Item or Function affected by Actions Taken must be reevaluated and new S, O, and D rankings assigned as needed.  Simply assuming that the predicted values have been achieved is not sufficient; the effectiveness of the Actions Taken must be confirmed.
     Once the effectiveness of Actions Taken has been evaluated, information from the Action Results section can be transferred to the body of the DFMEA.  Descriptions in the relevant columns are updated to reflect design modifications, and new rankings replace the old.  New Recommended Actions can also be added to further improve the design’s risk profile.  The next round of improvements is prioritized according to the updated rankings and Recommended Actions.
     Maintenance of the DFMEA is the practice of systematically reviewing, updating, and reprioritizing activities as described above.  The DFMEA will undergo many rapid maintenance cycles during product development.  Once the design is approved for production, DFMEA maintenance activity tends to decline sharply; it should not, however, cease, as long as the product remains viable.  Periodic reviews should take place throughout the product lifecycle to evaluate the potential of new technologies, incorporate process and field data collected, and apply any knowledge gained since the product’s introduction.  Any design change must also initiate a review of the DFMEA sections pertaining to Items or Functions affected by the revision.
 
     The DFMEA and all related documentation – diagrams and other addenda – remain valid and, therefore, require maintenance for the life of the product.  It serves as input to the Process FMEA, other subsequent analyses, Quality documentation, and even marketing materials.  The DFMEA is an important component in the documentation chain of a successful product.  The investment made in a thoughtful analysis is returned manyfold during the lifecycles of the product, its derivatives, and any similar or related products.
 
     For additional guidance or assistance with Operations challenges, feel free to leave a comment, contact JayWink Solutions, or schedule an appointment.
 
     For a directory of “FMEA” volumes on “The Third Degree,” see Vol. I:  Introduction to Failure Modes and Effects Analysis.
 
References
[Link] “Potential Failure Mode and Effects Analysis,” 4ed. Automotive Industry Action Group, 2008.
[Link] Product Design.  Kevin Otto and Kristin Wood.  Prentice Hall, 2001.
[Link] Creating Quality.  William J. Kolarik; McGraw-Hill, Inc., 1995. 
[Link] The Six Sigma Memory Jogger II.  Michael Brassard, Lynda Finn, Dana Ginn, Diane Ritter; GOAL/QPC, 2002.
[Link] “FMEA Handbook Version 4.2.”  Ford Motor Company, 2011.

 
Jody W. Phelps, MSc, PMP®, MBA
Principal Consultant
JayWink Solutions, LLC
jody@jaywink.com
]]>
<![CDATA[FMEA – Vol. II:  Preparing for Analysis]]>Wed, 06 Apr 2022 14:30:00 GMThttp://jaywinksolutions.com/thethirddegree/fmea-vol-ii-preparing-for-analysis     Prior to conducting a Failure Modes and Effects Analysis (FMEA), several decisions must be made.  The scope and approach of analysis must be defined, as well as the individuals who will conduct the analysis and what expertise each is expected to contribute.
     Information-gathering and planning are critical elements of successful FMEA.  Adequate preparation reduces the time and effort required to conduct a thorough FMEA, thereby reducing lifecycle costs, as discussed in Vol. I.  Anything worth doing is worth doing well.  In an appropriate context, conducting an FMEA is worth doing; plan accordingly.
     To verify that there is an appropriate context for analysis, consider the three basic FMEA use cases:
1) Analysis of a new technology, service, process, or product design.
2) Analysis of modifications to an existing product, service, or process.
3) Analysis of an existing product, service, process, or technology placed in service in a new environment or application or with new or modified requirements.
     Each use case also defines a generalized scope of analysis.  In use case 1, the scope includes the entire product, service, process, or technology in development.  For use case 2, analysis is limited to the modifications and new characteristics, interactions, or interfaces created by them.  Analysis in use case 3 focuses on interactions with a new environment, requirements created by a new application, or other influences of new operating parameters.
     A detailed definition of scope should be drafted during analysis planning.  The new product, process, or technology, the modifications made to an existing one, or the new operating parameters to be reviewed should be clearly defined to ensure a proper analysis.
     Clearly defining the scope accelerates the identification of requisite expertise to conduct the FMEA.  This, in turn, expedites the identification of individuals to be invited to join a cross-functional analysis team.  The team may consist of “core” members and “extended” team members.  Core members, including the engineer or project manager with overall responsibility for the FMEA, participate in all analysis activities.  Extended team members, in contrast, join the discussions for portions of the analysis for which their specialized knowledge and experience are particularly helpful; they are then dismissed to their regular duties.  The core team members are “full-timers,” while the extended team members are “part-timers” with respect to the FMEA.
     Examples of the roles or departments that team members may be recruited – or conscripted! – from are shown in Exhibit 1.  Team members should be individuals, identified by name, membership type (e.g. core or extended), and expertise.  Individuals must be accountable for the analysis; responsibilities assigned to departments are too easily shirked!
     Additional guidance on the selection of team members can be found in Making Decisions – Vol. IV:  Fundamentals of Group Decision-Making.  Adapting the models presented in the section titled “Who should be in the decision-making group?” to the FMEA context can help build an efficient analysis team.
     Once the team has been convened, its collective expertise can be applied to the remaining preparation steps, the first of which is to identify customers.  AIAG has identified four major customers that must be considered throughout the analysis:
1) Suppliers – design decisions, process requirements, and quality standards affect suppliers’ ability to reliably deliver compliant components and services.
2) OEM manufacturing and assembly plants – the confluence of all components, functions, and interfaces that must be seamlessly compiled into an end product.
3) Regulators – failing to comply with regulatory requirements could doom any product or service offering or the provider’s operations.
4) End users – the individuals or organizations that will directly engage with a product or service.
     Considering each of these customers will lead to a more comprehensive and accurate inventory of functional requirements, failure modes, and effects.  A thorough analysis is required to develop effective controls, corrective actions, and performance predictions.
 
     Functional presentations of the information gathered during preparations are required for it to add value to the analysis.  Block diagrams, parameter (P) diagrams, schematics, bills of materials (BOMs), and interface diagrams are all valid presentation tools that support FMEA.
     Block diagrams can be developed in several formats; two examples are shown in Exhibit 2.  The purpose of a block diagram is to provide a graphical representation of the interactions of components within the scope of analysis.  As shown in the generic black box model in Exhibit 3, interactions may include transfers of material, energy, or information; an example application is shown in Exhibit 4.
     The block diagram shown in Exhibit 5 is the result of expanding the black box model of Exhibit 4 to a level of detail that can be transferred to a Design FMEA (DFMEA).  The expected magnitude of interactions – force, torque, current, etc. – and types of information to be transmitted are the functional requirements recorded in the FMEA.
     A parameter diagram, or P-Diagram, can be used to compile information describing the controlled and uncontrolled inputs to a design and its desired and undesired outputs.  This will identify control factors, noise, and “error states,” or failure modes, that can be imported into a DFMEA.  See Exhibit 6 for an example P-Diagram.
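     For teams that prefer to capture this information in a structured form before transferring it to the DFMEA, a minimal sketch (Python) follows.  The field names and example entries are illustrative assumptions only; they are not prescribed by the P-Diagram technique or any FMEA standard.

# Minimal sketch:  recording P-Diagram content for later transfer to a DFMEA.
from dataclasses import dataclass, field

@dataclass
class PDiagram:
    item: str
    input_signals: list = field(default_factory=list)    # controlled inputs
    control_factors: list = field(default_factory=list)  # design parameters
    noise_factors: list = field(default_factory=list)    # uncontrolled inputs
    ideal_responses: list = field(default_factory=list)  # desired outputs
    error_states: list = field(default_factory=list)     # undesired outputs / failure modes

# Hypothetical example entry:
latch = PDiagram(
    item="Seat latch mechanism",
    input_signals=["Actuation force from release lever"],
    control_factors=["Spring rate", "Pawl geometry"],
    noise_factors=["Temperature extremes", "Wear", "Contamination"],
    ideal_responses=["Latch engages and holds under rated load"],
    error_states=["Fails to engage", "Releases under load"],
)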
     Any information source available can be used to support an FMEA, including other analyses that have been conducted independently.  For example, design for manufacturability, assembly, service, etc. (DFX) studies, ergonomic evaluations, and reliability data from similar products will contain information relevant to the DFMEA.
     Prior to conducting a Process FMEA (PFMEA), a detailed Process Flow Diagram (PFD) should be created, covering the entire scope of analysis.  It is often at this stage that the analysis team realizes that the previously chosen scope of analysis is too broad, poorly defined, or otherwise ill-suited to an efficient PFMEA process.  Adjusting the scope and supporting documentation at this stage is much less costly than a poorly-executed PFMEA.
     Process Flow Diagrams can also be created in various formats, ranging from a simple flow chart (See Commercial Cartography – Vol. II:  Flow Charts) to a detailed input-process-output model.  An example PFD is shown in Exhibit 7.  A D•I•P•O•D Process Model can be used to supplement a simple flow chart or to populate a more-detailed diagram.  The DFMEA, historical data from similar products or processes, and other analyses should also be used in support of a PFMEA process.
 
     Preparation for Failure Modes and Effects Analysis need not be limited to the activities and tools discussed here.  Anything that supports a thorough and accurate analysis should be employed.  The broader the range of reliable information used, the more successful the analysis will be, as judged by the flattened lifecycle cost curve shown in Vol. I.
 
     For additional guidance or assistance with Operations challenges, feel free to leave a comment, contact JayWink Solutions, or schedule an appointment.
 
     For a directory of “FMEA” volumes on “The Third Degree,” see Vol. I:  Introduction to Failure Modes and Effects Analysis.
 
References
[Link] “Potential Failure Mode and Effects Analysis,” 4th ed.  Automotive Industry Action Group, 2008.
[Link] Product Design.  Kevin Otto and Kristin Wood.  Prentice Hall, 2001.

 
Jody W. Phelps, MSc, PMP®, MBA
Principal Consultant
JayWink Solutions, LLC
jody@jaywink.com
]]>
<![CDATA[FMEA – Vol I:  Introduction to Failure Modes and Effects Analysis]]>Wed, 23 Mar 2022 14:30:00 GMThttp://jaywinksolutions.com/thethirddegree/fmea-vol-i-introduction-to-failure-modes-and-effects-analysis     Failure Modes and Effects Analysis (FMEA) is most commonly used in product design and manufacturing contexts.  However, it can also be helpful in other applications, such as administrative functions and service delivery.  Each application context may require refinement of definitions and rating scales to provide maximum clarity, but the fundamentals remain the same.
     Several standards have been published defining the structure and content of Failure Modes and Effects Analyses (FMEAs).  Within these standards, there are often alternate formats presented for portions of the FMEA form; these may also change with subsequent revisions of each standard.
     Add to this variety the diversity of industry and customer-specific requirements.  Those unbeholden to an industry-specific standard are free to adapt features of several to create a unique form for their own purposes.  The freedom to customize results in a virtually limitless number of potential variants.
     Few potential FMEA variants are likely to have broad appeal, even among those unrestricted by customer requirements.  This series aims to highlight the most practical formats available, encouraging a level of consistency among practitioners that maintains Failure Modes and Effects Analysis as a portable skill.  Total conformity is not the goal; presenting perceived best practices is.
     Automotive Industry Action Group (AIAG) and Verband der Automobilindustrie (VDA) have begun consolidation by jointly publishing the “FMEA Handbook” in 2019.  This publication aligns North American and German automotive industry practice in a step towards global harmonization of FMEA requirements.  In this series, we will attempt to distill the new handbook into user-friendly guides for both new and transitioning practitioners.
     To ensure clarity of the inevitable references and comparisons to previous FMEA guidelines, we will review “classical” FMEA before discussing the new “aligned” standard in detail.  Implementation of FMEA, in North America at least, has been driven primarily by the automotive industry.  That influence will be evident here, as the presentation of classical FMEA is derived, in large part, from previous editions of the AIAG FMEA Handbook.

Costs of FMEA Implementation
     The financial impact of conducting an FMEA is difficult to accurately predict.  It is dependent upon the product and processes involved and the management styles of those responsible for overseeing them.  Likewise, only the initial cost of not conducting an FMEA – zero – can be known in advance.
     We do know, however, at least qualitatively, how the cost of conducting FMEA compares to foregoing analysis.  To be more precise, we can compare the lifecycle cost of a product with a well-executed FMEA to one with poorly-executed or eschewed analysis.  This comparison is depicted in Exhibit 1; the green line represents the cost curve of a hypothetical product lifecycle when FMEAs are well-executed.  The red line represents a comparable product lifecycle without well-executed FMEA.
     As can be seen in Exhibit 1, there are costs incurred in the product and process development stages when FMEA is conducted in earnest.  This early investment flattens the curve in later stages, however.  In contrast, the cost benefit of neglecting FMEA in the early stages is eclipsed by the costs incurred as issues are discovered during launch.  The lack of proactive analysis continues to incur additional costs throughout production and beyond; even after a product is retired, lingering liabilities may continue to plague the manufacturer.  In the well-executed case, lessons learned may be applied to new programs, effectively lowering the cost of the FMEA effort.
     Note that the “neglecting FMEA” scenario is shown to have non-zero cost in the development stages; even “box-checking exercises” incur some cost.  Also, an alternative way to conceive of lowering the cost of FMEA may be more intuitive for some readers:  applying lessons learned to future programs increases the return on the investment made in FMEA implementation.
     When greater effort is expended on proactive FMEA, the lifecycle cost of a product can be estimated earlier in the cycle and with greater reliability.  Without proactive analysis, the total cost of a program will not be known until the product is retired and all outstanding issues are resolved.
     Proper FMEA implementation can be threatened by the flawed logic that endangers all proactive efforts.  In a cost-cutting frenzy, an unenlightened manager may divert resources from proactive FMEA efforts.  This failure to recognize the benefits will shift the subject program from the green line to the red line in Exhibit 1, increasing total cost.  As proactive efforts – FMEA or others – become more effective, the need for them becomes less salient to many decision-makers.  Practitioners can counter this by making reference to these efforts and quantifying, to the extent possible, the costs avoided throughout the product lifecycle as a result of the commitment to early and ongoing analysis.
 
Other Reasons for FMEA
     Direct costs of neglecting to conduct a proper FMEA were discussed first for one simple reason:  direct costs get the most attention.  Now that I have yours, there are other reasons to consider investing in proactive analysis.  Ultimately, each can be associated with costs, but the links are not always obvious.  These indirect costs, however, can also be “make or break” factors for a product or company.
     Customer satisfaction or, more broadly, customer experience, is determined, in large part, in the product development stage.  Launching a product without incorporating “the voice of the customer” or the end-user perspective can lead to various perceived failures.  Customer experience deficiencies exist in addition to “traditional” failures, such as physical damage or product malfunction.  Issues ranging from minor inconvenience or dissatisfaction to physical injury and property damage can be prevented with proper FMEA implementation.
     Failures that cause physical injury or property damage can result in extensive legal and financial liabilities.  The great expense of conducting a product recall is only the beginning of such a nightmare scenario.  The ability to present a considered evaluation of product risks and evidence of mitigation efforts can reduce awarded damages and other claims against a manufacturer.
     Analysis in the process development stage can lead to improvements for both customers and the producer.  Reducing process variation generates increased customer satisfaction by providing a more consistent product.  Producer costs are simultaneously reduced by minimizing scrap, rework, and testing.
     Other producer benefits may include the identification of potential productivity and ergonomic improvements.  “Lean wastes,” such as excessive handling and transportation, or overprocessing may also be identified as causes of reduced product performance.  Eliminating these issues prior to series production improves quality and customer satisfaction while increasing efficiency.
     Proactive analysis also helps maintain the project schedule, supporting a smoother launch and achievement of target market-introduction dates.  The costs of neglecting analysis in early stages of the product lifecycle cannot be recovered.  Future performance can be improved, but recovering from a damaged reputation may be far more difficult.  Costs will continue to accrue if discounting or other promotional means become necessary to regain a position in the market.
     Other benefits may also accrue.  The value of generating a deeper understanding of a product or process is difficult to quantify.  The value of developing confidence among executives in a product’s potential, however, is often quantified in budgets and salaries.  Financial or otherwise, proactive analysis pays dividends.
 

     The FMEA series will cover a broad range of related topics.  If there is a specific topic you would like to see covered, or question answered, feel free to send suggestions.
 
     For additional guidance or assistance with Operations challenges, feel free to leave a comment, contact JayWink Solutions, or schedule an appointment.
 
Jody W. Phelps, MSc, PMP®, MBA
Principal Consultant
JayWink Solutions, LLC
jody@jaywink.com
 
Directory of “FMEA” entries on “The Third Degree.”
Vol. I:  Introduction to Failure Modes and Effects Analysis (23Mar2022)
Vol. II:  Preparing for Analysis (6Apr2022)
Vol. III:  “Classical” Design Failure Modes and Effects Analysis (20Apr2022)
Vol. IV:  “Classical” Process Failure Modes and Effects Analysis (4May2022)
Vol. V:  Alignment (18May2022)
Vol. VI:  Aligned Design FMEA
Vol. VII:  Aligned Process FMEA

Related post:  “P” is for “Process” (FMEA) (27Jun2018)
]]>
<![CDATA[Managerial Schizophrenia and Workplace Cancel Culture]]>Wed, 09 Mar 2022 15:30:00 GMThttp://jaywinksolutions.com/thethirddegree/managerial-schizophrenia-and-workplace-cancel-culture     Destructive behaviors existed in organizations long before they were given special names.  The term “cancel culture” is not typically associated with business environments, but its pernicious effects are prevalent.  Unlike a boycott, cancel culture destroys an organization from within, through covert and fraudulent actions.
     Cancel culture affects all levels of an organization, but “managerial schizophrenia” is a common precursor and potent ingredient.  Adverse behaviors signal abandonment of cultural and professional norms, the subsequent failures of collaboration, and the resultant degradation in group performance.  Combating these intertwined organizational cancers requires commitment from all levels of management and revised methods of oversight.
Workplace Cancel Culture
     Cancel culture, as a concept, is most commonly associated with public figures, such as politicians, “journalists” and pundits, comedians, and podcasters.  However, it has also existed in many workplaces, unnamed, far longer than it has been the subject of public debate.  The attempted “cancellation” of a public figure gets a lot of attention, as purveyors of all forms of media use it to their advantage.  Workplace cancellations are far more damaging – to individuals and organizations – and more common, yet receive little, if any, attention from anyone not directly affected.
     Dictionary.com defines cancel culture as “the phenomenon or practice of publicly rejecting, boycotting, or ending support for particular people or groups because of their socially or morally unacceptable views or actions.”  This definition is rather generous, as overwrought public outrage is often due to a mere difference of opinion and is not genuine, but purposeful.  The public drama is leveraged to suppress or discredit differing, though often legitimate, viewpoints.
     The generosity of dictionary.com and the lack of reference to professional pursuits renders the generic definition of “cancel culture” unsatisfying for our discussion of workplace issues.  To remedy its shortcomings, the following definition is proposed:

Workplace Cancel Culture – the phenomenon or practice of resisting or misrepresenting the efforts, results, or viewpoint of a colleague in order to achieve or maintain a favorable position in the organization.

     “Office politics,” another longstanding malady of the workplace, involves currying favor with executives and other attempts by individuals to complete a zero-sum “game” on the positive side of the ledger.  Such zero-sum games include the apportionment of bonuses, promotions, plush assignments, larger offices, and the like.
     Workplace cancel culture goes beyond office politics; rather than simply trying to advance themselves, individuals begin to attack others, attempting to damage the reputations and future prospects of perceived rivals.  The logic seems to be this:  if person A is perceived as less competent or effective than previously thought, person B will be perceived as relatively more competent or effective, irrespective of person B’s actual performance.  If discrediting person A is easier for person B to achieve than is improving his/her own performance, this is the route person B chooses.  This logic is, of course, terribly flawed; these individuals are playing a zero-sum game when none exists!
     This flawed logic is applied in various situations at and between all levels of organizations.  Peers may be competing for a promotion, transfer, or simply a higher status within a stable group.  A promotion-seeker may also attempt to create a vacancy by discrediting the incumbent of the position s/he desires.  Conversely, a manager may protect his/her own career trajectory by interfering with that of a promising subordinate.
     To do this, an ethically-challenged manager has several mechanisms available to him/her.  Misrepresenting a subordinate’s performance or “cultural fit” to other managers and influential people establishes justification for his/her eventual dismissal, eliminating the threat to the manager’s progression within the organization or other aspirations.  Terminating an employee is the ultimate form of “deplatforming,” to use the current vernacular, in the workplace context.
     Subjecting an individual to unreasonable expectations or standards of performance lends credibility to the manager’s reports of “unsatisfactory performance.”  An "accurate" report of “failed to meet expectations” obfuscates the reality of the situation; willfully unaware decision-makers will support the corrupt manager’s chosen course of action.
     If a modicum of subtlety is desired, a manager that feels threatened by a high-performing subordinate may fabricate justifications for the denial of raises, bonuses, high-visibility assignments, or other perks.  Such actions are often less salient to upper management while encouraging high-performers to leave the organization.  Credit for an individual’s results may also be shifted to another, less threatening, individual, further limiting the high-performer’s prospects and increasing the incentive to seek opportunities in a higher-integrity environment.
     Once workplace cancel culture is allowed to develop in an organization, it creates a vicious cycle in which only those who specialize in cancel culture and office politics can thrive.  High-performing, inquisitive individuals that reject the status quo and “we’ve always done it this way” mentality are driven out.  How this happens and what must be done to counter it will be discussed further in a later section.
 
Managerial Schizophrenia
     According to dictionary.com, schizophrenia is “a state characterized by the coexistence of contradictory or incompatible elements.”  Managerial schizophrenia refers to the coexistence of contradictory or incompatible elements in the context of a hierarchical organizational structure; it is akin to cognitive dissonance.  Schizophrenic managers espouse inclusion, respect, creativity, meritocracy, and earnest communication while acting in contradiction to these ideals.
     Managerial schizophrenia can vary by degree; in its mildest form, a manager condones the deviant behaviors of others without actively participating in them.  The passive form is unlikely to persist; untreated cancers become more malignant.
     A schizophrenic manager often exhibits behaviors that signify his/her social dominance orientation (SDO) – the belief that his/her position is evidence of his/her superiority or entitlement – such as ostracism, isolation, or other mistreatment of colleagues.  Unconsidered rejection of novel ideas and obfuscation, withholding, or distortion of information are tools of self-protection in common use that discourage creativity, diminish respect, and erode trust.
     When schizophrenic managers flourish, it is a harbinger of a developing cancel culture within the organization.  This is so because managerial schizophrenia is a potent ingredient of workplace cancel culture and a self-feeding monster with no other development path available to it.
 
How Workplace Cancel Culture Develops
     The simplest explanation of the growth of workplace cancel culture is “monkey see, monkey do.”  As subordinates recognize the actions that have propelled their managers’ careers, they identify these destructive behaviors as keys to their own success and longevity.  The focus then shifts from collaboration, learning, and performance to ingratiation to superiors (the “coattails” method of advancement) and sabotage of “competitors,” a.k.a. peers.
     As broken windows theory tells us, ignoring bad behavior encourages its escalation.  A corrupt manager’s position and the lack of intervention by upper management reinforce his/her belief that his/her decisions are sound and his/her actions and behaviors are appropriate.  This is the basis for motivated reasoning, wherein information is reframed or selectively accepted to maintain consistency between one’s decisions and beliefs.  In other words, decisions that support the decision-maker’s beliefs or desired outcomes are rationalized, or justified, by shaping information into a narrative that reinforces the decision.
     Over time, the aberrant behavior becomes easier to mask, justify, and perpetrate on the unsuspecting.  As free-thinkers and high-performers are driven out, the organization becomes more aligned, more schizophrenic.  Thus, the vicious cycle repeats and the monster continues to feed.
 
     The preceding discussions of managerial schizophrenia and workplace cancel culture have overlapped.  As mentioned at the outset, they are intertwined; however, some clarification may be in order.  Managerial schizophrenia is an individual phenomenon; it affects individual managers independent of the behavior of others.  When it spreads to the degree of a septic infection, managerial schizophrenia becomes a powerful building block of workplace cancel culture.
     Workplace cancel culture is a more generalized term.  It implies the existence of managerial schizophrenia as well as other aberrant behaviors.  These may include accepting an amount of credit or blame that is incommensurate with one’s efforts or results – that is, taking more credit than deserved for a favorable outcome, or blaming others for underperformance.
     Accusations of inappropriate behavior are common tools of cancellers.  Even unsubstantiated claims of impropriety can irreparably harm a person’s reputation; it is one of the most treacherous tactics that can be employed.  A lack of repercussions for false claims is a clear sign that a dangerous, toxic culture has metastasized in an organization.
     In summary, any tactic or scheme that a devious mind could fabricate to suppress or silence opposing viewpoints, curry favor with superiors, damage the credibility of potential challengers, or in any way elevate oneself at the expense of others or the organization that is tolerated by the organization’s leadership falls under the umbrella of workplace cancel culture.
 
What to Do About Workplace Cancel Culture
     Any discussion about how to counter managerial schizophrenia and workplace cancel culture should center on three key points:
  1. For workplace cancel culture to be eliminated, managerial schizophrenia must be eradicated.
  2. It is not unethical to pursue advancement, but how one pursues it is of great import.
  3. It is much easier to develop a culture than it is to change one.
Point (1) encapsulates much of the previous discussion and, therefore, requires no additional comment here.  Point (2) should be obvious, though it bears repeating.
     Point (3) is well-known, but frequently ignored by leaders to the great detriment of their organizations and the individuals within.  A desired culture should be actively developed; otherwise, neglect will allow one, usually far less desirable, to develop on its own.  An organization’s strongest personalities will define the culture; when coupled with weak characters, the negative elements are bound to prevail.
     Preventing workplace cancel culture from taking root requires the active participation of all levels of management.  The CEO must set standards and expectations that filter down through the organization; intensive follow-up is required to ensure that standards of conduct are upheld with consistency.
     Credit must be accurately apportioned and performance appropriately rewarded.  Whether on a project-by-project basis, in an annual review, or other assessment scheme, validation mechanisms should be in place to ensure that an employee’s reported performance reflects a fair assessment.  For example, input should be sought from those with whom the person interacts regularly to get a complete picture of the person’s contribution; a person’s “direct supervisor” is often least aware of the person’s true performance or value to the organization.  A well-executed evaluation process minimizes “coattailing” and self-serving cancellations, while helping to retain high-potential employees such as creative thinkers and innovators.
     Leaders should require evidence of wrongdoing or inappropriate behavior, such as harassment, before taking action against an accused employee.  The motivations of the accuser should also be explored to eliminate personal gains through false claims.  Likewise, cases of “stolen merit” – taking undeserved credit to advantage oneself over a colleague – must be dealt with firmly to maintain trust in the organization and its leadership.  To limit these cases, leaders should vehemently, repeatedly, debunk the zero-sum fallacy – the belief that the success of another diminishes one’s own position or value.
     Finally, the choice of supervisors, managers, and leaders determines an organization’s ability to sustain a healthy culture.  Every member of an organization has a responsibility to perform his/her duties ethically, to act with integrity.  The burden increases, however, as one rises in the organization, gaining authority and responsibility; most importantly, responsibility for the fair treatment of subordinates or “direct reports.”
     As discussed in Sustainability Begins with Succession Planning, a number of questions should be answered to determine a person’s suitability to a position of authority, such as “Does this person set a desirable example of attitude, teamwork, etc.?”  A candidate’s technical ability is usually much easier to assess than his/her character.  For example, does the candidate exhibit a social dominance orientation or does s/he embrace servant leadership?  If the answer to a single question can predict an individual’s performance in a leadership role and, by extension, the success of the entire organization, it is probably this one.
 
     Managerial schizophrenia and workplace cancel culture are manifestations of a principal-agent problem.  The interests of the individual (agent) are in conflict with those of the organization (principal); the agent is expected to act in the best interest of the principal, but is incentivized to act in his/her own best interest instead.  Some will dismiss it as “human nature” or “survival instinct,” but justifications make the behaviors no less destructive of organizations or individuals.
     An organization that allows a toxic culture to persist is like a tree that rots from the center.  Despite the decay in its core, the tree’s bark looks surprisingly healthy, even as it falls through the roof of your house.
 
     For additional guidance or assistance with Operations challenges, feel free to leave a comment, contact JayWink Solutions, or schedule an appointment.
 
References
[Link] “On Protoplasmic Inertia.”  Gary S. Vasilash.  Automotive Design and Production, July 2002.
[Link] “When and why leaders put themselves first:  Leader behavior in resource allocations as a function of feeling entitled.”  David de Cremer and Eric van Dijk.  European Journal of Social Psychology, January 2005.
[Link] “Being Ethical is Profitable.”  Alankar Karpe.  ProjectManagement.com, August 3, 2015.
[Link] “How Workplace Fairness Affects Employee Commitment.”  Matthias Seifert, Joel Brockner, Emily C. Bianchi, and Henry Moon.  MIT Sloan Management Review, November 5, 2015.
[Link] “The Unwitting Accomplice:  How Organizations Enable Motivated Reasoning and Self-Serving Behavior.”  Laura J. Noval and Morela Hernandez.  Journal of Business Ethics, September 21, 2017.
[Link] “Putting an End to Leaders’ Self-Serving Behavior.”  Morela Hernandez.  MIT Sloan Management Review, October 18, 2017.
[Link] “Why People Believe in Their Leaders – or Not.”  Daniel Han Ming Chng, Tae-Yeol Kim, Brad Gilbreath, and Lynne Andersson.  MIT Sloan Management Review, August 17, 2018.
[Link] “New Ways to Gauge Talent and Potential.”  Josh Bersin and Tomas Chamorro-Premuzic.  MIT Sloan Management Review, November 16, 2018.
[Link] “Does your company suffer from broken culture syndrome?”  Douglas Ready.  MIT Sloan Management Review, January 10, 2022.

 
Jody W. Phelps, MSc, PMP®, MBA
Principal Consultant
JayWink Solutions, LLC
jody@jaywink.com
]]>
<![CDATA[Making Decisions – Vol. IX:  Decision Trees]]>Wed, 23 Feb 2022 15:30:00 GMThttp://jaywinksolutions.com/thethirddegree/making-decisions-vol-ix-decision-trees     Thus far, the “Making Decisions” series has presented tools and processes used primarily for prioritization or single selection decisions.  Decision trees, in contrast, can be used to aid strategy decisions by mapping a series of possible events and outcomes.
     Its graphical format allows a decision tree to present a substantial amount of information, while the logical progression of strategy decisions remains clear and easy to follow.  The use of probabilities and monetary values of outcomes provides for a straightforward comparison of strategies.
Constructing a Decision Tree
     In lieu of abstract descriptions and generic procedures, decision tree development will be elucidated by working through examples.  To begin, the simple example posed by Magee (1964) of the decision to host a cocktail party indoors or outdoors, given there is a chance of rain, will be explored.
     In this example, there is one decision with two options (indoors or outdoors) and one chance event with two possible outcomes (rain or no rain, a.k.a. “shine”).  This combination yields four possible results, as shown in the payoff table of Exhibit 1.  It could also be called an “outcome table,” as it only contains qualitative descriptions; use of the term “payoff table” becomes more intuitive with the introduction of monetary values of outcomes.
     While the payoff table is useful for organizing one’s thoughts pertaining to the potential results of a decision, it lacks the intuitive nature of a graphical representation.  For this, the information contained in the payoff table is transferred to a decision tree, as shown in Exhibit 2.
     Read from left to right, a decision tree presents a timeline of events.  In this case, there is the decision (indoors or outdoors), represented by a square, followed by a chance event (rain or shine), represented by a circle, and the anticipated outcome of each combination (“disaster,” “real comfort,” etc.), represented by triangles.  The square and circles are nodes and each triangle is a leaf; all are connected by branches.  A summary of decision tree elements is provided in Exhibit 3.
     The decision tree of Exhibit 2 provides an aesthetically pleasing presentation of the cocktail party host’s dilemma.  However, beyond organizing information, its descriptive nature has done little to assist the host in making the required decision.
 
     The power of the decision tree becomes evident when the financial implications of decisions are presented for analysis and comparison.  To demonstrate this, we modify the cocktail party example to be more than a purely social gathering.  Let’s say it is a fundraising event where attendees’ generosity is directly linked to their satisfaction with the event (i.e. comfort level).

     This hypothetical fundraising scenario results in the payoff table of Exhibit 4 and the decision tree of Exhibit 5 that present anticipated amounts to be raised in each situation.  Clearly, this is an important decision; the collection potential ranges from $30,000 to $100,000.  However, the host is no closer to a rational decision, left to clutch a rabbit’s foot or make the decision with a dartboard.
     What information would help the host decide?  The probability of rain, naturally!  The local meteorologist has forecast a 60% chance of rain during the fundraiser; this information is added to the decision tree, as shown in Exhibit 6.  The best decision is still not obvious, but we are getting close now!
     To compare the relative merits of each option available, we calculate expected values.  The expected value (EV) of an outcome is the product of its anticipated monetary value (payoff) and the probability of its occurrence.  For example, a rainy outdoor event has an anticipated payoff of $30,000 and a 0.60 probability of occurrence.  Therefore, EV = $30,000 x 0.60 = $18,000.  Likewise, the expected value of an outdoor event with no rain is EV = $100,000 x 0.40 = $40,000.  The EV of a chance node is the sum of its branches’ EVs; thus EVout = $18,000 + $40,000 = $58,000 for an outdoor event.  The expected value of an indoor event is calculated in the same way; EVin = $62,000.
     Expected value calculations are shown to the right of the decision tree in Exhibit 7.  The calculations are not typically displayed; they are included here for instructional purposes.
     The EV of a decision node, or its position value, is equal to that of the preferred branch – the one with the highest expected value (or the lowest expected cost, if all values are negative).  Hence, the position value of this decision is $62,000 and the event will be held indoors.  Two hash marks are placed through the “outdoors” branch to signify that it has not been selected; this is called “pruning” the branch.  The completed, pruned decision tree is shown in Exhibit 7.
     The process of calculating expected values and pruning branches is performed from right to left.  Reflecting the reversal of direction, this process is called “rolling back” or “folding back” the decision tree, or simply “rollback.”
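     For readers who would like to experiment with rollback outside of a drawn tree, a minimal sketch (Python) follows.  The outdoor payoffs and the 0.60 probability of rain are taken from the example above; the indoor payoffs are hypothetical placeholders (Exhibit 4 is not reproduced here), chosen only so that the indoor chance node rolls back to the $62,000 quoted in the text.

# Minimal rollback sketch:  chance nodes take expected values; decision nodes
# keep the branch with the highest EV and prune the rest.
def chance(branches):
    """branches:  list of (probability, value) pairs for one chance node."""
    assert abs(sum(p for p, _ in branches) - 1.0) < 1e-9   # collectively exhaustive
    return sum(p * v for p, v in branches)

def decision(options):
    """options:  dict mapping branch labels to rolled-back values."""
    best = max(options, key=options.get)
    pruned = [label for label in options if label != best]
    return best, options[best], pruned

ev_outdoors = chance([(0.60, 30_000), (0.40, 100_000)])   # $58,000 (values from the example)
ev_indoors  = chance([(0.60, 50_000), (0.40, 80_000)])    # $62,000 (placeholder payoffs)

choice, position_value, pruned = decision({"outdoors": ev_outdoors, "indoors": ev_indoors})
print(choice, position_value, "pruned:", pruned)          # indoors 62000.0 pruned: ['outdoors']

Rolling back a larger tree is simply the repeated application of these two functions, working from the leaves toward the root.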
 
     Use of a decision tree is advantageous for relatively simple situations like the example above.  Its value only increases as the decision environment becomes more complex.  Again, this is demonstrated by expanding the previous example.  In addition to the indoor/outdoor decision, consideration will also be given to a public fundraiser as well as the private, “invitation-only” event previously presented.
     While public events tend to extract smaller individual donations, increased attendance can offset this, though attendees’ generosity remains linked to their comfort.  An expanded payoff table, presented in Exhibit 8, includes fundraising estimates for a public event.  (This format was chosen because a three-dimensional table is graphically challenging; tabular representations of longer decision chains become confusing and impractical.)
     The decision tree, shown in Exhibit 9, now presents two decisions and one chance event.  The upper decision branch consists of the previous decision tree of Exhibit 7, while the lower branch presents the public event information. In this scenario, the private cocktail party is abandoned (its branch is pruned); instead, a public event will be held.  The position value of the root node – the first decision to be made – is equal to the EV of the preferred strategy.  The path along the retained branches, from root node to leaf, defines the preferred strategy.

     Expanding further the fundraising event example provides a more realistic decision-making scenario, where decisions and chance events are interspersed.  This example posits that the fundraiser is being planned for the political campaign of a candidate that has not yet received a nomination.  Pundits have assigned a 50% probability that the candidate will prevail over the primary field to receive the nomination.  Given this information, the campaign manager must decide if an event venue and associated materials should be reserved in advance.  From experience, the campaign manager believes that costs will double if event planning is postponed until after the nomination is secured.
     The decision tree for the pre-nomination reservation decision is presented in Exhibit 10.  The format has been changed slightly from previous examples in order to present some common attributes.  Larger trees, in particular, benefit from reducing the “clutter” created by the presentation of information along the branches.  One way to do this is to align information in “columns,” the headings of which allow more succinct branch labels than would otherwise be possible.  In this example, the headings provide a series of questions to which “yes” or “no” responses provide all the information required to understand the tree.  Reading the tree is easier when it is not crowded by the larger labels of “indoors,” “private,” “nominated,” etc.
     The upper right portion of Exhibit 10 contains the same tree as Exhibit 9, though it has been rearranged.  The tree in Exhibit 10 is arranged such that the topmost branches represent “yes” responses to the heading questions.  While this is not required, many find decision trees organized this way easier to read and follow the logic of the chosen strategy.
     The example of Exhibit 9 utilized a single monetary value, the total funds raised, to make decisions.  It could be assumed that this is a net value (expenses had been deducted prior to constructing the tree) or that expenses are negligible.  In this regard, Exhibit 10, again, presents a more realistic scenario where expenses are significant and worthy of explicit analysis.  The cost of hosting each type of event is presented in its own column and requires an additional calculation during rollback.  The position value of the decision node is no longer equal to the EV of the subsequent chance node as in previous examples.  To determine the position value of the indoors/outdoors decision, the cost of the preferred venue must be deducted from the EV of its branch.
     A noteworthy characteristic of this example is the atypical interdependence of the value of a branch.  Notice that the cost of the “no event” branch following a decision to reserve a venue is $16,000.  This cost was determined by first rolling back the upper portion of the tree.  The cost of the preferred branch in the “nominated” scenario is $16,000; therefore, this is the cost to be incurred, whether or not an event is held, because it follows the decision to reserve a venue.  The cost of the preferred venue is simply duplicated on the “no event” branch.  It is important not to overlook this cost; it reduces the EV of the “Nominated?” chance node by $8,000 ($16,000 cost x 0.50 probability).
     Another interesting characteristic of the campaign fundraiser decision, as seen in Exhibit 10, is that the position value of the reservation decision is the same, regardless of the decision made.  The example was not intentionally designed to achieve this result; values and probabilities were chosen arbitrarily.  It is a fortunate coincidence, however, as it provides additional realism to the scenario and allows us to explore more fully the role of this, or any, decision-making aid.
     An inexperienced decision-maker may, upon seeing equal EVs at the root decision, throw up his/her hands in frustration, assuming that the effort to construct the decision tree was wasted.  This, of course, is not true; you may have realized this already.  A theme running throughout the “Making Decisions” series is that decision-makers are aided, not supplanted, by the tools presented.  Expected value is only one possible decision criterion; further analysis is needed.
     Notice that the “no” branch of the “Reserve?” decision has been pruned in Exhibit 10.  Why? A thorough answer requires a discussion of risk attitudes, introduced in the next section.  For now, we will approach it by posing the following question:  Given the expected value of a reservation is equal to that of no reservation, is it advisable to make the reservation anyway?  Consider a few reasons why it is:
  • Demonstrated optimism in the candidate’s eventual nomination may provide a psychological edge to the campaign.
  • A secured reservation reduces the campaign staff’s workload and time pressure post-nomination, should it be received.
  • A secured reservation reduces uncertainty.  For example, the cost of securing a venue “at the last minute” may be underestimated as circumstances evolve.
  • A secured reservation may provide the candidate leverage should the nomination not be received.  The eventual nominee may be less prepared to host an event of his/her own, for example.
  • A refund, partial or full, may be negotiated, should the event be cancelled, raising the EV of the reservation.
      The points above include noneconomic, or nonmonetary, factors that are relevant to the decision.  With experience, decision-makers become more aware of noneconomic factors that influence an organization’s success and more capable of effectively incorporating them in decision strategies.
 
Advanced Topics in Decision Tree Analysis
     The discussion of decision tree analysis could be expanded far beyond the scope of this post.  Only experienced practitioners should attempt to incorporate many of the advanced elements, lest the analysis go awry.  This section serves to expose readers to opportunities for future development; detailed discussions may appear in future installments of “The Third Degree.”  Eager readers can also consult references cited below or conduct independent research on topics of interest.
 
     Risk attitudes, mentioned in the previous section, influence decisions when analysis extends beyond the decision tree.  The strategy described in the example – reserving a venue for an indoor public event (see Exhibit 10) – is a risk averse strategy; it seeks to avoid the risks associated with foregoing a reservation.  Some of these risks were mentioned in the justification for the equal-EV decision presented and include economic and noneconomic factors.
     If the campaign manager has a risk seeking attitude, s/he is likely to reserve an outdoor venue for a public event, despite the lower expected value of such a decision.  A risk seeking decision-maker will pursue the highest possible payoff, irrespective of its expected value or the risks involved.  In the example, this corresponds to the $140,000 of contributions anticipated at a public outdoor event.
     Risk neutral decision-makers accept the strategy with the highest expected value without further consideration of the risks involved.  This option does not exist in our example; further consideration was required to differentiate between two strategies with equal expected values.
 
     If a decision-maker is accustomed to using utility functions, expected values can be replaced with expected utilities (EU) for purposes of comparing strategies.  A concept related to both risk attitude and expected utility is the certainty equivalent (CE).  A decision’s CE is the “price” for which a decision-maker would transfer the opportunity represented by the decision to another party, foregoing the upside potential to avoid the downside risk.
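     As a brief illustration only, suppose a risk-averse decision-maker adopts a square-root utility function (an arbitrary assumption for this sketch, not a recommendation).  The expected utility of the outdoor branch from the earlier example, and its certainty equivalent, could then be computed as follows.

# Minimal sketch:  expected utility (EU) and certainty equivalent (CE) for the
# outdoor branch, assuming u(x) = sqrt(x) purely for illustration.
from math import sqrt

branches = [(0.60, 30_000), (0.40, 100_000)]                  # (probability, payoff)

expected_value   = sum(p * x for p, x in branches)            # $58,000
expected_utility = sum(p * sqrt(x) for p, x in branches)      # ~230.4 "utils"
certainty_equiv  = expected_utility ** 2                      # ~$53,091; less than the EV,
                                                              # reflecting risk aversion
print(round(expected_value), round(expected_utility, 1), round(certainty_equiv))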
 
     The examples presented have assumed a short time horizon; the time that elapses between decisions and/or chance events is not sufficient to influence the strategy decision.  Decisions made on longer time horizons are influenced by the time value of money (TVM).  The concept of TVM is summarized in the following way:  an amount of money received today is worth more than the same amount received on any future date.  Thus, expected values change over long time horizons; the extent of this change may be sufficient to alter the preferred strategy.
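     A minimal discounting sketch (Python), assuming a 10% annual discount rate chosen purely for illustration, shows how a payoff’s present value – and, therefore, its contribution to an expected value – shrinks as its receipt moves further into the future.

# Minimal sketch:  present value of a future payoff; the 10% rate is illustrative.
def present_value(future_amount: float, rate: float, years: float) -> float:
    return future_amount / (1 + rate) ** years

payoff = 100_000
for years in (0, 1, 3, 5):
    print(years, round(present_value(payoff, rate=0.10, years=years)))
# 0 -> 100000   1 -> 90909   3 -> 75131   5 -> 62092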
 
     There are numerous software tools available to construct decision trees; many are available online and some are free.  The output of some will closely resemble the decision trees presented here, using conventional symbols and layout.  Others differ and may be more difficult to read and analyze.  While software-generated decision trees are often aesthetically pleasing, suitable for presentation, those drawn freehand while discussing alternatives with colleagues may be the most useful.
     Use of the venerable spreadsheet is also a viable option.  Generating a graphical presentation of a decision tree in a spreadsheet can be tedious, but it offers advantages unavailable in other tools.  First and foremost, the ubiquity of spreadsheets ensures a high probability that anyone choosing to engage in decision tree analysis already has access to and familiarity with at least one spreadsheet program.  With sufficient experience, one can forego the tedium of generating graphics until a formal presentation is required.  Such a spreadsheet might look like that in Exhibit 11, which shows the campaign fundraiser example in a non-graphical format.  Each branch of the decision tree is shown between horizontal lines; pruned branches are shaded.  Expected values of chance nodes are labeled “EV” and those of decision nodes are labeled “position value” to differentiate them at a glance without use of graphic symbols.
     Calculations and branch selections/pruning can be automated in the spreadsheet.  A significant advantage of the spreadsheet is the ability to quickly perform simple sensitivity analysis.  Several scenarios can be rapidly explored by adjusting payoffs and probabilities and allowing the spreadsheet to recalculate.  More sophisticated analysis tools are also available in most spreadsheet programs; only experienced users should attempt to utilize them for this purpose, however.
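     The same kind of quick sensitivity check can be scripted outside of a spreadsheet.  The sketch below (Python) sweeps the probability of rain for the indoor/outdoor decision; the indoor payoffs are the same hypothetical placeholders used in the rollback sketch above, so the break-even point shown is illustrative only.

# Minimal sensitivity sketch:  sweep P(rain) and report the preferred venue.
outdoor = {"rain": 30_000, "shine": 100_000}   # from the example
indoor  = {"rain": 50_000, "shine": 80_000}    # hypothetical placeholders

for p_rain in (0.4, 0.5, 0.6, 0.7):
    ev_out = p_rain * outdoor["rain"] + (1 - p_rain) * outdoor["shine"]
    ev_in  = p_rain * indoor["rain"]  + (1 - p_rain) * indoor["shine"]
    best = "outdoors" if ev_out > ev_in else "indoors"
    print(f"P(rain)={p_rain:.1f}   EV(out)=${ev_out:,.0f}   EV(in)=${ev_in:,.0f}   prefer {best}")
# With these placeholders, the preference flips from outdoors to indoors at P(rain) = 0.5.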
 
Decision Tree Tips in Summary
     To conclude this introduction, the final section consists of brief reminders and tips that facilitate effective decision tree analysis.
     The example decision trees presented are symmetrical, but this need not be the case.  Likewise, decisions and chance events need not be binary; three, four, or more options or potential outcomes may be considered.  However, the rule remains:  the potential outcomes of a chance event must be mutually exclusive (only one can occur) and collectively exhaustive (probabilities sum to 1, or 100%).
     The examples presented considered positive position values.  However, EVs and position values could be negative.  For example, there may be multiple options for facility upgrades to maintain regulatory compliance.  None will increase revenue, so all EVs are negative.  In such a case, the preferred branch would be the one reflecting the lowest cost.
     All relevant costs and only relevant costs should be included in EV calculations.  Sunk costs and any others that occur regardless of the strategy selected should be excluded.  Taxes and interest charges may be relevant, as are opportunity costs and noneconomic factors.
     Construct the decision tree from left to right; roll back from right to left.  Format it to be easily expanded when new alternatives are identified or new events or decisions need to be included in the analysis.
     To compact a decision tree for presentation, branches can be hidden, allowing the position value of the origin decision to represent the hidden branches.  Also, consider combining branches when events occur together or are otherwise logically coupled.  Branches should only be combined when doing so does not eliminate a viable alternative strategy.
     Reviews of decision trees should take place as events unfold and after the final outcome has occurred.  Adjustments may not be possible to the strategy under review, but estimates of probabilities and payoffs may be improved for future decisions.
 
     Use of decision trees for organizational decision-making is consistent, analytical, and transparent.  However, decision tree analysis exhibits the same weakness as all other decision-making aids – it is dependent upon reliable estimates and sound judgment to be effective.  Review of past decisions and outcomes is critical to increasing the quality of an organization’s decisions.
 
 
     For additional guidance or assistance with decision-making or other Operations challenges, feel free to leave a comment, contact JayWink Solutions, or schedule an appointment with a “strategy arborist.”
 
     For a directory of “Making Decisions” volumes on “The Third Degree,” see Vol. I:  Introduction and Terminology.
 
References
[Link] “Decision Trees for Decision Making.”  John F. Magee.  Harvard Business Review, July, 1964.
[Link] “Decision Tree Primer.”  Craig W. Kirkwood.  Arizona State University, 2002.
[Link] “Decision Tree Analysis:  Choosing by Projecting ‘Expected Outcomes.’”  Mind Tools Content Team.
[Link] “Decision Tree: Definition and Examples.”  Stephanie Glen, September 3, 2015.
[Link] “What is a Decision Tree Diagram.”  LucidChart.
[Link] Mathematical Decision Making:  Predictive Models and Optimization.  Scott P. Stevens.  The Great Courses, 2015.

 
Jody W. Phelps, MSc, PMP®, MBA
Principal Consultant
JayWink Solutions, LLC
jody@jaywink.com
]]>
<![CDATA[Making Decisions – Vol. VIII:  The Pugh Matrix Method]]>Wed, 09 Feb 2022 16:00:00 GMThttp://jaywinksolutions.com/thethirddegree/making-decisions-vol-viii-the-pugh-matrix-method     A Pugh Matrix is a visual aid created during a decision-making process. It presents, in summary form, a comparison of alternatives with respect to critical evaluation criteria.  As is true of other decision-making tools, a Pugh Matrix will not “make the decision for you.”  It will, however, facilitate rapidly narrowing the field of alternatives and focusing attention on the most viable candidates.
     A useful way to conceptualize the Pugh Matrix Method is as an intermediate-level tool, positioned between the structured, but open Rational Model (Vol. II) and the thorough Analytic Hierarchy Process (AHP, Vol. III).  The Pugh Matrix is more conclusive than the former and less complex than the latter.
     Typical practice of “The Third Degree” is to present a tool or topic as a composite, building upon the most useful elements of various sources and variants.  The Pugh Matrix Method is a profound example of this practice; even the title is a composite!  Information about this tool can be found in the literature under various headings, including Pugh Matrix, Decision Matrix, Pugh Method, Pugh Controlled Convergence Method, and Concept Selection.
     This technique was developed in the 1980s by Stuart Pugh, Professor of Design at Strathclyde University in Glasgow, Scotland.  Although originally conceived as a design concept selection technique, the tool is applicable to a much broader range of decision contexts.  It is, therefore, presented here in more-generic terms than in Pugh’s original construction.
 
The Pugh Matrix Method
     A schematic diagram of the Pugh Matrix is presented in Exhibit 1.  To facilitate learning to use the matrix, column and row numbers are referenced [i.e. col(s). yy, row(s) xx] when specific areas are introduced or discussed.  Column and row numbers may differ in practice, as the matrix is expanded or modified to suit a specific decision.  Instructions and guidelines for completing the Pugh Matrix, based on the schematic of Exhibit 1, are given below.
     The first step of any decision-making process is to clearly define the decision to be made.  A description of the situation faced – problem, opportunity, or objective – clarifies the purpose of the exercise for all involved.  This description should include as much detailed information as is necessary for participants to effectively evaluate alternatives.  A concise statement that encapsulates this information should also be crafted to be used as a title for summary documentation.  Enter this title in the Decision box above the matrix.
     Below the matrix, in the Notes boxes, collect information as the matrix is constructed that is needed to “follow” the analysis and understand the decision rationale.  Space is limited to encourage the use of shorthand notes only; some typical contents are suggested in the schematic (Exhibit 1).  Detailed discussions and explanations, including the long-form decision description, should be documented separately.  The Pugh Matrix is typically a component of an analysis report where these details are also recorded.  Detailed information on the alternatives evaluated should also be included.
 
     A well-defined decision informs the development of a list of relevant evaluation criteria or specifications.  Each criterion or specification should be framed such that a higher rating is preferred to a lower one.  Ambiguous phrasing can lead to misguided conclusions.  For example, evaluators may interpret the criterion “cost” differently.  Some may rate higher cost with a higher score, though the reverse is intended.  A criterion of “low cost” or “cost effectiveness,” while referring to the same attribute and data, may communicate the intent more clearly.  Compile the list of criteria in the Evaluation section of the matrix (col. 1, rows 4 - 8).
     Next, the criteria are weighted, or prioritized, in the Criteria Weight column (col. 2, rows 4 - 8).  A variety of weighting scales can be used; thus it is critical that the scale in use be clearly defined prior to beginning the process.  A 1 – 10 scale offers simplicity and familiarity and is, therefore, an attractive option.  A 1 – 10 scale exemplar is provided in Exhibit 2.  The universal scale factor descriptions may be employed as presented or modified to suit a particular decision environment or organization.  Specify the scale in use in the Criteria Weight box (col. 2, row 2).
     With the relevant criteria in mind, it is time to brainstorm alternatives.  Before entering the alternatives in the matrix, the list should be screened as discussed in Project Selection – Process, Criteria, and Other Factors.  The matrix is simplified when alternatives that can quickly be determined to be infeasible are eliminated from consideration.  Enter the reduced list in the Alternatives section of the matrix (cols. 3 - 6, row 3).  The first alternative listed (col. 3, row 3) is the baseline or “datum”, typically chosen to satisfy one of the following conditions:
  • The incumbent (i.e. current, existing) product, process, etc.
  • A competitive product, technology, etc.
  • The most fully-defined or most familiar alternative, if no incumbent or other “obvious” choice exists.
  • Random choice, if no incumbent exists and alternatives cannot otherwise be sufficiently differentiated prior to analysis.
     Above each alternative identification, an additional cell is available for a “shortcut” reference.  This shortcut reference can take many forms, including:
  • A sketch representing a simple design concept (e.g. sharp corner, round, chamfer).
  • A single word or short phrase identifying the technology used (e.g. laser, waterjet, EDM, plasma).
  • A symbol representing a key element of the alternative (e.g. required PPE or recycling symbols for each type of material waste generated).
  • Any other reference that is relevant, simple, and facilitates analysis.
     If appropriate shortcut references are available, add them to the matrix (cols. 3 - 6, row 2) to accompany the formal alternative identifications.  Omit them if they are not helpful reminders of the alternatives’ details, if they are ambiguous, misleading, or distracting, or if they otherwise fail to add value to the matrix.  Shortcut references are not required; add them only if they facilitate the evaluation process.
 
     The next step in the process is to select an evaluation scale; 3-, 5-, and 7-point scales are common, though others could be used.  The scale is centered at zero, with an equal number of possible scores above and below.  The 3-point scale will have one possible score above (+1) and one possible score below (-1) zero.  The 5-point scale will have two possible scores on each side of zero (+2, +1, 0, -1, -2) and so on.  Larger scales can also be constructed; however, the smallest scale that provides sufficient discrimination between alternatives is recommended.  A positive score indicates that an alternative is preferred to the baseline; a negative score indicates that it is less preferred, or disfavored, with respect to that criterion.  Larger numbers (when available in the scale selected) indicate the magnitude of the preference.  Specify the scale in use in the Evaluation Scale box (col. 2, row 2).
     Complete the Evaluation section of the matrix by rating each alternative.  Conduct pairwise comparisons of each alternative and the baseline for each criterion listed, entering each score in the corresponding cell (cols. 4 - 6, rows 4 - 8).  By definition, the baseline alternative scores a zero for each criterion (col. 3, rows 4 - 8).
 
     Once all of the alternatives have been scored on all criteria, the Summary section of the matrix can be completed.  The expanded Summary section presented in Exhibit 1 allows detailed analysis and comparisons of performance while remaining easy to complete, requiring only simple arithmetic.
     To populate the upper subsection of the Summary, tally the number of criteria for which each alternative is equivalent to (score = 0), preferred to (score > 0), and less preferred than (score < 0) the baseline.  Enter these numbers in the # Criteria = 0 (cols. 4 - 6, row 9), # Criteria > 0 (cols. 4 - 6, row 10), and # Criteria < 0 (cols. 4 - 6, row 11) cells, respectively.  The baseline’s tallies will equal the number of criteria, zero, and zero, respectively.
     The lower subsection of the Summary displays the weighted positive (preferred) and negative (disfavored) scores for each alternative.  For each criterion for which an alternative is preferred, multiply its criteria weight by the evaluation score; sum the products in the Weighted > 0 cell for each alternative (cols. 4 - 6, row 12).  Likewise, sum the products of the weights and scores of all criteria for which an alternative is disfavored and enter the negative sum in the Weighted < 0 cell for that alternative (cols. 4 - 6, row 13).  Again, the baseline receives scores of zero (col. 3, rows 12 - 13).
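     The Summary arithmetic is easy to automate in a spreadsheet or a few lines of code.  The following sketch is a minimal Python example of the tallies and weighted sums described above; the weights and scores are hypothetical values used only for illustration.

# Minimal sketch of the Summary arithmetic for a single alternative.
# 'weights' and 'scores' are hypothetical; scores use a 5-point scale (-2 to +2).
weights = [9, 7, 4, 6, 3]          # Criteria Weights (1 - 10 scale)
scores  = [+1, -2, 0, +2, -1]      # evaluation scores vs. the baseline

tally_equal    = sum(1 for s in scores if s == 0)   # "# Criteria = 0"
tally_positive = sum(1 for s in scores if s > 0)    # "# Criteria > 0"
tally_negative = sum(1 for s in scores if s < 0)    # "# Criteria < 0"

weighted_pos = sum(w * s for w, s in zip(weights, scores) if s > 0)   # "Weighted > 0"
weighted_neg = sum(w * s for w, s in zip(weights, scores) if s < 0)   # "Weighted < 0"

print(tally_equal, tally_positive, tally_negative)   # 1 2 2
print(weighted_pos, weighted_neg)                    # 21 -17

The baseline column requires no calculation; by definition, its weighted scores are zero and its only nonzero tally is “# Criteria = 0,” which equals the number of criteria.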
 
     The final portion of the matrix to be populated is the Rank section, which includes the Total Score (cols. 3 - 6, row 14) and Gross Rank (cols. 3 - 6, row 15) of each alternative.  The Total Score is calculated by summing the products of weights and scores for all criteria or simply summing the positive and negative weighted scores.  Again, the definition of baseline scoring requires it to receive a zero score.  Other alternatives may earn positive, negative, or zero scores.  A positive Total Score implies the alternative is “better than,” or preferred to, the baseline, while a negative score implies it is “worse than” the baseline, or disfavored.  A Total Score of zero implies that the alternative is equivalent to the baseline, or exhibits similar overall performance.  Any two alternatives with equal scores are considered equivalent and decision-makers should be indifferent to the two options until a differentiating factor is identified.
     Finally, each alternative is assigned a Gross Rank.  The alternative with the highest Total Score is assigned a “1,” the next highest, a “2,” and so on.  Alternatives with equal Total Scores are assigned equal rank to signify indifference.  For each tie, the corresponding number of subsequent rank numbers is skipped, so the lowest rank still equals the number of alternatives.  The exception occurs when the tie is at the lowest score, in which case the lowest rank equals the highest rank available to the equivalent alternatives.  The following examples illustrate the method of ranking with equal scores:
  • Of six alternatives, three are scored equally at second rank.  The Gross Rank of these six alternatives is, therefore, [1–2–2–2–5–6].
  • Of six alternatives, the two lowest-scoring alternatives are equivalent.  The Gross Rank of these six alternatives is, therefore, [1–2–3–4–5–5].
The hierarchy of alternative preferences is called Gross Rank because further analysis may lead decision-makers to modify the rank order of alternatives or “break ties.”  This will be discussed in more detail later.
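     The ranking rule described above is standard competition ranking:  tied Total Scores share the highest rank available to them, and the corresponding subsequent ranks are skipped.  A minimal Python sketch follows; the score lists are hypothetical, chosen only to reproduce the two examples above.

def gross_rank(total_scores):
    # Standard competition ranking:  the highest Total Score receives rank 1;
    # tied scores share a rank, and the corresponding number of subsequent ranks is skipped.
    ordered = sorted(total_scores, reverse=True)
    return [ordered.index(score) + 1 for score in total_scores]

print(gross_rank([12, 7, 7, 7, 3, -2]))    # [1, 2, 2, 2, 5, 6] - three alternatives tied at second rank
print(gross_rank([12, 9, 6, 3, -2, -2]))   # [1, 2, 3, 4, 5, 5] - tie at the lowest score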
 
     Review the Summary section of the matrix to validate the ranking of alternatives and to “break ties.”  The number of preferred and disfavored characteristics can influence the priority given to equally-scored alternatives.  Also, the magnitude of the weighted positive and negative scores may sufficiently differentiate two alternatives with equal Total Scores and, therefore, equal Gross Rank.  Alternatives with weighted scores of [+5, -3] and [+3, -1] each receive a Total Score of +2.  However, further consideration may lead decision-makers to conclude that the additional benefits represented by the higher positive score of the first alternative do not justify accepting the greater detriment of its higher negative score.  Thus, the second alternative would be ranked higher than the first in the final decision, despite its lower positive score.  [See “Step 5” of The Rational Model (Vol. II) for a related discussion.]
     It is also possible for all alternatives considered to rank below the baseline (i.e. negative Total Score).  That is, the baseline achieves a rank of “1.”  If the baseline alternative in this scenario is the status quo, the “do nothing” option is prescribed.  If the “do nothing” option is rejected, a new matrix is needed; this is discussed further in “Iterative Construction,” below.
 
     Conducting sensitivity analysis can increase confidence in the rankings or reveal the need for adjustments.  This can be done in similar fashion to that described for AHP (Vol. III).  Common alterations include equal criteria weights (i.e. = 1, or no weighting) and cost-focus (weighting cost significantly more than other criteria).  If there was difficulty reaching consensus on criteria weights, the influence of the contentious criteria can be evaluated by recalculating scores for the range of weights initially proposed.  Likewise, the impact of zero-score evaluations, resulting from disagreement about an alternative’s merits relative to the baseline, can be similarly explored.  No definitive instruction can be provided to assess the results of sensitivity analysis; each decision, matrix, and environment is unique, requiring decision-makers to apply their judgment and insight to reach appropriate conclusions.
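     The recalculations required for sensitivity analysis are equally simple to automate.  The sketch below, which reuses the hypothetical weights and scores introduced earlier, recomputes Total Scores under the original weighting, an unweighted variant, and a cost-focused variant; all figures are assumptions for illustration.

# Sensitivity check:  recompute Total Scores under alternative criteria weights.
# All weights and scores are hypothetical; criterion 2 is assumed to be cost.
scores = {
    "Alternative A": [+1, -2, 0, +2, -1],
    "Alternative B": [+2, -1, +1, 0, -2],
}
weight_sets = {
    "original":     [9, 7, 4, 6, 3],
    "unweighted":   [1, 1, 1, 1, 1],
    "cost-focused": [9, 20, 4, 6, 3],
}

for name, weights in weight_sets.items():
    totals = {alt: sum(w * s for w, s in zip(weights, evals))
              for alt, evals in scores.items()}
    print(name, totals)

If the rank order changes materially from one weight set to another, the contested weights or scores deserve further discussion before the decision is finalized.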
 
Iterative Construction
     A completed Pugh Matrix provides decision-makers with an initial assessment of alternatives.  However, the matrix may be inconclusive, incomplete, or otherwise unsatisfactory.  As mentioned above, some decisions may remain to be made within the matrix itself, such as breaking ties.  Minor adjustments may be communicated easily, without further computation, while others may warrant constructing a new matrix with the modifications incorporated.  Thus, a thorough analysis may require an iterative process of matrix development.
     There are many changes that can be made for subsequent iterations of matrix construction.  Everything learned from previous iterations should be incorporated in the next, to the extent possible without introducing bias to the analysis.  A number of possible modifications and potential justifications are presented below.
  • Refine the decision description.  If the evaluation process revealed any ambiguity in the definition of the decision to be made, clarify or restate it.  The purpose of the matrix must be clear for it to be effective; a “narrowed” view may be needed to achieve this.
  • Refine criteria definitions.  If the framing of any evaluation criterion has caused difficulty, such as varying interpretation by multiple evaluators, adjust its definition.  Consistent interpretation is required for meaningful evaluation of alternatives.
  • Add or remove criteria.  Discussion and evaluation of alternatives may reveal relevant criteria that had been previously overlooked; add them to the list.  Criteria that have failed to demonstrate discriminatory power can be removed.  This may occur when one criterion is strongly correlated with another; the alternatives may also be legitimately indistinguishable in a particular dimension.  Criteria should not be eliminated hastily, however, as results of another iteration may lead to a different conclusion.  Therefore, a criterion should be retained for at least two iterations and removed only if the results continue to support doing so.  New alternatives should also be evaluated per any eliminated criterion to validate its continued exclusion from the matrix.
  • Modify the Criteria Weighting Scale.  Though included for completeness, modifying the criteria weighting scale is, arguably, the least helpful adjustment that can be made.  It is difficult to conceive of one more advantageous than the “standard” 1 – 10 scale; it is recommended as the default.
  • Review Criteria Weights.  If different conclusions are reached at different steps of the analysis and review, criteria weighting may require adjustment.  That is, if Evaluation Tallies, Total Scores, and sensitivity analysis indicate significantly different preferences, the Criteria Weights assigned may not accurately reflect the true priorities.  Criteria Weights should be revised only with extreme caution, however; bias could easily be introduced, supporting a predetermined, but suboptimal, conclusion.
  • Add or remove alternatives.  Discussion and evaluation may lead to the discovery of additional alternatives or reveal opportunities to combine favorable characteristics of existing alternatives to create hybrid solutions.  Add any new alternatives developed to the matrix for evaluation.  Alternatives that are comprehensively dominated in multiple iterations are candidates for elimination.
  • Select a different baseline.  Comparing alternatives to a different baseline may interrupt biased evaluations, improving the accuracy of assessments and rankings.  Varied perspectives can also clarify advantages of alternatives, facilitating final ranking decisions.
  • Modify “shortcut” references.  If shortcut references do not provide clarity that facilitates evaluation of alternatives, modify or remove them.  It is better for them to be absent than to be confusing.
  • Refine the Evaluation Scale.  Implementing an evaluation scale with a wider range (replacing a 3-point scale with a 5- or 7-point scale, for example) improves the ability to differentiate the performance of alternatives.  Increased discriminatory power allows the capture of greater nuance in the evaluations, reducing the number of equivalent or indifferent ratings and creating a discernible hierarchy of alternatives.
     Additional research may also be needed between iterations.  Decision-making often relies heavily on estimates of costs, benefits, and performance as assessed by a number of metrics.  Improving these estimates may be necessary for meaningful comparison of alternatives and a conclusive analysis.
 
Pugh Matrix Example
     To demonstrate practical application of the Pugh Matrix Method, we revisit the hypothetical machine purchase decision example of Vol. III.  The AHP example presented was predicated on a prior decision (unstated, assumed) to purchase a new machine; only which machine to purchase was left to decide.  The detailed decision definition, or objective, was “Choose source for purchase of new production machine for Widget Line #2.”  For the Pugh Matrix example, the premise is modified slightly; no prior decision is assumed.  The decisions to purchase and which machine to purchase are combined in a single process (this could also be done in AHP).  To so formulate the decision, it is defined as “Determine future configuration of Widget Line #2.”
     Evaluation criteria are the same as in the AHP example, weighted on a 1 – 10 scale.  Criteria weights are chosen to be comparable to the previous example to the extent possible.  The evaluation criteria and corresponding weights are presented in Exhibit 3.
     The alternatives considered are also those from the AHP example, with one addition:  maintaining the existing widget machine.  The cost and performance expectations of each alternative are presented in Exhibit 4.  Maintaining the existing equipment is the logical choice for baseline; its performance and other characteristics are most familiar.  Consequently, estimates of its future performance are also likely to be most accurate.
     “Shortcut” reference images are inserted above the alternative (source company) names.  The performance summary is sufficiently brief that it could have been used instead to keep key details in front of evaluators at all times.  For example, the shortcut reference for the baseline could be [$0.8M_50pcs/hr_6yrs].
     To evaluate alternatives, an “intermediate” scale is chosen.  Use of a 5-point scale is demonstrated to provide balance between discrimination and simplicity.  The scoring regime in use is presented in Exhibit 5.
     Each alternative is evaluated on each criterion and the scores entered in the Evaluation section of the example matrix, presented in Exhibit 6.  Alternative evaluations mirror the assessments in the AHP example of Vol. III to the extent possible.  There should be no surprises in the evaluations; each alternative is assessed a negative score on Cost and positive scores on Productivity and Service Life.  This outcome was easily foreseeable from the performance summary of Exhibit 4.  After populating the Evaluation section, the remainder of the matrix is completed with simple arithmetic, as previously described.
     A cursory review of the Summary section reveals interesting details that support the use of this formulation of the Pugh Matrix Method to make this type of decision.  First, the three new machine alternatives are equal on each of the score tallies.  Without weighting, there would be nothing to differentiate them.  Second, the Jones and Wiley’s machines have equal negative weighted scores.  This could be of concern to decision-makers, particularly if no clear hierarchy of preferences is demonstrated in the Gross Rank.  Were this to be the case, repeating the evaluations with a refined scale (i.e. 7-point) may be in order.
     Finally, the Pugh Matrix Method demonstrated the same hierarchy of preferences (Gross Rank) as did AHP, but reached this conclusion with a much simpler process.  This is by no means guaranteed, however; the example is purposefully simplistic.  As the complexity of the decision environment increases, the additional sophistication of AHP, or other tools, becomes increasingly advantageous.
     Sensitivity analysis reveals that the “judgment call” made to score Wiley’s Productivity resulted in rankings that match the result of the AHP example.  Had it been scored +2 instead of +1, the ranks of Wiley’s and Acme would have reversed.  Again, a refined evaluation scale may be warranted to increase confidence in the final decision.
 
Variations on the Theme
     As mentioned in the introduction, the Pugh Matrix Method is presented as a composite of various constructions; not every element of the structure presented here will be found in any other single source.  The literature also describes additional elements that could be incorporated in a matrix.  Several possible variations are discussed below.
     In its simplest form, the Summary section of the Pugh Matrix contains only the Evaluation Tallies, as weights are not assigned to criteria.  Evaluation of alternatives is conducted on a +/s/- 3-point “scale,” where an alternative rated “+” is preferred to baseline, “-” is disfavored relative to baseline, and “s” is the same as, or equivalent to, baseline in that dimension.  Refining the evaluation scale consists of expanding it to ++/+/s/-/-- or even +++/++/+/s/-/--/---.  Larger scales inhibit clarity for at-a-glance reviews.  Until the number of criteria and/or alternatives becomes relatively large, the value of this “basic” matrix is quite limited.  A basic Pugh Matrix for the example widget-machine purchasing decision is presented in Exhibit 7.  As mentioned in the example, a coarse evaluation scale and the absence of criteria weighting result in three identically scored alternatives.  The matrix has done little to clarify the decision; it reinforces the decision to buy a new machine, but has not helped determine which machine to purchase.
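     For the basic matrix, the Evaluation Tallies are simply counts of the symbols assigned.  A brief sketch, with hypothetical ratings, follows.

# Tally of +/s/- ratings for one alternative in a basic (unweighted) matrix.
ratings = ["+", "-", "s", "+", "-"]                           # hypothetical evaluations vs. baseline
tally = {symbol: ratings.count(symbol) for symbol in "+s-"}
print(tally)                                                  # {'+': 2, 's': 1, '-': 2}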
     Features not present in the basic matrix were chosen for inclusion in the recommended construction.  Some features included (shown in “Pugh Matrix Example,” above), and their benefits, are:
  • Expanded Summary section
    • Greater insight into relative performance of alternatives.
  • 5-Point Numerical Evaluation Scale
    • At-a-glance clarity.
    • Compatibility with spreadsheets, simplifying calculations.
    • Balance between discrimination and simplicity.
  • 1 – 10 Criteria Weight Scale
    • Familiarity and simplicity of scale.
    • Greater discriminatory power of matrix.
 
     The Summary section in our matrix could be called the “Alternative Summary” to differentiate it from a “Criteria Summary.”  The purpose of a Criteria Summary, ostensibly, is to evaluate the extent to which each requirement is being satisfied by the available alternatives.  Our example analysis, with Criteria Summary (cols. 9 – 14, rows 4 – 13), is shown in the expanded matrix presented in Exhibit 8.  It is excluded from the recommended construction because of its potential to be more of a distraction than a value-added element of the matrix.  While the Evaluation Tallies may provide an indication of the quality of the alternatives offered, it is unclear how to use the weighted scores or Total Scores to any advantage (i.e. is a score of 40 outstanding, mediocre, or in between?).  If decision-makers do perceive value in a Criteria Summary, it is a simple addition to the matrix; Evaluation Tallies, weighted scores, and Total Scores are analogous to those calculated in the Alternatives Summary.
     The use of primary and secondary criteria is also optional.  Refer to “Project Selection Criteria” in Project Selection – Process, Criteria, and Other Factors for a related discussion.  In the Project Selection discussion, primary criteria were called “categories of criteria,” while secondary criteria were shortened to, simply, “criteria.”  Though the terminology used is slightly different, either set of terms is acceptable, as the concept and advantages of use are essentially identical.  Organization, summary, and presentation of information can be facilitated by their use.  For example, reporting scores for a few primary criteria may be more appropriate in an Executive Summary than several secondary criteria scores.  However, this method is only advantageous when the number of criteria is large.  Reviewers should also beware the potential abuse of this technique; important details could be masked – intentionally or unintentionally hidden – by an amalgamated score.
 
     An alternate criteria-weighting scheme prescribes that the sum of all criteria weights equal unity.  This is practical only for a small number of criteria; larger numbers of criteria require weights with additional significant digits (i.e. decimal places).  The relative weights of numerous criteria quickly become too difficult to monitor for the scores to remain meaningful.  The 1 – 10 scale is easy to understand and apply consistently.
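     Because dividing every weight by the same positive constant scales every Total Score by that constant, normalizing a 1 – 10 weight set to sum to unity changes the magnitudes of the scores but not the rank order of alternatives.  A brief sketch of the conversion, using hypothetical weights, follows.

weights = [9, 7, 4, 6, 3]                       # 1 - 10 scale (hypothetical)
unit_weights = [w / sum(weights) for w in weights]
print([round(w, 3) for w in unit_weights])      # [0.31, 0.241, 0.138, 0.207, 0.103]
print(round(sum(unit_weights), 6))              # 1.0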
     Nonlinear criteria weighting can significantly increase the discriminatory power of the matrix; however, it comes at a high cost.  Development of scoring curves can be difficult and the resulting calculations are far more complex.  A key advantage of the Pugh Matrix Method – namely, simplicity – is lost when nonlinear scoring is introduced.
 
     The final optional element to be discussed is the presentation of multiple iterations of the Pugh Matrix Method in a single matrix.  An example, displaying three iterations, is presented in Exhibit 9.  Features of note include:
  • Results of multiple iterations are shown side by side by side for each alternative.
  • Choice of baseline (“datum”) for each iteration is clearly identified (dark-shaded columns).
  • Eliminated alternatives are easily identified (light-shaded columns).
  • Eliminated criteria are easily identified (light-shaded rows).
  • Graphical presentation of summary scores is useful for at-a-glance reviews.
This presentation format is useful for a basic matrix (i.e. no criteria weighting, 3-point evaluation scale).  As features are added to the matrix, however, it becomes more difficult and less practical – at some point, infeasible – to present analysis results in this format.
Final Notes
     Confidence in any tool is developed with time and experience.  The Pugh Matrix Method is less sophisticated than other tools, such as AHP (Vol. III), and, thus, may require a bit more diligence.  For example, the Pugh Matrix lacks the consistency check of AHP.  Therefore, it could be more susceptible to error, misuse, or bias; the offset is its simplicity.  A conscientious decision-making team can easily overcome the matrix’s deficiency and extract value from its use.
     The Pugh Matrix is merely a decision-making aid and, like any other, it is limited in power.  The outcome of any decision is not necessarily an accurate reflection of the decision-making aid used.  It cannot overcome poor criteria choices, inaccurate estimates, inadequate alternatives, or a deficiency of expertise among evaluators.  “Garbage in, garbage out” remains true in this context.
     It is important to remember that “the matrix does not make the decision;” it merely guides decision-makers.  Ultimately, it is the responsibility of those decision-makers to choose appropriate tools, input accurate information, apply relevant expertise and sound judgment, and validate and “own” any decision made.
 
     For additional guidance or assistance with decision-making or other Operations challenges, feel free to leave a comment, contact JayWink Solutions, or schedule an appointment.
 
     For a directory of “Making Decisions” volumes on “The Third Degree,” see Vol. I:  Introduction and Terminology.
 
References
[Link] “How To Use The Pugh Matrix.”  Decision Making Confidence.
[Link] “What is a Decision Matrix?”  ASQ.
[Link] “Pugh Matrix.”  CIToolkit.
[Link] “The Systems Engineering Tool Box – Pugh Matrix (PM).”  Stuart Burge, 2009.
[Link] “The Pugh Controlled Convergence method: model-based evaluation and implications for design theory.”  Daniel Frey, et al;  Research in Engineering Design, 2009.
[Link] “Decide and Conquer.”  Bill D. Bailey and Jan Lee; Quality Progress, April 2016.
[Link] “Enhanced Concept Selection for Students.”  J. Farris and H. Jack; 2011 ASEE Annual Conference & Exposition, Vancouver, BC, June 2011.  10.18260/1-2--17895
[Link] “Modifying Pugh’s Design Concept Evaluation Methods.”  Shun Takai and Kosuke Ishii; Proceedings of the ASME Design Engineering Technical Conference, 2004.  10.1115/DETC2004-57512
[Link] “Concept Selection Methods – A Literature Review from 1980 to 2008.”  Gül Kremer and Shafin Tauhid; International Journal of Design Engineering, 2008.  10.1504/IJDE.2008.023764
[Link] “Concept Selection.”  Design Institute, Xerox Corporation, September 1, 1987
[Link] The Lean 3P Advantage.  Allan R. Coletta; CRC Press, 2012.

 
Jody W. Phelps, MSc, PMP®, MBA
Principal Consultant
JayWink Solutions, LLC
jody@jaywink.com
]]>
<![CDATA[Project Selection – Process, Criteria, and Other Factors]]>Wed, 26 Jan 2022 16:00:00 GMThttp://jaywinksolutions.com/thethirddegree/project-selection-process-criteria-and-other-factors     Committing resources to project execution is a critical responsibility for any organization or individual.  Executing poor-performing projects can be disastrous for sponsors and organizations; financial distress, reputational damage, and sinking morale, among other issues, can result.  Likewise, rejecting promising projects can limit an organization’s success by any conceivable measure.
     The risks inherent in project selection compel sponsors and managers to follow an objective and methodical process to make decisions.  Doing so leads to project selection decisions that are consistent, comparable, and effective.  Review and evaluation of these decisions and their outcomes also become straightforward.
     Project selection is a context-specific example of decision-making.  However, its complexity, ubiquity, and criticality warrant a focused treatment separate from the Making Decisions series.
     An effective selection process includes project justification.  Adjacent to this process are the prioritization of selected projects and termination decisions regarding ongoing projects, though these topics are often considered in isolation, if at all.  Selection, prioritization, execution, and termination are elements of project portfolio management; successful organizations conduct these activities in concert.
 
Project Selection Process
     An effective project selection process typically proceeds in three phases:  Initiation, Configuration, and Analysis.  Each phase and its constituent steps are described in this section.

Initiation
     The initiation phase of project selection is exploratory in nature, where members of the organization consider possible projects.  (It should not be confused with initiating in project management, where project execution is commenced.)  The initiation phase of project selection consists of two steps:  Generate Ideas, and Screen.

Generate Ideas.  The organization brainstorms potential projects, compiling a list for consideration, sometimes called a “project register.”  Project ideas can originate from a multitude of sources; most will germinate within the organization, but suggestions from external partners should also be considered.  Potential sources of project ideas include:
  • Product Engineering
  • Operations Management
  • Quality Management
  • Facility Management
  • Production personnel
  • Sales & Marketing
  • Customers
  • Suppliers
  • Community members
     As suggested by the diversity of sources of project ideas, the objectives of potential projects can vary widely.  General categories of potential project objectives include:
  • technological innovation
  • new product introduction
  • product enhancement (e.g. new feature)
  • increased productivity
  • cost reduction
  • quality improvement
  • sustainability, reduced environmental impact
  • increased customer satisfaction
  • warranty claim reduction
  • Good Will, “positive press,” or community standing
  • legal or regulatory compliance
Specific objectives that fall under these generic headings are too numerous to contemplate.  The list is provided only to spur ideation and is not comprehensive.

Screen.  Once a list of potential projects has been generated, a preliminary evaluation, or screening, can be conducted.  Another term used for this step is feasibility study or, in the vernacular, “sanity check.”  The purpose of screening is to eliminate from consideration those projects that “obviously” cannot be pursued at the present time.  This assessment is based, in large part, on preliminary estimates of resource requirements or expected results.  Screening criteria may include:
  • budget (i.e. funding required)
  • staffing (number or expertise) required
  • schedule (i.e. time required)
  • pending obsolescence of product, process, facility, or equipment
  • strategy alignment
  • regulatory requirements
     Project ideas that fail to pass the screening stage may remain on the project register, or backlog, to be revisited when conditions are more favorable.  Those that are deemed too fanciful, blatantly self-serving, or politically motivated may be stricken from the register, both to conserve the resources that future reviews would consume and to keep the project register realistic and professional.
 
Configuration
     The configuration phase of project selection is where the organization defines how projects will be evaluated, compared, and selected or rejected.  It consists of three steps:  Choose Criteria, Choose Selection Method, and Evaluate Project Proposals.

Choose Criteria.  Selection criteria can be chosen from many options; they must align with the nature of the project and its objectives for the selection process to be effective.  Criteria selection will be discussed further in the Project Selection Criteria section below.

Choose Selection Method.  Selection method options are also numerous.  Simple or small-scale projects may be sufficiently evaluated with a straightforward cost/benefit analysis, while larger, complex projects may require more sophisticated evaluations with input from a larger number of people.
     Some group decision-making techniques are discussed in Making Decisions – Vol. V.  The Analytic Hierarchy Process (AHP) (Making Decisions – Vol. III) and the Rational Model (Making Decisions – Vol. II) are also presented in previous posts.  A hybrid approach could also be used.  For example, the Delphi Method (see Making Decisions – Vol. V) could be used to determine the criteria and weighting factors to be used in AHP.
     The organization is free to choose or create a project selection method.  Whatever the choice, it is critical that the selection method be well-defined, objective, “tamper-resistant,” and clearly communicated to minimize the possibility – or suspicion – of bias or “gaming.”

Evaluate Project Proposals.  In this step, values are determined for each of the selection criteria for each project under consideration.  These are often presented in monetary terms to simplify investment justification, but need not be.  Other values that can be compared before and after project execution or between projects are equally valid.  Examples include first time yield (FTY), transactions per hour, equipment utilization, and so on.  These values will likely be translated to monetary terms at some point, but are perfectly acceptable for initial evaluations and comparisons.
 
Analysis
     The project selection process concludes with the analysis phase, where decisions are finalized.  This phase is completed in three steps:  Analyze Proposals, Analyze Risk, and Update Project Portfolio.

Analyze Proposals.  Using the selection method chosen, score, compare, or otherwise analyze each project proposal according to the criteria values determined above.  If a clear favorite is not identified, the selection method may need to be refined.  If sufficient capacity is available, refinement may be eschewed and two “tied” projects selected for execution.

Analyze Risk.  Any risks not fully accounted for in the chosen selection criteria and method should be considered before finalizing selection decisions.  The risk of forgoing projects that were not selected – particularly those rejected by a slim margin – should also be considered.  Overriding selections made by the defined method must be justified and approved at a high level.  Justifications for overriding a selection decision may include:
  • Actions by competitors and the resulting market position each may occupy.
  • Confidence in estimates or probability of success.
  • Potential damage caused by project failure (financial, reputational, etc.).
  • The organization’s resistance to change.
  • The influence of uncontrollable factors.
Update Project Portfolio.  When the final selections have been made, the organization’s project portfolio should be updated.  The project(s) chosen for immediate execution are added to “active projects,” while those postponed are recorded in the project backlog.  The active projects must be prioritized to streamline resource allocation decisions; this is discussed further in the Other Project Selection Factors section below.
 
Project Selection Criteria
     Evaluation and selection criteria are chosen in the Configuration phase of the project selection process.  An organization can choose as many, or as few, criteria as are relevant to the projects it expects to execute.  Criteria based on unconventional metrics could also be used, so long as they are objective and can be utilized consistently.  All criteria, whether commonplace or unique, should be reviewed in light of project performance and results obtained.
     Potential selection criteria are numerous; many can be grouped in broad categories to organize information and facilitate configuration of the selection process.  Common categories and criteria include:
  • Financial Criteria
    • Return on Investment (ROI)
    • Net Present Value (NPV)
    • Internal Rate of Return (IRR)
    • Economic Value Added (EVA)
    • Loss Function
  • Operational Criteria
    • First Time Yield (FTY) or other quality metric
    • Equipment Downtime/Availability
    • Productivity
  • Resource Criteria
    • budget (i.e. funding required)
    • schedule (i.e. time required)
    • staffing required
  • Sustainability/Environmental Criteria
    • raw material usage
    • emissions, waste generated
    • energy consumption
  • Experiential Criteria
    • new employee onboarding
    • recent graduate experience
    • new project manager experience
    • knowledge to be gained (technology, process, equipment, software, etc.)
  • Relational Criteria
    • employee morale, turnover
    • Customer Satisfaction (e.g. Net Promoter Score [NPS])
  • Replicability (potential to leverage results in other contexts)
  • Good Will generation
     To simplify the decision model, several of the example criteria could be consolidated into financial metrics.  That is, operational, environmental, and other criteria could be defined in monetary terms and included in financial calculations.  Social pressure may preclude this practice, however; the significance of some criteria transcends financial metrics.  Also, choosing among projects with similar financial expectations is facilitated by the discrimination afforded by independent criteria.
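     Where financial consolidation is appropriate, the common financial criteria listed above reduce to straightforward calculations.  The sketch below is a minimal, hypothetical example of simple ROI and NPV for a candidate project; the cash flows and discount rate are assumptions for illustration only.

# Hypothetical project:  initial investment followed by five annual net benefits.
cash_flows = [-250_000, 60_000, 75_000, 75_000, 70_000, 65_000]   # year 0 through year 5
discount_rate = 0.10                                              # assumed hurdle rate

npv = sum(cf / (1 + discount_rate) ** t for t, cf in enumerate(cash_flows))
investment = -cash_flows[0]
roi = (sum(cash_flows[1:]) - investment) / investment

print(f"NPV: {npv:,.0f}")   # a positive NPV supports selection at the assumed discount rate
print(f"ROI: {roi:.1%}")    # simple (undiscounted) return on investment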
 
Other Project Selection Factors
     A project selection process should be as objective as possible to achieve the most favorable results with the greatest efficiency.  However, this does not preclude the need for sound judgment and deep insight when making portfolio decisions, including project prioritization.
     The first level of prioritization is achieved via the project selection process described above.  High-priority projects are added to the “active” list, while others are relegated to the backlog.  Within the list of active projects, another level of prioritization must be defined.  The urgency of a new project may cause it to be prioritized above one already in progress.
     Limited resources may require an ongoing project to be placed on hold.  Review of project performance and updated forecasts may even result in project termination, freeing resources to support other projects in the portfolio.  The decision to terminate a project or place it on hold should not be made hastily, however.  Incomplete projects incur costs without delivering benefits, and restarting a stalled project can require more effort than completing it would have originally.
     Other perils also lurk within portfolio management and project selection processes.  Self-serving, self-justifying, politically-motivated, or otherwise biased or unethical managers can influence decisions, leading to a suboptimal (to be charitable) project portfolio, if they are left to operate without “guard rails” (see Making Decisions – Vol. VII:  Perils and Predicaments).
     Undue influence, bias, inexperience, and other decision-altering factors and the negative consequences they breed can be minimized by operating according to a portfolio management standard.  An organization can provide the guard rails needed by adding requirements that are specific to project selection to its decision-making standard.  An outline of this standard is provided in the section titled “How should a decision-making standard be structured?” in Making Decisions – Vol. IV.  Additional guidance on portfolio management can be obtained from the Project Management Institute (PMI).
 
     Competent project selection requires multiple points of view – short- and long-term, financial and nonfinancial.  An organization that constructs a portfolio in which each project augments and amplifies the benefits of previous projects can expect better performance than competitors that use a less analytical project selection process.
 
     For additional guidance or assistance with project selection or other Operations challenges, feel free to leave a comment, contact JayWink Solutions, or schedule an appointment.
 
References
[Link] “Project selection and termination--how executives get trapped.”  W.G. Meyer; Project Management Institute, 2012.
[Link] “Knowledge Contribution as a Factor in Project Selection.”  Shuang Geng, et al; Project Management Journal, February/March 2018.
[Link] “The ‘Everything is Important’ paradox:  9 practical methods for how to prioritize your work (and time).”  Jory MacKay; Rescue Time, May 5, 2020.
[Link] “Taguchi loss function.”  Six Sigma Ninja, November 11, 2019.
[Link] “Everything You Need to Know about Project Selection.”  Kate Eby; Smartsheet, August 16, 2021.
[Link] “Use the Value Index to Prioritize Project Efforts.”  Carl Berardinelli; iSixSigma.
[Link] “Selecting the Best Business Process Improvement Efforts.”  J. DeLayne Stroud; iSixSigma.
[Link] “Use Point System for Better Six Sigma Project Selection.”  Drew Peregrim; iSixSigma.
[Link] “Project Selection: Don't Pan for Gold in Your Hot Tub!”  Gary A. Gack; iSixSigma.
[Link] “Black Belts Should Create Balanced Project Portfolios.”  William Rushing; iSixSigma.
[Link] “Select Projects Using Evaluation and Decision Tools.”  Rupesh Lochan; iSixSigma.
[Link] “PMP : Quantitative Approach to Selecting the Right Project.”  Abhishek Maurya; Whizlabs, March 20, 2017.
[Link] “A Guide to Project Prioritization and Selection.”  EcoSys Team; October 2, 2018.
[Link] “ISO 13053: Quantitative Methods in Process Improvement – Six Sigma – Part 1:  DMAIC Methodology.”  ISO, 2011.
[Link] Juran’s Quality Handbook.  Joseph M. Juran et al; McGraw-Hill.
[Link] “Project Prioritization Troubles? Brainstorm New Metrics.”  Barbara Carkenord; RMC Learning Solutions, September 8, 2021.
[Link] “Using Taguchi's Loss Function to Estimate Project Benefits.”  Michael Ohler; iSixSigma.
[Link] “The Right Decision.”  Douglas P. Mader; Quality Progress, November 2009.
[Link] “Trade-off Analysis in Decision Making.”  APQC, 2009.
[Link] “Building Leadership Capital – Action Learning Project Workbook.”  Deakin University, 2013.

 
Jody W. Phelps, MSc, PMP®, MBA
Principal Consultant
JayWink Solutions, LLC
jody@jaywink.com
]]>
<![CDATA[Commercial Cartography – Vol. V:  Hazard Mapping]]>Wed, 12 Jan 2022 16:00:00 GMThttp://jaywinksolutions.com/thethirddegree/commercial-cartography-vol-v-hazard-mapping     An effective safety program requires identification and communication of hazards that exist in a workplace or customer-accessible area of a business and the countermeasures in place to reduce the risk of an incident.  The terms hazard, risk, incident, and others are used here as defined in “Safety First!  Or is It?”
     A hazard map is a highly-efficient instrument for conveying critical information regarding Safety, Health, and Environmental (SHE) hazards due to its visual nature and standardization.  While some countermeasure information can be presented on a Hazard Map, it is often more salient when presented on a corollary Body Map.  Use of a body map is often a prudent choice; typically, the countermeasure information most relevant to many individuals pertains to the use of personal protective equipment (PPE).  The process used to develop a Hazard Map and its corollary Body Map will be presented.
     Hazard mapping originated with Italian auto workers’ attempts to raise awareness of workplace hazards in the 1960s.  A factory blueprint was adorned with circles of varying size and color, each representing a particular hazard and the risk associated with it.
     When this map was presented to management by the workers’ union, it was rejected.  The company claimed that it was unscientific and, therefore, unreliable.  Subsequent research by scientists, however, substantiated the workers’ empirical findings.
     To impart the greatest value, hazard mapping efforts, and the entirety of the safety program, should employ both scientific (data collection and analysis) and empirical (observations and perceptions) methods.  When findings from both methods correlate, management support for investment in safety improvements should be forthcoming.  When they do not, further investigation may be necessary to ensure that all hazards are receiving the attention they warrant, be it from management or from imperiled individuals.
     Hazard maps come in many forms, as there is no single, widely-accepted standard.  Various organizations have established hazard classification schemes, risk rating scales, color codes, and symbology to be used in the creation of their hazard maps.  Standardization within an organization is critical to effective communication to employees, contractors, and others throughout its facilities.  Standardization across organizations is highly desirable, but has not yet come to fruition.
     The next section proposes a hazard classification and risk rating scheme targeting four key characteristics of an effective communication tool.  Such a tool should be:
  • consistent.  Providing a single communication standard ensures that all affected individuals can quickly comprehend the information presented, no matter where they venture within an organization.
  • visual.  Use of colors and symbols facilitates rapid assimilation of critical information, while greater detail is provided in accompanying text.
  • universal.  To the extent possible, information is clearly conveyed, irrespective of the native language or expertise of the reader.
  • compatible.  A communication standard that parallels those used for other purposes within the organization can be rapidly adopted without inducing misinterpretation.
 
The JayWink Standard
     The standard proposed here exhibits the four key characteristics described above.  Consistency is achieved by describing each hazard category and risk severity category such that practitioners can easily differentiate between them.
     There are four risk severity categories that encompass a 1 – 10 risk rating scale, as follows:
  • Low (1 – 2)
  • Moderate (3 – 5)
  • High (6 – 8)
  • Extreme (9 – 10)
There are five hazard classifications, defined as follows:
  • Physical Safety – acute injuries caused by accidents.  Many are visible, such as lacerations; others may be less obvious, such as contusions or electrical shock.
  • Chemical/Radiation Exposure – exposure to dangerous chemicals or radioactive substances caused by a spill or other release.  Burns, respiratory distress, or other acute injury may result.  Chronic health issues may also result from exposure, persisting long after the obvious injuries have healed.
Physical Safety and Chemical/Radiation Exposure comprise the supercategory of Safety Hazards.
  • Ergonomic Factors – chronic health issues that result from repetitive tasks or conditions that can cause injury, though not generally considered accidents.  Lack of sufficient access space, lifting aids, or lighting, and excessive noise are common examples.
  • Stressors – psychological factors that influence an individual’s well-being.  Examples include unreasonable or disrespectful supervisors, uncooperative coworkers, time pressure, and lack of security.  These factors can affect performance so severely that other risks are amplified.  For example, an individual’s stress-related loss of focus could lead to a mistake that results in a chemical spill or physical injury.
Ergonomic Factors and Stressors comprise the supercategory of Health Hazards.
  • Environmental Hazards – long-term impacts on natural systems; societal cost.  Any potential contamination of air, water, or soil and consumption of natural resources are included in this category.
     A description of each risk severity category, corresponding to each hazard classification, is provided in the summary table in Exhibit 1.  Examples of hazards in each classification, or category, are provided in Exhibit 2.
     The visual nature of the standard is partially revealed in Exhibit 1; the color codes for hazard classifications and risk severity categories are shown.  Risk severity categories are presented as follows:
  • Low:  Green
  • Moderate:  Yellow
  • High:  Orange
  • Extreme:  Red.
The hazard classification color code is as follows:
  • Physical Safety:  Red
  • Chemical/Radiation Exposure:  Yellow
  • Ergonomic Factors:  Blue
  • Stressors:  Black
  • Environmental Hazards:  Green.
The color codes are used in conjunction with standard symbols that represent the hazard classification supercategories, as shown in Exhibit 3.  Use of the composite symbols to construct a Hazard Map will be demonstrated in the next section.
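     Because the rating ranges and color codes above are fixed, they are easy to encode for tools that generate maps or format legends.  The sketch below is a minimal Python lookup reflecting the scheme just described; the helper function and its use are illustrative assumptions, not part of the standard itself.

# Risk severity categories and colors for the 1 - 10 rating scale.
SEVERITY = [
    (range(1, 3),  "Low",      "Green"),
    (range(3, 6),  "Moderate", "Yellow"),
    (range(6, 9),  "High",     "Orange"),
    (range(9, 11), "Extreme",  "Red"),
]

# Hazard classification color code.
CLASSIFICATION_COLOR = {
    "Physical Safety":             "Red",
    "Chemical/Radiation Exposure": "Yellow",
    "Ergonomic Factors":           "Blue",
    "Stressors":                   "Black",
    "Environmental Hazards":       "Green",
}

def severity(rating):
    # Return the (category, color) pair for a 1 - 10 risk rating.
    for rating_range, category, color in SEVERITY:
        if rating in rating_range:
            return category, color
    raise ValueError("rating must be between 1 and 10")

print(severity(7))                                  # ('High', 'Orange')
print(CLASSIFICATION_COLOR["Ergonomic Factors"])    # Blue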
     The standard maintains universality, in part, by incorporating standard, recognizable symbols defined in ISO 7010 Graphical symbols – Safety colours and safety signs – Registered safety signs; the global standard provides international users a common resource.  Limiting the number of hazard classifications and symbols used facilitates broad application in a wide variety of environments.  This standard is equally applicable to manufacturing operations and service industries of all types, in both front-office and back-office settings.  It can also be applied to government buildings, community centers, parks, or other publicly-accessible areas to facilitate protection of patrons.
     The hazard mapping standard is highly compatible with other communication tools in common use.  The 1 – 10 scale is similar to the severity, occurrence, and detection scales used in FMEA.  The four color-coded risk severity categories are analogous to the Homeland Security Advisory System (replaced in 2011) that warned the USA of terrorist threats.  The hazard classification color code also incorporates broadly-recognized implications – red and yellow are commonly associated with safety concerns, while green is “the” color representing matters of concern for the environment.
 
Creating a Hazard Map
     To create a Hazard Map, begin with a visual representation of the area under scrutiny, such as a layout, as discussed in Commercial Cartography – Vol. III.  If a layout has not been created, a sketch, photo, or other image may suffice, provided sufficient detail can be shown and hazard locations are accurate enough to effectively manage them.  Larger facilities may require several Hazard Maps to present all required information clearly.  For example, each department of a manufacturing facility may have its own map.  A small retail outlet, on the other hand, may be sufficiently documented on a single map.
     As is the case with many Operations documents, Hazard Maps often proceed through several draft stages before they are published.  A moderately complex map may be developed, for example, in the following four stages.
     Creation of a first draft begins by adding hand-written notes to a layout drawing.  A small group of knowledgeable individuals can brainstorm the area’s hazards, completing this step quickly.  The next step is to collect information for the map’s legend.  Hand-written notes are also recommended here to maintain the flow of information in the group.
     Convert the hand-written notes into hazard descriptions and classifications; evaluate each and assign a risk rating number.  Create a composite symbol representing each hazard identified and add it to the layout as shown in Exhibit 4.  A hazard symbol can be placed on a map to identify the location of a hazard as an entire area or a specific machine, work cell, process line, etc.  In the Simple Hazard Map Example, the first two hazards use leaders to identify the location as the “Chemical Store,” while the other locations are more general.  The less-specific locations can be interpreted as “approximate,” “the area surrounding the symbol,” or other similar statement.  If greater precision is required than can be made obvious on the layout, greater detail should be included in the hazard description on the map legend, such as a machine ID.
     Complete the Hazard Map Legend for the hazards identified, adding any missing information on countermeasures and reaction plans and color-coding the hazard and risk information columns, as shown in Exhibit 5.
     A second draft is created by conducting an onsite review to verify information included in the first draft and identify new hazards to be added.  The first two drafts are created, primarily, by scientific methods; thus, any SHE data available should be incorporated.
     The third draft is created by incorporating the empirical data collected from “front-line” personnel.  Observations and perceptions that have not previously been recorded, in SHE data or otherwise, must be evaluated and properly addressed.  Legitimate hazards are added to the map and put “on the radar” for development of additional protections or elimination strategies.  Unfounded concerns should be alleviated through education, but should never be ignored.
     A final review by all concerned parties should result in a Hazard Map approved for publication.  To be complete, a Hazard Map must include three components:
(1) the layout with composite symbols identifying the hazard location and basic information (Exhibit 4);
(2) the legend table containing detailed hazard information (Exhibit 5); and
(3) an explanation of the composite symbols (Exhibit 3 and Exhibit 5).
 
     Hazard Maps are most valuable when they are collaboratively developed, prominently displayed for easy reference, and all personnel have a working understanding of the information presented on them.  Large facilities with multiple hazard maps may find it useful to compile data from all maps into a “Master Legend,” creating a single reference for hazards throughout the entire plant.  A Master Legend can be sorted in various ways to facilitate resource allocation decisions for SHE projects.  For example, grouping by hazard classification may reveal opportunities to improve several areas with a single project or provide background data for OSHA partner programs or ISO 14001 initiatives.  Also, sorting by risk rating can facilitate prioritization of projects among disparate areas of the facility.
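     A Master Legend compiled from several maps can be sorted and grouped programmatically as well as in a spreadsheet.  The brief sketch below performs the two sorts suggested above; the hazard records are hypothetical.

from collections import defaultdict

# Hypothetical Master Legend records compiled from several Hazard Maps.
hazards = [
    {"area": "Chemical Store", "classification": "Chemical/Radiation Exposure", "rating": 8},
    {"area": "Press Line",     "classification": "Physical Safety",             "rating": 6},
    {"area": "Packing",        "classification": "Ergonomic Factors",           "rating": 4},
    {"area": "Paint Booth",    "classification": "Chemical/Radiation Exposure", "rating": 7},
]

# Prioritize projects facility-wide:  highest risk rating first.
by_risk = sorted(hazards, key=lambda h: h["rating"], reverse=True)
print([h["area"] for h in by_risk])    # ['Chemical Store', 'Paint Booth', 'Press Line', 'Packing']

# Group by hazard classification to spot multi-area improvement opportunities.
by_classification = defaultdict(list)
for h in hazards:
    by_classification[h["classification"]].append(h["area"])
print(dict(by_classification))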
     Hazard Maps – and the underlying data and perceptions – should be periodically reviewed and updated.  Process development, equipment upgrades, risk mitigation projects, organizational changes, and other factors could cause significant changes in the hazard and risk profiles of a facility.  Hazard Maps must reflect current conditions to be effective.
 
Creating a Body Map
     Human nature ensures that individuals will focus, predominantly, on their immediate well-being and somewhat less on their long-term health.  Environmental issues, while important, tend to enjoy less mindshare than more proximate causes of distress.  This tendency will direct attention toward the countermeasures recorded in Hazard Map Legends.  These are the mechanisms by which individuals protect themselves from hazards.  The most salient – and easiest to implement reliably – is often PPE; proper use of PPE is critical to a safe environment.  A simple and effective tool to support proper use is a PPE Body Map, or simply Body Map.
     Several options exist for presenting required PPE on a body map.  The example in Exhibit 6 uses a generic outline of a human form with standard symbols (ISO 7010) placed near the part of the body to be protected.
     Exhibit 7 provides a more relatable image of a person, but some ambiguity may remain (“Is that a fire suit?”).  The image in Exhibit 8 is more realistic, conveying more information visually than the previous example.  Text provides additional information to aid understanding of the types of hazards from which each protects the user.  The symbols, however, may not be universally understood, particularly by international associates.
     The most information may be conveyed with the least effort when standard symbols and detailed text accompany a photo of a person wearing the PPE described.  This is the recommended format of a Body Map; an example is shown in Exhibit 9.  Any text added to the body map should be succinct and critical to proper use of PPE.  Examples of appropriate text include specifications (e.g. ANSI Z87.1 for safety glasses, filtration level requirements for respirators) or additional requirements (e.g. cut-resistance rating for gloves).
     The Body Map can be further augmented, if desired, with direct references to the Hazard Map Legend.  For example, the composite symbol “flags” can be reproduced adjacent to the PPE identifiers.  Some examples of this practice are shown in Exhibit 9.  Doing so may reinforce users’ understanding of the Hazard Map and the importance of proper PPE.  In turn, this could facilitate training and reduce the effort required to monitor and enforce proper PPE use.
     Body Map developers must weigh the benefits of additional information against the risk of information overload – sometimes, less is more.  Publish Body Maps that will provide the greatest benefit to individuals and the organization – ones that will be referenced regularly and understood thoroughly.
     As mentioned in the introduction, a Body Map is a corollary to – not a component of – a Hazard Map.  It is, however, a relatively simple step that adds value to the safety program by translating hazard information into an easily-understood format that helps individuals protect themselves.
     Like Hazard Maps and other documents, Body Maps should be subject to periodic review to verify that current information is provided and individuals are adequately protected.  Overlaying incident data on a PPE Body Map can be an effective aid to evaluating current protection measures.
 
     Hazard maps and body maps are valuable additions to an organization’s SHE toolbox.  The information assembled to create them supports improvement initiatives on several fronts; creativity and insight of practitioners may reveal additional opportunities to leverage these tools across their organizations.
 
     For additional guidance or assistance creating Hazard Maps for your organization’s facilities or other Operations challenges, feel free to leave a comment, contact JayWink Solutions, or schedule an appointment.
 
     For a directory of “Commercial Cartography” volumes on “The Third Degree,” see Vol. I:  An Introduction to Business Mapping.
 
References
[Link] “Hazard Mapping Reduces Injuries at GE.”  Communications Workers of America; December 2, 2009.
[Link] RiskMap.com
[Link] “ISO 7010:2019(en)  Graphical symbols — Safety colours and safety signs — Registered safety signs”
[Link] “Design and analysis of a virtual factory layout,”  M. Iqbal and M.S.J. Hashmi.  Journal of Materials Processing Technology; December 2001.
[Link] “Citizen Guidance on the Homeland Security Advisory System.”  Ready.gov.
[Link] “Coloring the hazards: risk maps research and education to fight health hazards,” J. Mujica.  American Journal of Industrial Medicine; March 1992.
[Link] “Hazard Mapping.” NJ Work Environment Council.
[Link] “Using Workplace and Body mapping tools.”  Irish Congress of Trade Unions.
[Link] “Slips and trips mapping tool - An aid for safety representatives.”  Health and Safety Executive, UK.
[Link] “Using Hazard Maps to Identify and Eliminate Workplace Hazards: A Union-Led Health and Safety Training Program,” Joe Anderson, Michele Collins, John Devlin, and Paul Renner.  NEW SOLUTIONS - A Journal of Environmental and Occupational Health Policy; September 2012.
[Link] “Job Hazard Analysis.”  Occupational Safety and Health Administration, U.S. Department of Labor; 2002.
[Link] “Health and Safety in the Restaurant Industry.”  Interfaith Worker Justice; 2011.
[Link] “Mapping out work hazards.”  centrepages; 1997.
[Link] “Slips, trips and falls mapping.”  Health and Safety Authority, Ireland; 2014.
[Link] “Mapping.”  Tools for Barefoot Research, International Labour Organization; 2002.
[Link] “Risk-Mapping.”  DC 37 Safety & Health Factsheet, District Council 37, New York, New York.

 
Jody W. Phelps, MSc, PMP®, MBA
Principal Consultant
JayWink Solutions, LLC
jody@jaywink.com
]]>
<![CDATA[Learning Games]]>Wed, 30 Dec 2020 15:30:00 GMThttp://jaywinksolutions.com/thethirddegree/learning-games     Training the workforce is a critical responsibility of an organization’s management.  Constant effort is required to ensure that all members are operating according to the latest information and techniques.  Whether training is developed and delivered by internal resources or third-party trainers, more efficacious techniques are always sought.
     Learning games, as we know them, have existed for decades (perhaps even longer than we realize), but are gaining popularity in the 21st century.  Younger generations’ affinity for technology and games, including role-playing games, makes them particularly receptive to this type of training exercise.  Learning games need not be purely digital, however.  In fact, games that employ physical artifacts have significant advantages of their own.
     Several terms may be encountered when researching learning games.  Most include “game,” “simulation,” or a combination of the two.  Broadly-accepted definitions describe games as activities that include an element of competition, while simulations attempt to reproduce aspects of the real world.  For simplicity and brevity, this post will use learning games as an umbrella term covering all variants, including simulation games, serious games, serious simulation games, and so on.
     Learning games have been developed for a variety of topics, while others could benefit from creative developers’ attention.  This discussion focuses on typical Third Degree topics, such as engineering, operations and supply chain management, business strategy, and project management in post-secondary and professional education.
     Learning games can take many forms.  Simulations of critical processes are made as realistic as possible to convey the gravity of the real-world situations they represent and to teach participants how to manage the risk involved in order to make prudent decisions.  Games can also teach certain concepts or proper judgment through fantastical presentations; this approach can be particularly useful when faced with a reluctant, unreceptive audience.
     Learning games can be digital, analog, or hybrid.  Computer games are popular with tech-savvy students that are accustomed to sophisticated programs and high-quality graphics.  Highly-sophisticated versions may even employ virtual reality or augmented reality.  Use of highly accurate digital twins can improve a game’s effectiveness.
     Tabletop games utilize physical space and objects in game play.  The layout of a game need not be limited to the size of a table; an entire room can be mapped as a gameboard, allowing people, as well as objects, ample space to move about.  The physical representation of the learning game can be used to impart lessons more tangibly than displays and numbers can achieve.  Concepts of inventory and transportation waste are examples of this type of application.  Hybrid games employ digital and physical elements, seeking the most effective combination to aid learning and retention.
     The level of interaction participants have with a learning game is another important characteristic to consider.  Observational games present participants with a situation and a fixed data set.  The data can be analyzed in various ways and other queries may be possible.  From these analyses, participants draw conclusions and make decisions that are then critiqued by facilitators.  Observational games can be thought of as enhanced case studies; they are somewhat more interactive, and may employ multimedia technology or other embellishments for a more appealing and engaging presentation.
     Experimental games, on the other hand, are highly interactive, allowing participants to modify elements of the game and directly assess the impacts of their decisions on system or process performance.  Multiple analyses can be performed in search of an optimal solution.  The ability to manipulate the system and receive feedback on the effects of each change often leads to deeper understanding of the systems and processes simulated.  The resulting competence improves safety and efficiency of the real-world counterparts when the students become managers.
     Learning games can also be categorized according to their level of complexity.  One such taxonomy [Wood, 2007] includes insight games, analysis games, and capstone games.  Wood’s taxonomy of games is summarized in Exhibit 1.
     Insight games seek to develop understanding of basic concepts and context required for students to comprehend subsequent material and advanced concepts.
     Students develop required skills “through iterations of hypothesis, trial, and assessment” [Wood, 2007] in analysis games.  These games seek to bridge the gap between understanding a concept and performing a related task effectively.  Wood cites the example of riding a bicycle; a student may understand the task by reading its written description, but this does not ensure a safe ride upon first attempt.  Practice is needed to match ability with understanding.
     Capstone games, as you may have guessed, require participants to consider multiple objectives or perspectives.  These games incorporate multiple disciplines in a single game to simulate the complexity of decisions that real-world managers must make.  When conducted as a team exercise, with members of varying background and experience, capstone games can provide a very realistic approximation of situations faced by managers on a regular basis.
Stages of Game Play
     Learning games typically proceed in three stages:  preparation, play, and debriefing.  Players and facilitators bear responsibility for a successful game in each stage.  As is the case with any type of training, engagement of participants is critical to success.
     In the preparation stage, facilitators are responsible for “setting the stage” for game play.  This may include preparing presentations to explain the rules of the game, assigning students to teams within the group, or stocking the physical space with required materials or accommodations.  Players are required to review any information provided in advance and procure any material they are expected to provide.
     During game play, players are expected to remain engaged, maximizing the learning benefit for themselves and others through active participation and knowledge-sharing.  Facilitators enforce rules, such as time limits, and may have to “keep score” as the game progresses.  Facilitators also answer questions, provide guidance to ensure a successful game, and monitor the proceedings for improvement ideas.
     An effective debriefing is essential to a successful learning game.  It is in this stage that participants’ performance is evaluated.  Critiques of decisions made during the game provide participants with valuable insights.  In many game configurations, this is where the greatest learning occurs; it provides an opportunity to learn from the experience of other teams, including situations that may not have occurred in one’s own game.  Facilitators are responsible for providing information participants may need in order to understand why certain decisions are better than others.  Players may assist facilitators in providing critiques and explanations, ensuring that all participants develop the understanding necessary to apply the new information to future real-world scenarios.
 
Benefits of Learning Games
     Learning games provide many benefits to players and the organizations that employ them.  Advantages relate to the number of people that can be effectively trained, the time and expense required for training, and the risk of poor performance.  The advantages of learning games relative to on-the-job training are summarized in Exhibit 2, where the “real world” is compared to a learning game environment.
     Learning games may also offer additional benefits related to onboarding, team-building, or unique aspects of your organization.  Consider all possible benefits to be gained when evaluating learning games for your team.
 
Example Learning Games
     While some organizations may use proprietary learning games, many are widely distributed, often for free or at very low cost.  Several learning games are cited below, but only to serve as inspiration.  Because each game should be assessed in the context of the group to be trained, a thorough review of each here would provide little value.  It might also encourage readers to limit their choices to the games mentioned, which is contrary to the objective of this post.

Physics/Engineering:  Whether your interest is in equations of motion or medieval warfare, the Virtual Trebuchet is for you.  Players define several attributes of the trebuchet, launch a projectile, and observe its trajectory.  Peak height, maximum distance, and other flight data are displayed for comparison of different configurations.  Visually simple, highly educational, and surprisingly fun, even the non-nerds among us can appreciate this one.
Project Management:  The Project Management Institute (PMI) Educational Foundation has developed the Tower Game for players ranging from elementary school students to practicing professionals.  Teams compete to build the “tallest” tower in a fixed time period with a standard set of materials.  “Height bonuses” are earned through resource efficiency.  The game can be customized according to the group’s experience level.
Operations Management:  Considering the complexity of operations management, it is no surprise that these games are among the most sophisticated.  OMG!, the Operations Management Game, is a tabletop game that maps a production process with tablemats for each step.  Each step is represented by a single player; the number of steps can be varied to accommodate groups of different size.  Physical artifacts represent work-in-process (WIP) and finished goods inventories, and dice are used to simulate demand and process variability.  Upgrades are available when sufficient profits have been retained to purchase them.  Many important aspects of operations management are included in this game; it could be a valuable learning tool.
            A camshaft manufacturing process is modelled in Watfactory, a game used to study techniques of variation reduction.  It includes a large number of variables – 60 variable inputs, 30 fixed inputs, and 3 process step outputs – and several investigation options that define data analyses to be performed.  This one is not for beginners or the faint of heart, but is a solid test of skills.
Supply Chain Management:  Revered as the granddaddy of all learning games, the Beer Game was first developed at MIT in the 1960s to explore the bullwhip effect in supply chains.  Since then, many alternate versions have been developed, including virtual ones.
Business Strategy:  BizMAP is a game that can be used to assess an individual’s aptitude for entrepreneurship or suitability for an executive leadership role.  If one trusts its predictive capability, it could be an extremely valuable aid in averting disasters, whether poor executive decision-making or an ill-advised decision to quit one’s day job.
     Many other learning games are available with differing objectives, configurations, and levels of complexity.  Other professional practices, including Managerial Accounting (!) and bridge design, can be explored using learning games.  Explore and be amazed!
 
Serious Business
     In some circles, learning games have become serious business.  High-stakes decisions are often based on simulations.  Ever heard of War Games?  The accuracy and reliability of such simulations is a serious matter indeed.
     Less costly in terms of human life, but potentially catastrophic in financial terms, businesses may simulate the competitive landscape in which they operate.  If invalid assumptions are made, or the simulation otherwise misrepresents the competitive marketplace, decisions based on it could be financially ruinous.
     The development of the learning games industry reflects just how serious it has become.  Industry conferences, such as the Serious Games Summit and the Serious Play Conference, are held for serious gamers to share developments with one another.  A professional association – Association for Business Simulation and Experiential Learning (ABSEL) – has also been chartered to serve this growing community.
            In the early 2000s, MIT embarked on its Games to Teach Project, a collaboration aimed at developing games for science and engineering instruction.  Decades after launching the movement, MIT’s ongoing commitment to learning games is reflected in the Scheller Teacher Education Program, in which these tools play a prominent role.
 
     No matter your field of study or level of experience, chances are good that a learning game has been developed for your target demographic.  However, improvements can always be made.  If you are an aspiring programmer or game developer, a learning game is an excellent vehicle for demonstrating your skills while providing value to users.  It will look great on your resume!
 
     If you have a favorite learning game, or an idea for a new one, please tell us about it in the comments section.  If you would like to introduce learning games to your organization, contact JayWink Solutions for additional guidance.
 
References
[Link] “The Role of Computer Games in the Future of Manufacturing Education and Training.”  Sudhanshu Nahata; Manufacturing Engineering, November 2020.
[Link] “Online Games to Teach Operations.”  Samuel C. Wood; INFORMS Transactions on Education, 2007.
[Link] “Game playing and operations management education.”  Michael A. Lewis and Harvey R. Maylor; International Journal of Production Economics, January 2007.
[Link] “A game for the education and training of production/ operations management.”  Hongyi Sun; Education + Training, December 1998.

 
Jody W. Phelps, MSc, PMP®, MBA
Principal Consultant
JayWink Solutions, LLC
jody@jaywink.com
]]>
<![CDATA[Safety First!  Or is it?]]>Wed, 16 Dec 2020 15:30:00 GMThttp://jaywinksolutions.com/thethirddegree/safety-first-or-is-it     Many organizations adopt the “Safety First!” mantra, but what does it mean?  The answer, of course, differs from one organization, person, or situation to another.  If an organization’s leaders truly live the mantra, its meaning will be consistent across time, situations, and parties involved.  It will also be well-documented, widely and regularly communicated, and supported by action.
     In short, the “Safety First!” mantra implies that an organization has developed a safety culture.  However, many fall far short of this ideal; often it is because leaders believe that adopting the mantra will spur the development of safety culture.  In fact, the reverse is required; only in a culture of safety can the “Safety First!” mantra convey a coherent message or be meaningful to members of the organization.
Safety Vocabulary
     All those engaged in a discussion of safety within an organization need to share a common vocabulary.  The definitions of terms may differ slightly between organizations, but can be expected to convey very similar meanings.  The terms and descriptions below are only suggestions; each organization should identify all terms necessary to sustain productive discussions and define them in the most appropriate way for their members.
     An organization with a well-developed safety culture will be relentless in identifying hazards in the workplace.  A hazard is an environment, situation, or practice that could result in harm to an individual or group.  Examples include elevated work platforms, energized electrical equipment, and chemical exposure.  Many hazards exist throughout the typical workplace; each should be evaluated for the feasibility of elimination.
     Risk is the likelihood that a hazard will cause harm and the extent or severity of that harm.  “High-risk” endeavors are usually considered those that exhibit a high probability of severe or extensive harm.  Defining categories of risk can aid in prioritizing mitigation and elimination efforts.
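     A simple scoring scheme makes this concrete.  The sketch below (Python) assumes a common likelihood-times-severity rating on 1-to-5 scales with illustrative category breakpoints; each organization should define its own scales and thresholds.

    def risk_rating(likelihood, severity):
        # Each factor scored 1 (low) to 5 (high); the rating ranges from 1 to 25.
        return likelihood * severity

    def risk_category(rating):
        # Illustrative breakpoints for prioritizing mitigation and elimination efforts.
        if rating >= 15:
            return "High"
        if rating >= 8:
            return "Medium"
        return "Low"

    # Example: harm is likely (4) and severe (5), so the rating is 20 and the risk is "High."
    print(risk_category(risk_rating(4, 5)))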
     An accident is an occurrence of harm – usually physical – to an individual or group.  The effects of an accident may be immediate, as in the case of a fall, or protracted, as in the case of radiation exposure.  A near miss is an occurrence that could be expected to cause harm, but the individuals involved are fortunate enough to avoid it.  An electrical arc within an open panel that does not seriously shock the technician working on it is a near miss that demonstrates the importance of personal protective equipment (PPE).
     PPE is the last line of defense against injury.  It is a broad term that includes any device used to protect an individual from harm, including safety glasses, ear plugs, hard hats, insulating gloves, steel-toed shoes, cut-resistant garments, face shields, retention harnesses, and breathing apparatus.  PPE that is adequately specified and properly worn can reduce the risk of an activity, but cannot eliminate it.  It can prevent or reduce injury in many cases, but provides no guarantee.  At best, PPE can turn an accident into a near miss.
     Accidents, near misses, and property-damage events are often called, collectively, incidents.  Use of this term is convenient for ensuring an investigation occurs, regardless of the type of event.  The difference between a near miss or property-damage event and an accident is often pure coincidence or good fortune.  These are not reliable saviors; they should not be expected at the next incident.
     Dangerous behavior includes activities that can be expected to result in an incident.  Examples include driving a loaded forklift too fast without checking for traffic at intersections, failing to remove combustible material from an area before cutting or welding equipment is used, and carelessly handling dangerous chemicals in open containers.  Such behaviors develop for various reasons; they could signal a lack of training, complacency, or malicious intent.  In any case, action must be taken to correct the behavior before an incident occurs.
 
Elements of Safety Culture
     The steps required to develop a safety culture are not particularly difficult to understand.  They may be difficult to implement, however, because they require a deep commitment of decision-makers to prioritize safety over other concerns.  When pressured to meet production requirements or cost-reduction targets, managers can be tempted to abandon safety-focused initiatives that they perceive as threats to the attainment of other metrics.  Commitment at the highest levels of an organization is required to prevent the sacrifice of safety to competing objectives.
     A common theme among discussions of organizational programs is documentation.  Developing a culture of safety, like so many other initiatives, requires a significant amount of documentation.  The value of documentation transcends its functional attributes; it provides evidence of commitment of the organization’s leadership to treat safety as its highest priority.  Documentation replaces verbal attestations and platitudes with “rules of the road,” or expectations of conduct at all levels of the organization.
     Documentation can become extensive over time.  At a minimum, it should include:
  • Hazard identification:  descriptions of exactly what makes an operation, machine, place, etc. potentially dangerous.
  • Risk assessment:  for each hazard, evaluate the likelihood and severity or extent of harm.
  • PPE required:  specify the personal protective equipment prescribed to protect individuals from known hazards.
  • Training requirements:  define all training required for an individual to remain safe from known hazards.
  • Work instructions:  detailed procedures to ensure safe practices in routine work.
  • Maintenance instructions: detailed procedures to ensure safe practices in non-routine work.
  • Assessments:  reviews and evaluations of program effectiveness.
Additional documentation may be deemed necessary as the culture matures.  Specifically, conducting safety program assessments may reveal the need for additional procedures or other refinements of the documentation package.
     Several other characteristics must be present to sustain a successful safety culture; chief among them is effective communication.  Reports of incidents and safety metrics must be viewed as communication vehicles, not the sole required output of a safety program.  This means that incidents are investigated, not merely reported.  Root causes must be found, addressed, and communicated, updating members’ understanding of the hazards they face.
     Communication must also be open in the opposite direction.  Team members should be provided a clearly-defined channel for communicating safety concerns and suggestions to those responsible for implementing changes.  This hints at another critical element of safety culture – security.  If team members fear reprisal for reporting issues, safety and morale will suffer.  Each member should be encouraged to activate this communication channel and given the confidence to do so whenever they see fit.
     All incidents – accidents, near misses, and property-damage events – should be investigated with equal intensity.  Non-accident incidents are simply precursors to accidents; thorough investigation provides an opportunity to prevent future accidents or other incidents.  Open communication and investigative responses are needed to ensure that incidents – near misses, in particular – are reported.
     Non-routine work is a significant source of workplace incidents.  When maintenance and repair personnel are rushed to release equipment, the risk of injury rises with the stress they feel.  Failing to provide ample time to perform non-routine tasks carefully and thoroughly puts both maintenance and operations personnel at greater risk.
     In their haste to return equipment to service, technicians may take shortcuts or fail to strictly adhere to all prescribed safety procedures, increasing the risk of an incident.  Once the equipment is returned to service, shortcuts taken could reduce its reliability, precipitating a catastrophic failure.  Such a failure may cause injury of operators or a near miss in addition to property damage.  A culture of safety allows sufficient time for non-routine work to be performed carefully and to be thoroughly tested before equipment is returned to service.
     Similarly, excessive time pressure on investigations of issues, training, or production can lead to stress, shortcuts, and distraction.  All of these considerably increase the probability of an incident occurring.  No one should have to rely on luck to avoid injury because the pressures to which they are subjected make an incident nearly inevitable.
     In a mature safety culture, culpability of individuals will be assessed according to the type of error committed.  If a blameless error – one that “could happen to anyone” – is committed, or a design flaw that invites misinterpretation is discovered, efforts should be made to mistake-proof the system (see “The War on Error – Vol. II:  Poka Yoke”).  If dangerous behavior or intentional violations cause an incident, disciplinary action may be taken.  Other considerations include an individual’s safety record, medical condition, and drug (prescription, OTC, or illicit) use.
     To ensure consistent treatment of all personnel (no playing favorites), an evaluation tool, such as Reason’s Culpability Decision Tree, shown in Exhibit 1, can be used.  Administered by an impartial individual or small group, the series of questions will lead to consistent conclusions.  Responding to each type of error with predefined and published actions will further support the team’s perception of procedural justice.
     While Reason’s Culpability Decision Tree will likely find that most incidents warrant no punitive action, it is preferable to a “no blame” system.  A no-blame system provides little opportunity to correct patterns of behavior, even negligent or reckless ones.
     A culture of safety espouses a continuous improvement mindset.  In this vein, periodic reviews should be conducted to evaluate the effectiveness of the systems in place to ensure safety.  Over time, equipment degrades and is often modified or upgraded, changing the characteristics of its safe operation and maintenance.  Operators and technicians come to recognize previously unidentified hazards and may develop well-intentioned work-arounds.  A periodic review reminds personnel of the communication channels available to them and allows the documentation and procedures to be updated to reflect the current condition of the system.
     Results of routine health screenings should be included in these reviews to assess the effectiveness of prescribed PPE and documented procedures.  In addition to vision and hearing checkups, individuals should be examined for signs of repetitive stress or vibration-induced disorders.  They should also have the opportunity to discuss their stress levels and overall well-being.  It may be possible to identify negative trends before conditions become debilitating.  Preventive measures can then be implemented to safeguard employees from injury.  One of the simplest responses to many work-related issues is to schedule task rotations.  Doing so reduces the risk of repetitive stress disorders and errors caused by complacency or boredom.
 
     Organizations that have mature safety cultures prioritize the well-being of their employees, customers, and community.  The evolution of an organization can start with one person.  You shouldn’t expect it to be easy or fast, but doing what’s right is timeless.
 
     For assistance in hazard identification, risk assessment, developing procedures, or other necessities of safety culture, leave a comment below or contact JayWink Solutions directly.
 
References
[Link] “A Roadmap to a Just Culture:  Enhancing the Safety Environment.”  Global Aviation Information Network (GAIN), September 2004.
[Link] The 12 Principles of Manufacturing Excellence.  Larry E. Fast; CRC Press, 2011.
[Link] “The health and safety toolbox.”  UK Health and Safety Executive, 2014.

 
Jody W. Phelps, MSc, PMP®, MBA
Principal Consultant
JayWink Solutions, LLC
jody@jaywink.com
]]>
<![CDATA[The War on Error – Vol. IX:  Taguchi Loss Function]]>Wed, 02 Dec 2020 15:30:00 GMThttp://jaywinksolutions.com/thethirddegree/the-war-on-error-vol-ix-taguchi-loss-function     Choosing effective strategies for waging war against error in manufacturing and service operations requires an understanding of “the enemy.”  The types of error to be combatted, the sources of these errors, and the amount of error that will be tolerated are important components of a functional definition (see Vol. I for an introduction).
     The traditional view is that the amount of error to be accepted is defined by the specification limits of each characteristic of interest.  Exceeding the specified tolerance of any characteristic immediately transforms the process output from “good” to “bad.”  This is a very restrictive and misleading point of view.  Much greater insight is provided regarding product performance and customer satisfaction by loss functions.
     Named for its developer, the renowned Genichi Taguchi, the Taguchi Loss Function treats quality as a variable output.  In contrast to the “goal post” philosophy described above, which uses a digital step function, the Taguchi Loss Function quantifies quality on an analog scale.
     According to goal-post philosophy, characteristic values that fall inside the tolerance band by the slightest amount are acceptable (no quality loss) and those that fall outside the tolerance band by the slightest amount are unacceptable (total loss).  This is shown conceptually in Exhibit 1.  Recognizing that the difference in quality between these two conditions was quite small, Taguchi realized that quality at any point in the characteristic range could be expressed relative to its target value.  Furthermore, he concluded that any deviation from a characteristic’s target value represents a corresponding reduction in quality with a commensurate “loss to society.”  The Taguchi Loss Function is shown conceptually in Exhibit 2.
Sources of Loss
     Many struggle to understand Taguchi’s characterization of variable quality as loss to society.  It is, however, a clear connection once given proper consideration.  Resources consumed to compensate for reduced quality are unavailable for more productive use.  Whether the effects on consumers and providers are direct or indirect, each incurs a loss.
     Sources of loss are varied and may not be immediately obvious, in large part due to the paradigm shift required to transition from goal post thinking.  The most salient loss to providers comes in the form of scrap and rework.  Material, labor, and time losses are easily identifiable, even by goal-post thinkers.  Somewhat more difficult to capture fully, warranty costs are also salient to most providers.  Lost productive capacity, reverse logistics, and product replacement costs can be substantial.  Worse, a damaged reputation and subsequent loss of good will in the marketplace can result in lost sales.  Some existing customers may remain loyal, but attracting new customers becomes ever more difficult.  The difficulty calculating this loss accurately further compounds the problem; misattribution of the cause of slumping sales will make a turnaround nearly impossible.
     Consumers experience losses due to reduced product or service performance.  If the time required to receive service or to use a product increases, this is a direct cost to the consumer.  Reduced performance lowers the value of the product or service and may prompt consumers to seek alternatives.
     A product that requires more frequent maintenance or repair than anticipated incurs losses of time, money, and productivity for consumers.  This could, in turn, lead to warranty costs, reputational damage, and lost sales for the producer.
     The examples cited are common losses and are somewhat generic for broad applicability.  Careful review of any product or service offered is warranted to discover other sources of loss.  Safety and ergonomic issues that could lead to injury and liability claims should be front-of-mind.  Environmental impacts and other regulatory issues must also be considered.  There may be other sources of loss that are unique to a product type or industry; thorough consideration of special characteristics is vital.
 
Nominal is Best
     The most common formulation of the Taguchi Loss Function is the “nominal is best” type (see Exhibit 2).  Typical assumptions of this formulation include:
  • Target (“best”) value is at center of tolerance range.
  • Loss = 0 at target value.
  • Characteristic values are normally distributed.
  • Out-of-spec conditions are reliably captured.
  • Failures are due to compound effects (i.e. each characteristic is within its specified range, but the combination is unsatisfactory).
     For product manufacturing, the scrap value may be used to calculate the loss function.  If out-of-tolerance conditions can escape, or other conditions exist that would cause greater losses, the calculation can be modified accordingly.  Also, service providers can define the maximum loss as is most appropriate for them, such as repair cost, loss of future sales, etc.  Whatever definition is chosen, it should be applied consistently to allow for comparisons across time, products, or service offerings.  Using the scrap value of a product provides consistency that may be difficult to achieve with other definitions of maximum loss.
     The nominal is best Taguchi Loss Function is a quadratic function of the form:
            L(x) = k (x – t)^2, where:
  • L(x) is the loss experienced at characteristic value x;
  • k is the loss coefficient, a proportionality constant;
  • x is the measured (actual) characteristic value;
  • t is the target characteristic value.
     To calculate k, using the assumptions stated above, set x equal to the maximum allowable value of the bilateral tolerance (USL) and L(x) equal to the maximum loss (i.e. scrap cost).  Solve for k by rearranging the loss function expression:  k = L(x)/(x – t)^2.
     With a known k value, the loss incurred by the deviation from the target characteristic value can be calculated for each occurrence.  Total and average loss/part can then be calculated and used to analyze process performance.
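     A brief numerical sketch may help.  The Python snippet below assumes a hypothetical characteristic with a target of 10.00 mm, a bilateral tolerance of +/- 0.05 mm, and a $40 scrap cost; all figures are illustrative only.

    scrap_cost = 40.0                  # maximum loss, L(x) at the specification limit
    target = 10.00                     # t, center of the tolerance range
    usl = 10.05                        # upper specification limit

    k = scrap_cost / (usl - target) ** 2        # k = L(x)/(x - t)^2, approximately 16,000

    def loss(x):
        return k * (x - target) ** 2            # L(x) = k (x - t)^2

    measurements = [10.01, 9.98, 10.03, 10.00]  # hypothetical measured values
    losses = [loss(x) for x in measurements]    # approximately 1.60, 6.40, 14.40, 0.00
    average_loss_per_part = sum(losses) / len(losses)   # approximately 5.60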
     For a visual comparison of the goal post model and the loss function, use the interactive model from GeoGebra.
 
Other Loss Function Formulations
     There are two special cases that warrant particular attention:  Smaller is Better and Bigger is Better.  When lower characteristic values are desirable – i.e. zero is ideal – the loss function can be simplified by setting t = 0.  The resulting smaller is better formulation is L(x) = kx^2, where k is calculated using the value of x at which a product would be scrapped or a process ceased.  Example characteristics that utilize this formulation include noise levels, pollutant emissions, and response time of a system.  The smaller is better loss function is shown conceptually in Exhibit 3.
     When larger characteristic values provide greater performance or customer value, the bigger is better formulation of the loss function is used; losses are calculated from the inverse of the characteristic value.  It takes the form L(x) = k (1/x)^2.  At x = 0, the loss would be infinite – an unrealistic result.  More likely, there is a minimum anticipated value of the characteristic; this value should be used to calculate k and also defines the maximum expected loss per unit.  The bigger is better loss function is shown conceptually in Exhibit 4.
     This formulation also suggests that zero loss cannot be achieved.  Doing so requires the characteristic to reach an infinite value – another unrealistic result.  In practical terms, there may be a value beyond which the relative benefits are imperceptible, resulting in an effectively zero loss.  This idiosyncrasy does not diminish its comparative value.
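     The two special cases can be sketched the same way.  The Python snippet below uses assumed, purely illustrative limits: a noise characteristic scrapped at 80 dB and a strength characteristic with a minimum anticipated value of 200 N, each with a $25 maximum loss.

    def loss_smaller_is_better(x, k):
        return k * x ** 2                     # L(x) = k x^2; zero is ideal

    def loss_bigger_is_better(x, k):
        return k * (1.0 / x) ** 2             # L(x) = k (1/x)^2

    k_small = 25.0 / 80 ** 2                  # loss reaches $25 at the 80 dB scrap point
    k_big = 25.0 * 200 ** 2                   # loss reaches $25 at the 200 N minimum

    loss_smaller_is_better(60, k_small)       # about $14.06 at 60 dB
    loss_bigger_is_better(400, k_big)         # $6.25 at 400 N, one quarter of the maximum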
 
Relationship to Other Programs
     Six Sigma initiatives attempt to reduce variation and center process means in their distributions.  Though the objectives are often defined in goal post terminology, Six Sigma is highly compatible with the paradigm of the loss function.  Both pursue consistent output, though they value that output differently.
     The Taguchi Loss Function fosters a continuous improvement mindset.  Until all losses are zero, there are potential improvements to be made.  The loss function formulations presented provide a method to determine the cost-effectiveness of proposed improvement projects.  First, a baseline is established; then anticipated gains (reduced losses) can be calculated.  If the anticipated gains exceed planned expenditures, the project is justified.
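     A hypothetical project justification, building on the nominal-is-best example above (all figures assumed), might look like this:

    baseline_avg_loss = 5.60       # $ per part, from current measurements
    projected_avg_loss = 2.10      # $ per part, anticipated after variation reduction
    annual_volume = 50_000
    project_cost = 120_000.0

    anticipated_gain = (baseline_avg_loss - projected_avg_loss) * annual_volume   # $175,000
    justified = anticipated_gain > project_cost                                   # True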
     Project objectives may even be defined by a loss function.  That is, a target loss value may be defined; then solutions are sought to achieve the target level.  This application is not in common use; the difficulty in achieving the necessary paradigm shift ensures it.
 
Summary
     A mindset adjustment is required to transition from traditional (goal post) quality evaluations to Taguchi Loss Functions.  Once this is achieved, loss functions can be put to effective use by following a simple procedure:
  • Define the reference loss (e.g. scrap cost).
  • Define the value of the characteristic at which this loss is incurred.
  • Calculate the loss coefficient, k.
  • Calculate the loss incurred by each deviation from target values.
     At this point, the process branches according to the objectives sought.  Totals and averages can be calculated, comparisons made, or other analyses conducted.  The loss function provides a useful reference and encourages an expanded view of perceived quality.
 
     For assistance introducing loss functions to your organization or pursuing other operational improvement efforts, contact JayWink Solutions for a consultation.

     For a directory of “The War on Error” volumes on “The Third Degree,” see “Vol. I:  Welcome to the Army.”
 
References
[Link] “Taguchi loss function.”  Wikipedia.
[Link] “Taguchi Loss Function.”  WhatIsSixSigma.net
[Link] “Taguchi Loss Function.”  Lean Six Sigma Definition.
[Link] “Taguchi loss function.”  Six Sigma Ninja, November 11, 2019.
[Link] “The Taguchi loss function.”  Thomas Lofthouse; Work Study, November 1, 1999.
[Link] “Taguchi’s Loss Function.”  Elsmar.com.
[Link] “Principles of Robust Design.”  Dr. Nicolo Belavendram; International Conference on Industrial Engineering and Operations Management, July 2012.
[Link] “Robust Design Seminar Report.”  Shyam Mohan; November 2002.

 
Jody W. Phelps, MSc, PMP®, MBA
Principal Consultant
JayWink Solutions, LLC
jody@jaywink.com
]]>
<![CDATA[Making Decisions – Vol. VII:  Perils and Predicaments]]>Wed, 18 Nov 2020 15:30:00 GMThttp://jaywinksolutions.com/thethirddegree/making-decisions-vol-vii-perils-and-predicaments     Regardless of the decision-making model used, or how competent and conscientious a decision-maker is, making decisions involves risk.  Some risks are associated with the individual or group making the decision.  Others relate to the information used to make the decision.  Still others are related to the way that this information is employed in the decision-making process.
     Often, the realization of some risks increases the probability of realizing others; they are deeply intertwined.  Fortunately, awareness of these risks and their interplay is often sufficient to mitigate them.  To this end, several decision-making perils and predicaments are discussed below.
     Group decision-making can become quite time-consuming.  The iterative, and sometimes combative, nature of group processes can cause a conclusion to seem rather elusive.  This risk can be reduced through careful consideration of the group’s membership prior to any discussion or debate (see “Who should be in the decision-making group?” in Vol. IV).
     To reduce the time required to reach a decision, a unilateral decision may be made (see “Who should make the final decision?” in Vol. IV).  This decision rule, however, is fraught with risk.  The type of person who insists on making unilateral decisions is often the same type that suffers from overconfidence.  Overconfidence can be a byproduct of pure arrogance, but could also have less loathsome origins.  Underdeveloped judgment or limited context-specific experience may cause a decision-maker to miss important cues or underestimate the severity of a situation.
     Another source of overconfidence is unreliable or inappropriate heuristics.  Prior satisfactory results using such methods could be merely coincidental, but lead to the application of heuristics to an ever-wider range of situations.  Once their limit of applicability – the range often being quite narrow – has been exceeded, heuristics can become dangerous, particularly when they build confidence without building competence.
     Whatever the source of overconfidence, it may lead a decision-maker to accept incomplete, or even suspect, information and limit analysis of the information available.  The overconfident decision-maker believes s/he can overcome these limitations with his/her superior wisdom and insight.  Unfortunately, this is rarely true.
     If a leader chooses not to make a decision, s/he may delegate the responsibility to a subordinate.  Delegating carelessly could have serious consequences.  The delegate could suffer from inexperience and overconfidence, could be ethically challenged, or could be unduly influenced by politics (office or otherwise), “celebrity,” or other irrelevant factors.  Delegate responsibly!  (See “Of Delegating and Dumping”).
 
     The framing of a decision is a source of significant risk.  A decision definition that cites problems and risks typically prompts a very different response than one described in terms of challenges and opportunities.  Framing that influences a decision may be inadvertent, but may also be a deliberate attempt to manipulate decision-makers.  Those that present a situation to decision-makers may be able to predict their responses based on known affinities, biases, or obligations.  Using this insight, the subordinate can prime decision-makers to act in the manner the subordinate finds most favorable by framing the decision in such a way that triggers a bias, creates the impression of a potential violation of an obligation, or otherwise guides their thinking.  This is a tactic used to “tip the scales” toward the desired outcome while making the ostensible decision-maker an unwitting accomplice.
     A decision-maker may anchor on a solution early in the process.  It may be the first idea s/he heard, or the most glamorous, high-tech solution available.  It could be the solution that requires his/her particular expertise and is, therefore, the most familiar or comfortable.  It could also be the solution that s/he was primed to choose by the presentation of the problem and solution options.  Anchoring often leads to confirmation bias, where the decision-maker accepts only information that is confirmatory of the foregone conclusion.  If the decision-maker cannot be dislodged from an anchor, it usually results in a suboptimal, or satisficing, solution.
     Employing a decision-making method such as the Analytic Hierarchy Process (AHP) (see Vol. III), can facilitate an anchor-free decision.  The pairwise comparisons used in AHP are difficult to manipulate in order to reach a predetermined outcome.  As decision complexity, or number of criteria, increases, the difficulty of manipulation increases rapidly.
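     For readers unfamiliar with the mechanics, the sketch below (Python) illustrates the pairwise-comparison weighting at the heart of AHP for an assumed three-criterion example, using the common geometric-mean approximation of priorities; see Vol. III for the full method, including the consistency check.

    import math

    criteria = ["cost", "quality", "lead time"]
    # Pairwise comparison matrix: entry [i][j] states how much more important
    # criterion i is than criterion j on the 1-to-9 scale (illustrative judgments).
    A = [
        [1.0, 3.0, 5.0],
        [1/3, 1.0, 2.0],
        [1/5, 1/2, 1.0],
    ]

    geo_means = [math.prod(row) ** (1 / len(row)) for row in A]
    weights = {c: g / sum(geo_means) for c, g in zip(criteria, geo_means)}
    # The weights sum to 1; a consistency ratio below 0.1 is typically required.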
     Failure to recognize that “do nothing” is a valid option to be considered in many scenarios can place a satisfactory outcome in jeopardy.  As mentioned in Vol. I, there may be no alternative under consideration that will result in an improvement relative to the status quo.  The decision to implement something could result in the waste of significant resources – time, energy, and money.  Incurring these opportunity costs may preclude the pursuit of advantageous projects in favor of misguided endeavors, or “optics.”
 
     Even after a decision is made, perils remain.  Hindsight bias manifests in similar fashion to confirmation bias.  While confirmation bias causes the selective acceptance of information concurrent with the decision-making process, hindsight bias causes selective acceptance of historical information to justify a past decision.  This usually occurs when an investigation of the causes of unsatisfactory results is initiated; the decision-maker wants to defend his/her decision as “correct” despite a disappointing outcome.
     Defense of past decisions – to “save face” or other reasons – can lead to an escalation of commitment, where future decisions are influenced by those made in the past rather than objective analysis.  This is closely related to the sunk costs fallacy, where continued commitment is justified – irrationally – by past expenditures.  Both assume that conditions will change in such a way that will turn a failing endeavor into a success, or that it can be turned around if only it receives more investment.  It is important to learn from “bad” decisions, cut your losses, and move on.
 
     If you’d like to add to this list of decision-making perils and predicaments, feel free to leave a comment below.  Personal insights that help the community learn and grow are always welcome.
 
     For a directory of “Making Decisions” volumes on “The Third Degree,” see “Vol. I:  Introduction and Terminology.”
 
References
[Link] “An Overview of Decision-Making Models.”  Hanh Vu, ToughNickel, February 23, 2019.

 
Jody W. Phelps, MSc, PMP®, MBA
Principal Consultant
JayWink Solutions, LLC
jody@jaywink.com
]]>
<![CDATA[Performance Continuity via Effective Pass-Down]]>Wed, 04 Nov 2020 15:30:00 GMThttp://jaywinksolutions.com/thethirddegree/performance-continuity-via-effective-pass-down     Myriad tools have been developed to aid collaboration of team members that are geographically separated.  Temporally separated teams receive much less attention, despite this type of collaboration being paramount for success in many operations.
     To achieve performance continuity in multi-shift operations, an effective pass-down process is required.  Software is available to facilitate pass-down, but is not required for an effective process.  The lowest-tech tools are often the best choices.  A structured approach is the key to success – one that encourages participation, organization, and consistent execution.
Participation
     The topic of participation in pass-down encompasses three elements that must be defined and monitored:  the groups that need to participate, whom will represent that group in pass-down activities, and the types of activity required.
     There are various groups present in any organization.  Some are common participants in pass-down activities; others are often overlooked in this process.  Several are presented below, with examples of topics that each may discuss during pass-down.

Operations:  The operations group should share any information related to their ability to meet production expectations.
  • productivity (production rate), causes of reductions or interruptions
  • absenteeism, staffing changes, and other personnel issues
  • special orders, trials, material availability
  • production schedule and any changes
  • any unusual occurrences, process instability, or process deviations (“work-arounds”)
  • incoming material quality concerns
Maintenance:  The maintenance team should discuss any significant occurrences and ongoing situations.
  • conditions to be monitored
  • temporary repairs in place
  • projects or repairs in progress
  • PM/PdM status (complete/overdue)
  • downtime report – what occurred during the shift, what equipment remains down, and what is required to return it to service
  • any matter that requires priority
Engineering:  The engineering department should discuss the status and requirements of any ongoing investigations or projects under its direct supervision.
  • equipment installation or upgrade
  • prototyping in progress
  • trial runs or designed experiments
  • data collection
  • conditions or occurrences under investigation
Quality:  The quality assurance group should share information about any non-standard conditions or processes.
  • active quality alerts
  • calibration status (complete/overdue)
  • active deviations
  • process validations or other layouts in progress
  • material sorting in progress (incoming, WIP, F/G)
  • rework in progress
Logistics/Materials:  The materials management group should ensure awareness of any activity that could affect the other groups’ abilities to perform planned activities.
  • material shortages
  • urgent deliveries being tracked
  • inventory/cycle count accuracy (shrinkage, unreported scrap, etc.)
Safety, Health, Environment:  The SHE team should discuss any unresolved issues and recent occurrences.
  • injuries, spills, and exposures
  • accidents and near-misses
  • ergonomic evaluations in progress or needed
  • drills to be conducted
  • system certifications needed
  • training requirements
Security:  The security team should share information about any non-routine activities that require their attention.
  • illegal or suspicious activity in or near the facility
  • contractor access required (e.g. special areas of the facility)
  • security system malfunctions (e.g. camera, lighting, badge access) and response plan

     Depending on the size of the organization and the specific activities in which it engages, the groups mentioned may or may not be separate entities.  The example topics, however, remain valid regardless of the roles or titles of the individuals participating in the information exchange.  Likewise, there is significant overlap in the information needs of these groups.  For example, engineering and quality routinely work together to design and conduct experiments and collect data.  The maintenance group may make adjustments to a process to support an experiment, while the operations group runs the equipment.  If a third-party contractor is needed, notifying security in advance could accelerate the contractor’s access to the facility, keeping the experiment on schedule.
     Each group conducting a pass-down may have a single representative that compiles and exchanges information; there could also be multiple members of any group participating.  The appropriate representation depends on the number of topics to be discussed and each member’s knowledge of them.  In general, additional participants are preferable to incomplete information.
     The activity of each participant is prescribed by the process chosen or required by circumstances.  In-person discussions are often the most effective; each participant has the opportunity to ask questions, get clarifications, and discuss alternatives.  Face-to-face meetings should be supported by documentation for future reference.  Also, when in-person discussions cannot be held (i.e. non-overlapping shifts), the documentation, on its own, serves as the pass-down.  For this reason, providing coherent written pass-downs is a critical habit that all team members must form.
     The form of pass-down used defines the general activity requirements.  The example topics, provided above for several groups, allude to some specific activities that may be required of participants.  Further specific tasks are defined by the way in which pass-down information is organized, collected, and shared.  This is our next topic.
 
Organization
     There are multiple facets to the organization of an effective pass-down process.  The organization of people was discussed in “Participation,” above.  Next is the organization of information on the written pass-down reports.  Formatted report templates are useful for this purpose.  A template ensures that no category of relevant information is overlooked; each has a designated space for recording.  Omission of any information type deemed necessary will be immediately obvious; the writer can be prompted to complete the form before details fade from memory.
     As mentioned in the introduction, software can be used to facilitate this process, but the form need not be digital – at least not at first.  Hand-written forms should be scanned and archived for use when an historical record is needed.  A whiteboard could be used for short-term notes; the board can be erased and reused once the information has been recorded in the permanent pass-down record.
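     Whether the form is paper or digital, the underlying structure is the same.  The Python sketch below illustrates one possible record structure, with field names drawn loosely from the topics above; the names and required sections are assumptions, not a prescribed format.

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class PassDownReport:
        date: str
        shift: str
        group: str                          # e.g. "Operations", "Maintenance"
        author: str
        production_summary: str = ""
        open_issues: List[str] = field(default_factory=list)
        equipment_down: List[str] = field(default_factory=list)
        priorities_for_next_shift: List[str] = field(default_factory=list)

    def missing_sections(report: PassDownReport) -> List[str]:
        # Flag required narrative sections left blank before the shift leaves.
        required = {"production_summary": report.production_summary}
        return [name for name, value in required.items() if not value.strip()]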
     Whatever “tools” your organization chooses to use, keep the pass-down process as straightforward as possible.  The simpler it is to understand and the easier it is to use, the better.  Less required training and more understanding result in wider use and greater value to the organization.
     When overlapping shifts permit in-person communication, the pass-down should be treated as any other meeting – an agenda, a schedule, and required attendance should be defined.  See “Meetings:  Now Available in Productive Format!” for additional guidance.
     Pass-down may be conducted by each department or function separately; however, overlapping information needs behoove an organization to consider alternative attendance schemes.  Conducting pass-down per production line, or value stream, is a common example.  All of the groups that support operations on that production line – maintenance, quality, engineering, logistics, etc. – share information with all others simultaneously.  This is an efficient method to ensure that the information required to effectively support operations has been provided to those who need it.
 
Consistency
     The greater the number of groups conducting pass-down activities within an organization, the more difficult it is to execute consistently.  Some elements of consistent execution have been discussed, such as a preformatted report, meeting agenda, and required attendance.  To ensure that consistency, once attained, is maintained, higher-level managers should audit – or attend regularly – as many pass-down meetings as possible.  Other members should also audit meetings they do not regularly attend.  Auditing other meetings facilitates the comparison of differing techniques, or styles, in pursuit of best practices.  All groups should be trained to use the best practices, restoring consistent execution.
     Pass-down meetings are also opportunities for “micro-training” or reminders on important topics.  Examples include safety tips (e.g. proper use of PPE, slip/fall prevention), reminders to maintain 5S and to be on the lookout for “lean wastes.”  This time could also be used for a brief description of a new policy, review of financial performance, or to introduce new team members.
     Micro-training may not be an obvious component of pass-down procedures, but it is helpful in encouraging consistency in many practices.  It is an excellent opportunity to highlight safety, teamwork, and other topics that are important to the group.  Achieving consistency in other activities, in turn, reinforces consistent execution of pass-down activities.
 
     Effective pass-down is an essential practice that is often neglected and allowed to disintegrate.  It does not capture headlines and it’s not glamorous; celebrities do not make public service announcements about it.  It does not require high-tech tools or billionaire investors to bring it to fruition.  Too often, the latest technology, social media platform, or other high-profile whiz-bang distracts leaders and managers from the things that are truly relevant to the performance of an organization.  Solid fundamentals, fanatically cultivated and relentlessly developed, pave the way to organizational success; losing sight of them can be disastrous.  Effective pass-down is one of these fundamentals.
 

     For assistance with establishing a robust pass-down process, repairing a broken, neglected system, or solidifying other fundamental practices, contact JayWink Solutions for a consultation.
 
References
[Link] “A Critical Shift: How Adding Structure Can Make Shift Handovers More Effective.”  Tom Plocher, Jason Laberge, and Brian Thompson; Automation.com, March 2012.
[Link] “Pass the shift baton effectively.”  Paul Borders; Plant Services, April 18, 2017.
[Link] “7 Strategies for Successful Shift Interchange.”  Maun-Lemke, LLC, 2006.
[Link] “Reducing error and influencing behavior.”  UK Health and Safety Executive, 1999.

 
Jody W. Phelps, MSc, PMP®, MBA
Principal Consultant
JayWink Solutions, LLC
jody@jaywink.com
]]>
<![CDATA[It Starts in the Parking Lot]]>Wed, 21 Oct 2020 14:30:00 GMThttp://jaywinksolutions.com/thethirddegree/it-starts-in-the-parking-lot     A person’s first interaction with a business is often his/her experience in its parking lot.  Unless an imposing edifice dominates the landscape, to be seen from afar, a person’s first impression of what it will be like to interface with a business is likely formed upon entering the parking lot.  It is during this introduction to the facility and company that many expectations are formed. “It” starts in the parking lot.  “It” is customer satisfaction.
     Retail-business owners and managers are often aware of concerns about parking accommodations, but most give it little thought and exceedingly few make it a priority.  Perhaps their awareness does not extend to the impact it could have on the bottom line.  The parking lot experience is a potential differentiator for competitive brick-and-mortar businesses and is critical when in competition with online retailers.  If “buy local” campaigns are to succeed, visits to hometown retailers must be convenient and pleasant.
     These experiences are often controlled by property managers instead of managers of the businesses served by the parking infrastructure.  Whether you own, manage, or lease the real property, there are a number of factors to be considered when building, selecting, or improving the physical space that serves one or more retail businesses.
  • The number of entrances and exits should ensure sufficient traffic flow for unencumbered access to the business.
  • Entrances and exits should be located such that adjacent traffic patterns do not hinder entry to and departure from the property.
  • Aisles should be of sufficient width to allow two-way traffic flow.
  • Spaces should be of sufficient width to allow vehicle doors to be opened widely to facilitate vehicle egress and ingress without damaging adjacent vehicles.
  • Spaces should be of sufficient length to accommodate modern vehicles without protruding into the aisles.
  • There should be a sufficient number of spaces available to accommodate the anticipated peak volume of customers plus employees’ vehicles.  Whenever possible, this number should be increased to accommodate forecast error, business growth, contractors or other visitors, etc.
  • Frivolous structures that force parking spaces to be further from building entrances than necessary should be avoided.
  • Sufficient lighting should be provided to create a safe environment for patrons visiting in low-light hours.
  • Safety should be enhanced by avoiding obstructed views to the parking lot from any building or other area of the lot.  Unnecessary structures, recessed areas, tall bushes, and other places in which miscreants can hide should be avoided.
  • Provisions for managing the effects of inclement weather should also be in place.  Effective drainage is needed to prevent puddles from forming; snow and ice removal must also be accommodated if the climate calls for it.
  • The area should be kept free of trash, debris, or other refuse.
  • Repairs should be made promptly to potholes, curbing, signage, lighting, and so on.  Doing so can prevent further degradation, vehicle damage, customer mishaps that may lead to personal injury, or other liabilities.
 
     There may be additional considerations that require the attention of service providers, depending on the nature of the business.  Automotive services, in particular, must thoroughly consider their parking infrastructure.  In addition to providing ample space for customers, the facility must allow employees to access vehicles conveniently and to protect them from damage.  This type of business may also need additional parking to accommodate vehicles held for extended periods, beyond that required for current and prospective customers’ vehicles.  Reviewing the nature of traffic flow through your business could pay dividends in efficiency and customer satisfaction.
 
     The factors outlined for retail businesses also apply to manufacturing facilities.  However, the magnitude, or priority, of some may differ due to the nature of manufacturing operations.  Because customer satisfaction begins with employee satisfaction, it remains an important topic.  Also, being closed to the public, more or less, complicates the matter somewhat.
     An example of a change in magnitude is the number of entrances and exits needed to ensure sufficient traffic flow.  Retail and service businesses typically experience variable, dispersed traffic flows.  Peak periods may be predictable, however, such as Friday afternoon at a bank, and can be managed in various ways.  Peak traffic flows at manufacturing facilities are highly predictable, with few options for mitigation.  Increasing the number of entrances and exits will reduce congestion at the beginning and end of each shift.
     Related to peak traffic flow, the number of spaces available is a common failure among many manufacturing facilities.  Scheduling overlapping shifts doubles the required number of spaces, yet few facilities provide sufficient parking to effectively support the chosen shift model.  This leads to parking in unauthorized places, such as spaces designated for visitors or handicapped parking, pedestrian walkways, or any space large enough to fit a vehicle.  Employees often return at their first opportunity to remove their vehicles from unauthorized spaces.  Both actions place the employee at risk of disciplinary action by the employer that created the problem!
     Security measures in place at many manufacturing facilities violate the “rules” of proper parking area design.  Specifically, parking areas are often located at an excessive distance from facility entrances.  Entry is often further impeded by sensitive landscaping, fences, or other mostly unnecessary structures that create circuitous paths that serve only to retard access by essential personnel while providing little, if any, legitimate security.
 
     Online businesses must also monitor their parking lots, though they are virtual.  A landing page should be inviting; the entire site should be easy to navigate and quick to load.  It should be easy to complete the desired transactions and clear when they are complete.  Online businesses may have an advantage over their real-world counterparts – no driving necessary –  but an unpleasant parking lot experience can be just as damaging in the virtual world as in the physical.
 
     The importance of parking infrastructure is routinely overlooked.  “It’s just the open space outside the building” is an unfortunately common mindset.  Careful consideration of design decisions will reveal potential impacts to customers and employees.  As early as possible, the detrimental effects should be eliminated and advantageous features accentuated.  Accessibility and required security protocols should be simple enough for all comers to understand.  It is often “the little things” that tip the scale in your favor when trying to attract customers and talented employees.  There are many components of customer satisfaction, but it always starts in the parking lot.
 
     Contact JayWink Solutions to discuss improvements in the customer experience at your business or other operations-related needs.
 
Jody W. Phelps, MSc, PMP®, MBA
Principal Consultant
JayWink Solutions, LLC
jody@jaywink.com
]]>
<![CDATA[The War on Error – Vol. VIII:  Precontrol]]>Wed, 07 Oct 2020 14:30:00 GMThttp://jaywinksolutions.com/thethirddegree/the-war-on-error-vol-viii-precontrol     There is some disagreement among quality professionals whether or not precontrol is a form of statistical process control (SPC).  Like many tools prescribed by the Shainin System, precontrol’s statistical sophistication is disguised by its simplicity.  The attitude of many seems to be that if it isn’t difficult or complex, it must not be rigorous.
     Despite its simplicity, precontrol provides an effective means of process monitoring with several advantages (compared to control charting), including:
  • It is intended for use on the shop floor for rapid feedback and correction.
  • Process performance can be evaluated with far fewer production units.
  • No calculations are required to perform acceptance evaluations.
  • No charts are required.
  • It is not based on an assumption that data is normally distributed.
  • It is straightforward, based on specification limits.
  • It uses a simple, compact set of decision rules.
     This installment of “The War on Error” explains the use of precontrol – how process monitoring zones are established, the decision rules that guide responses to sample measurements, and the fundamental requirements of implementation.  Some potential modifications will also be introduced.
Preparations
     Successful implementation of precontrol begins with an evaluation of the process to be monitored to ensure that it is a suitable application.  Information about existing processes should be readily available for this purpose.  New processes should be carefully considered, with comparisons to similar operations, to determine suitability.  Precontrol is best-suited to processes with high capability (i.e. low variability) and stability (i.e. output drifts slowly).
     Operators’ process knowledge is critical to the success of precontrol.  They must understand the input-output relationship being monitored and be capable of making appropriate adjustments when needed.  Otherwise, their interventions are merely process “tampering” that results in higher variability and lower overall performance.  Reliable measurement systems (see Vol. IV, Vol. V) are required to support effective process control by operators without excessive intervention.
 
Process Monitoring Zones
     Process monitoring zones, or precontrol zones, are based on the tolerance range of the process output, with the target value centered between the upper and lower specification limits (USL, LSL) of a bilateral (two-sided) tolerance.  The relationship of this tolerance range to input variables should be known in order to make effective process adjustments when needed.  Tolerance parallelograms, or other techniques, can be used for this purpose.
     To define the precontrol zones for a bilateral tolerance, divide the tolerance range into four equal segments.  The two segments that flank the target value are combined to create the green zone; measurements that fall in this zone are acceptable.  In normally-distributed data, this 50% of tolerance encompasses approximately 86% of process output.
     The remaining segments, each containing 25% of the tolerance, are the yellow zones.  The yellow zones are often called warning zones, because measurements that fall in these zones may indicate that the process has drifted and requires adjustment.  In normally-distributed data, approximately 7% of process output will fall in each of the yellow zones.  These estimates assume that the width of the normal distribution matches the tolerance range and its mean equals the target value.
     The precontrol red zones encompass all values outside the specification limits.  A graphical representation of bilateral tolerance precontrol zones is shown in Exhibit 1.
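     For readers who prefer arithmetic to pictures, the sketch below (Python; the function names and example tolerance are illustrative, not part of any standard) computes the green-zone boundaries for a bilateral tolerance and classifies a single measurement.

```python
def bilateral_green_zone(lsl, usl):
    """Return the green-zone boundaries for a bilateral tolerance.

    The tolerance range (USL - LSL) is divided into four equal segments;
    the two segments flanking the target form the green zone.  Values
    between the green zone and a specification limit fall in a yellow
    zone; values outside the specification limits are red.
    """
    quarter = (usl - lsl) / 4.0
    return lsl + quarter, usl - quarter

def classify(measurement, lsl, usl):
    """Classify a measurement as 'green', 'yellow', or 'red'."""
    green_lo, green_hi = bilateral_green_zone(lsl, usl)
    if measurement < lsl or measurement > usl:
        return "red"
    if green_lo <= measurement <= green_hi:
        return "green"
    return "yellow"

# illustrative tolerance: 10.0 +/- 0.2
LSL, USL = 9.8, 10.2
print(bilateral_green_zone(LSL, USL))       # (9.9, 10.1)
for x in (10.05, 9.85, 10.25):
    print(x, classify(x, LSL, USL))         # green, yellow, red
```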
     There are three possible precontrol zone configurations for unilateral (one-sided) tolerances.  The first, called “zero is best,” simply divides the tolerance range by two.  The green zone encompasses the 50% of tolerance nearest zero (lower half) and a single yellow zone encompasses the remaining tolerance, up to the USL (upper half).  A single red zone encompasses all values above the USL.  This configuration is used for measurements that cannot produce negative values, such as surface roughness or yield loss.  “Zero is best” precontrol zones are shown graphically in Exhibit 2.
     The remaining two configurations are, essentially, mirror images of each other.  In one case, the LSL is defined, with no upper bound specified (“more is better”); the other defines the USL, while no lower bound is specified (“less is better”).  For each case, the tolerance range used to define precontrol zones is the difference between the specification limit (LSL or USL) and the best output that can be expected from the process (highest or lowest).  A single yellow zone encompasses 25% of the “tolerance” nearest the specification limit.
     The green zone includes the remaining 75% of “tolerance” and beyond.  Any measurements beyond the “best case” value should be investigated.  The expectations of the process may require adjustment, leading to recalculation of the precontrol limits.  It could also lead to the discovery of a measurement system failure.  Precontrol limits should also be reviewed as part of any process improvement project.
     The red zone in each case encompasses all values beyond the specification limit (y < LSL or y > USL).  “More is better” and “less is better” configurations of unilateral tolerance precontrol are presented graphically in Exhibit 3.
     Defined precontrol zones establish the framework within which process performance is evaluated.  The remaining component defines how to conduct such evaluations via decision rules that guide setup validation or qualification, run-time evaluations, and sampling frequency.  These decision rule sets are presented in the following section.
 
Decision Rules
     The first set of decision rules define the setup qualification process.  To approve a process for release to production, the measurements of five consecutive units must be in the green zone.  The setup qualification guidelines are as follows:
  • If five consecutive measurements are in the green zone, release process to production.
  • If one measurement is in a yellow zone, reset green count to zero.
  • If two consecutive measurements are yellow, adjust the process and reset green count to zero.
  • If one measurement is in a red zone, adjust the process and reset green count to zero.
     Repeat measurements until five consecutive measurements are in the green zone.  If significantly more than five measurements are regularly required to release a process, an investigation and improvement project may be warranted.
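     A minimal sketch of the setup-qualification logic follows; it operates on a sequence of zone results (green/yellow/red) rather than raw measurements, and the function name and printed actions are illustrative assumptions.

```python
def qualify_setup(zones, required_greens=5):
    """Apply the setup-qualification rules to a sequence of zone results
    ('green', 'yellow', 'red').  Returns the 1-based index of the unit at
    which the process may be released, or None if not yet qualified."""
    greens, prev_yellow = 0, False
    for i, zone in enumerate(zones, start=1):
        if zone == "green":
            greens, prev_yellow = greens + 1, False
            if greens >= required_greens:
                return i                      # five consecutive greens: release
        elif zone == "yellow":
            if prev_yellow:                   # two consecutive yellows
                print(f"unit {i}: adjust the process")
            greens, prev_yellow = 0, not prev_yellow
        else:                                 # red
            print(f"unit {i}: adjust the process")
            greens, prev_yellow = 0, False
    return None

# example: yellow, yellow (adjust), then five greens -> released at unit 7
print(qualify_setup(["yellow", "yellow", "green", "green",
                     "green", "green", "green"]))
```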
     Once the process has been released to production, a new set of decision rules is followed.  For run-time evaluations, periodic samples of two consecutive units are measured.  Response to the sample measurement results is according to the following guidelines:
  • If both measurements are green, continue production.
  • If one measurement is green and the other yellow, continue production.
  • If both measurements are in the same yellow zone, stop production and adjust the process.
  • If one measurement is in each yellow zone of a bilateral tolerance, stop production and investigate the cause of the excessive variation.  Eliminate or control the cause and recenter the process.
  • If either measurement is in a red zone, stop production and adjust the process.
     After any production stoppage or adjustment, return to the setup qualification guidelines to approve the process for release to production.  All units produced since the last accepted sample should be quarantined until a Material Review Board (MRB), or other authority, has evaluated the risk of defective material being present.  This evaluation may result in a decision to scrap, sort, or ship the quarantined material.
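     The run-time rules can be expressed just as compactly.  In the sketch below (a hypothetical rendering, not a standard implementation), yellow-zone results carry a side label so the “same yellow zone” and “opposite yellow zones” cases of a bilateral tolerance can be distinguished.

```python
def evaluate_sample(z1, z2):
    """Apply the run-time decision rules to a two-unit sample.
    Zones are 'green', 'yellow-low', 'yellow-high', or 'red'."""
    zones = (z1, z2)
    if "red" in zones:
        return "stop production and adjust the process"
    yellows = [z for z in zones if z.startswith("yellow")]
    if len(yellows) == 2:
        if yellows[0] == yellows[1]:
            return "stop production and adjust the process"          # drift
        return "stop production and investigate excessive variation"
    return "continue production"          # two greens, or one green + one yellow

print(evaluate_sample("green", "yellow-high"))      # continue production
print(evaluate_sample("yellow-low", "yellow-low"))  # stop production and adjust the process
```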
     The final decision rule defines the sampling frequency or interval.  The interval between samples can be defined in terms of time or quantity of output.  The target sampling interval is one-sixth the interval between process adjustments, on average.  Stated another way, the goal is to sample six times between required adjustments.  For example, a process that, on average, produces 60 units per hour and runs for three hours between required adjustments should be sampled once every 30 minutes or once per 30 units of production.  The sampling frequency may change over time as a result of learning curve effects, improvement projects, changes in equipment reliability, or other factors that influence process performance.
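     The sampling-interval arithmetic in the example above amounts to a single division, shown here with the same illustrative figures.

```python
units_per_hour = 60              # average production rate
hours_between_adjustments = 3    # average run time between required adjustments

# target: six samples between required adjustments
interval_minutes = hours_between_adjustments * 60 / 6             # 30 minutes
interval_units = units_per_hour * hours_between_adjustments / 6   # 30 units
print(interval_minutes, interval_units)
```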
 
     The simplicity of precontrol, demonstrated by the division of the tolerance range into precontrol zones and easily-applied decision rules, makes it an attractive tool for implementation in production departments.  Administration of such a tool by those responsible for production maximizes its utility; it takes advantage of the expertise of process managers and operators and eliminates delays in response to signs of trouble in the process.
 
Modifications to Precontrol
     The formulation presented above may be called “classical precontrol;” it serves as the baseline system to which modifications can be made.  F. E. Satterthwaite’s original formulation (1954) prescribed a green zone containing 48% of the tolerance range and yellow zones containing 26% of the tolerance range each.  The 50%/25% convention was adopted for ease of recall and calculation in an era preceding electronic aids.  If such aids are in use, the choice of zone sizes is nearly imperceptible in practice.  Either scheme can be chosen, but it should be used consistently throughout an organization to avoid confusion.
     Two-stage precontrol retains the precontrol zone definitions and the setup qualification and sampling frequency rules of classical precontrol, but expands the run-time evaluation rules.  In two-stage precontrol, responses to sample measurement results are in accordance with the following guidelines:
  • If both measurements are in the green zone, continue production.
  • If either measurement is in a red zone, stop production and adjust the process.
  • If either measurement is in a yellow zone, measure the next three units.
    • If three measurements in the expanded sample (i.e. 5 units) are green, continue production.
    • If three measurements in the expanded sample are yellow, stop production and adjust the process.
    • If any of the measurements in the expanded sample are red, stop production and adjust the process.
     Proponents of two-stage precontrol claim that the existence of a yellow-zone measurement in a two-unit sample is an ambiguous result.  Therefore, further sampling is required to determine the condition of the process.
     Modified precontrol is a hybrid of classical and two-stage precontrol and control charts.  The setup qualification and sampling frequency rules of classical precontrol are retained.  The run-time evaluation rules are the same as those used in two-stage precontrol.  Precontrol zone definitions are adapted from Shewhart’s control limits.  Green zone boundaries are defined by ±1.5σ (standard deviations of process performance), while yellow zones occupy the remaining tolerance range (±3σ).  This version negates one of the key advantages of classical precontrol – namely, no calculations required for evaluation – but the resulting sensitivity may be needed in some circumstances.  With increased sensitivity, however, comes a higher rate of false alarms (Type I error) that prompt adjustments that may be unnecessary.
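     A short sketch of the modified-precontrol boundaries follows, assuming an estimate of the process mean and standard deviation is available; whether values beyond ±3σ or beyond the specification limits are treated as red is left to the implementer.

```python
def modified_precontrol_zones(mean, sigma):
    """Return (green_lo, green_hi, yellow_lo, yellow_hi) with the green zone
    at +/-1.5 sigma about the process mean and yellow zones extending to
    +/-3 sigma, per the modified-precontrol scheme described above."""
    return (mean - 1.5 * sigma, mean + 1.5 * sigma,
            mean - 3.0 * sigma, mean + 3.0 * sigma)

# illustrative process: mean 10.0, standard deviation 0.05
print(modified_precontrol_zones(10.0, 0.05))   # (9.925, 10.075, 9.85, 10.15)
```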
     While other modification schemes exist, a thorough treatment is not the objective of this presentation.  If one of the formulations outlined above does not suit your needs, the presentation should suffice as an introduction to possible modifications.  To find a more suitable process control method, the cited references, or other sources, can be used for further research.
     As stated at the outset, charts are not required to implement any of the formulations of precontrol described above.  However, a precontrol chart can be a useful addition to the basic tool.  Charting provides historical data that can be applied to process improvement efforts or to detect excessively frequent adjustments, called tampering.  A precontrol chart can also provide an indication of operators’ effectiveness in making adjustments or the need for additional training.
     The final note to be made is less a modification than a recommendation.  The previous discussion of precontrol has been based on its application to process outputs.  While this is useful, the power of precontrol is maximized when it is applied to process inputs whenever practical.  This proactive approach can prevent high input variability from negatively affecting process output, reducing the number of samples, stoppages, and adjustments required to produce the demanded quantity of output.
 
 
     Despite its advantages, precontrol seems to incense some vocal advocates of SPC and control charting.  The criticisms of precontrol are not discussed here in detail for the following reasons:
  • A discussion of statistics beyond the scope of this presentation would be required.
  • Many of the criticisms are unfairly lodged, demonstrating the purveyors’ bias (or, more charitably, loyalty) toward their chosen process monitoring tool.
  • Precontrol is not presented as a substitute for SPC in all applications.
  • The criticisms are often more academic than pragmatic.
  • If precontrol helps your organization perform at the desired level, the criticisms are irrelevant.
Instead, modifications were presented to address some legitimate concerns with precontrol.  Other alternatives are also available; consult the references or other sources for more information.
     The decision to implement precontrol, in any configuration, or any of the alternatives, requires careful consideration of the process, operators, and customers.  Implementing an inappropriate or ineffective process control method can damage the credibility of all future efforts.  This may lead to process tampering or, on the opposite end of the spectrum, neglect.  The application must be monitored to ensure the system supports the ongoing effectiveness of operators in maintaining required quality and productivity levels.
 

     Contact JayWink Solutions for assistance in evaluating processes, establishing a precontrol system, training, or other process monitoring, control, and improvement needs.
 
     For a directory of “The War on Error” volumes on “The Third Degree,” see “Vol. I:  Welcome to the Army.”
 
References
[Link] “Strategies for Technical Problem Solving.”  Richard D. Shainin; Quality Engineering, 1993.
[Link] “An Overview of the Shainin System™ for Quality Improvement.”  Stefan H. Steiner, R. Jock MacKay, and John S. Ramberg;  Quality Engineering, 2008.
[Link] “Precontrol.”  Wikilean.
[Link] “Pre-Control: No Substitute for Statistical Process Control.”  Steven Wachs; WinSPC.com.
[Link] “The Power of PRE-Control.”  Hemant P. Urdhwareshe; Symphony Technologies.
[Link] “Pre-Control May be the Solution.”  Jim L. Smith; Quality Magazine, September 2, 2009.
[Link] “Using Control Charts or Pre-control Charts.”  Carl Berardinelli; iSixSigma.
[Link] “The theory of ‘Pre-Control’: a serious method or a colourful naivity?”  N. Logothetis; Total Quality Management, Vol 1, No 2, 1990.
[Link] “Precontrol.”  Beverly Daniels and Tim Cowie; IDEXX Laboratories, 2008.
[Link] “Shewhart Charts & Pre-Control:  Rivals or Teammates?”  Tripp Martin; ASQC Statistics Division Newsletter, Vol 13, No 3, 1992.
[Link] “Pre-control and Some Simple Alternatives.”  Stefan H. Steiner; Quality Engineering, 1997.
[Link] “Pre-control versus X̄ and R Charting:  Continuous or Immediate Improvement?”  Dorian Shainin and Peter Shainin; Quality Engineering, 1989.
[Link] World Class Quality.  Keki R. Bhote; American Management Association, 1991.

 
Jody W. Phelps, MSc, PMP®, MBA
Principal Consultant
JayWink Solutions, LLC
jody@jaywink.com
]]>
<![CDATA[The War on Error – Vol. VII:  The Shainin System]]>Wed, 23 Sep 2020 14:30:00 GMThttp://jaywinksolutions.com/thethirddegree/the-war-on-error-vol-vii-the-shainin-system     Lesser known than Six Sigma, but no less valuable, the Shainin System is a structured program for problem solving, variation reduction, and quality improvement.  While there are similarities between these two systems, some key characteristics lie in stark contrast.
     This installment of “The War on Error” introduces the Shainin System, providing background information and a description of its structure.  Some common problem-solving tools will also be described.  Finally, a discussion of the relationship between the Shainin System and Six Sigma will be presented, allowing readers to evaluate the potential for implementation of each in their organizations.
Origins of the Shainin System
     Development of the Shainin System began in the 1940s while Dorian Shainin worked with aeronautics companies to resolve production issues.  In the intervening decades, the system has evolved, with additional tools developed, through Shainin’s experience with numerous operations and a vast range of quality and reliability problems.
     Use of the Shainin System to improve process performance is based upon the following tenets:
  • A single root cause is the source of the greatest amount of variation (the “Red X”); it may be an independent variable or the interaction effects of multiple variables.  The “Pink X” and “Pale Pink X” are the second and third largest contributors to variation, respectively.
  • Breakthrough improvements in process performance are achieved only by identifying and controlling the Red X.  The Pink X and Pale Pink X may also require improvement.  The Red X is critical because the output varies as a function of the squares of the input variations, according to the relation shown after this list.
  • Random variation does not exist.  Several input factors (Xs) affect the variation in output (Y).  With sufficient resources, all Xs could be identified.
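     The relation referenced in the first item above is, presumably, the root-sum-of-squares combination of independent input variations, with $\sigma_Y$ the output variation and $\sigma_{X_i}$ the variation contributed by each input:

$$\sigma_Y = \sqrt{\sigma_{X_1}^2 + \sigma_{X_2}^2 + \cdots + \sigma_{X_n}^2}$$

Because the contributions add in quadrature, the largest single contributor (the Red X) dominates the total output variation.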
     The Shainin System is also known by a more descriptive moniker, “Red X Problem Solving,” owing to its focus on the single largest contributor to process variation.  It has been developed to be statistically rigorous, while keeping complex statistics in the background.  This expands the use of tools from highly-trained statisticians to operators and engineers of all skill levels.
     Red X Problem Solving is best suited for process development projects with the following characteristics:
  • medium- to high-volume process,
  • data are readily available,
  • statistical methods are in common use, and
  • process intervention is difficult.
 
The FACTUAL Framework
     Much like Six Sigma, the Shainin System uses an acronym to guide application of its methodology.  Fortunately, this one is easy to pronounce!  The FACTUAL framework is comprised of the steps Focus, Approach, Converge, Test, Understand, Apply, and Leverage.
     Each component of the FACTUAL framework is presented in a summary table below.  The top cell of each table contains a description of activities associated with that step.  The lower left cell contains the deliverables expected to result from those activities.  The lower right cell lists examples of tools commonly used to achieve the objectives of each step.  Those designated with a “*” are discussed in the next section.
     The FACTUAL framework exhibits a similar flow to DMAIC – problem definition → data collection → analysis → implementation → monitoring – but the Shainin System takes much greater advantage of graphical analysis than does Six Sigma.  Use of graphical tools facilitates implementation by non-mathematicians and expedites identification of the Red X.  The Shainin System toolbox is the topic to which we now turn.
 
Tools of the Shainin System
     More than 20 tools have been developed as part of the Shainin System, using a foundation of rigorous statistics.  The tools are designed to be intuitive and easy to use by practitioners without extensive statistical training.  Many are unique, while a few have been adapted or expanded for use within the Shainin System.  The descriptions below only preview the tools available.  Consult the references cited, or other sources, for additional information.  Links to additional information are provided to assist this effort.
 
 [Focus]
5W2H:  5W2H is an abbreviation for seven generic questions that can be used to define the scope of a problem.  The 5 Ws are Who? What? Where? When? and Why?  The 2 Hs are How? and How much?  As simple as this “tool” is, it must be used with caution.  If the answers to these questions are perceived as accusatory, the improvement effort may lose the support of critical team members, hindering identification of the Red X and implementation of a solution.
Links to additional information on tools used in the Focus step:
Eight Disciplines (8D)
 
 [Approach]
Isoplot:  An Isoplot is used to evaluate the reliability of a measurement system; it is an alternative method to variable gauge R&R.  To construct and interpret an Isoplot (see Exhibit 1):
1) Measure the characteristic of interest on each of 30 pieces.  Repeat the measurement on each piece.
2) Plot the pairs of measurements; the first measurement on the horizontal axis, the second on the vertical axis.
3) Draw a 45° line (slope = 1) through the scatter plot created in step 2.
4) Draw two lines, each parallel to the 45° line, that bound all data points.
5) Determine the perpendicular distance between the lines drawn in step 4.  This distance represents the measurement system variation, ΔM.
6) Determine the process variation, ΔP.  This is the range of measurements along the horizontal axis.
7) Calculate the Discrimination Ratio:  ΔP/ΔM.
8) If (ΔP/ΔM) ≥ 6, accept the measurement system; otherwise, seek improvements before using the measurement system in the search for the Red X.
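     The construction lends itself to a quick numerical check.  The sketch below (synthetic data; a rough rendering of the graphical method, not a replacement for it) computes ΔM as the perpendicular width of the band of points about the 45° line and ΔP as the spread along the horizontal axis.

```python
import math
import random

def isoplot_ratio(first, second):
    """Discrimination ratio DeltaP / DeltaM from paired measurements.
    DeltaM: perpendicular distance between the two 45-degree lines that
    bound all (first, second) pairs.  DeltaP: range of the first readings."""
    diffs = [y - x for x, y in zip(first, second)]
    delta_m = (max(diffs) - min(diffs)) / math.sqrt(2)
    delta_p = max(first) - min(first)
    return delta_p / delta_m

# synthetic example: 30 parts spread over ~1 unit, re-measured with small gauge error
random.seed(1)
true_values = [10 + random.uniform(-0.5, 0.5) for _ in range(30)]
first = [v + random.gauss(0, 0.02) for v in true_values]
second = [v + random.gauss(0, 0.02) for v in true_values]

ratio = isoplot_ratio(first, second)
print(f"Discrimination ratio: {ratio:.1f}",
      "- accept" if ratio >= 6 else "- improve the measurement system")
```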
 [Converge]
Component Search:  This technique can be called, simply, “part swapping.”  By exchanging components in an assembly and tracking which combinations perform acceptably and which do not, the source of the problem can be traced to a single component or combination of components.  To conduct a Component Search, select a set of the best-performing assemblies (BoB, “best of the best”), and an equal number of the worst-performing (WoW, “worst of the worst”).  As parts are swapped, poor performance will follow the “problem parts.”
Multivari Analysis:  A multivari chart is a type of run chart that displays multiple measurements within samples over time.  For example, a characteristic is measured in multiple locations on each part.  Samples consist of consecutively-produced parts, collected at regular intervals.  As can be seen in Exhibit 2, plotting these data allows rapid evaluation of positional (within-part), cyclical (part-to-part), and temporal (time-to-time) variations.
Links to additional information on tools used in the Converge step:
Concentration Diagram
Group Comparison
Solution Tree – example shown in project context
 
 [Test]
B vs. C:  Comparing the current process (C) with a proposed “better” process (B) develops confidence that the new process will truly perform better than the existing one.  Data collected while running the B process should indicate improved performance compared to C.  When C is again operated, performance should return to the previous, unsatisfactory level.  If process performance cannot be predictably controlled in this fashion, further investigation is required.  Several iterations may be required to sufficiently demonstrate that the new process, B, outperforms the existing process, C.
Variable Search:  Conducted in similar fashion to Component Search, Variable Search is employed with five or more suspect variables.  “High” and “low” values are chosen for each variable, analogous to “good” and “bad” parts.  Comparisons of performance are then made, varying one suspect variable at a time, to incriminate or eliminate each variable.  If a characteristic is “the same” in “good” and “bad” assemblies, it is unlikely to be a significant contributor to the issue experienced.
Links to additional information on tools used in the Test step:
ANOVA on Ranks
Full Factorial Experiments or Full Factorial Example
 
 [Understand]
Tolerance Parallelogram:  Realistic tolerances to be used for the Red X variable to achieve Green Y output can be determined using a Realistic Tolerance Parallelogram Plot.  To construct and interpret a Tolerance Parallelogram (see Exhibit 3):
1) Plot Red X vs. Green Y for 30 random parts.
2) Draw a median line, or regression line, through the data.
3) Draw two lines, parallel to the median line, that bound “all but one and a half” of the data points.  The vertical distance between these two lines represents 95% of the output variation caused by factors other than the variable plotted.  If this distance is large, the variable may not be as significant as suspected.  Perhaps it is a Pink X or Pale Pink X instead of the Red X.
4) Draw horizontal lines at the upper and lower limits of the customer requirement for the output (Y).
5) From the intersection of the lower customer requirement line and the lower parallelogram boundary, draw a vertical line to the horizontal axis.
6) From the intersection of the upper customer requirement line and the upper parallelogram boundary, draw a vertical line to the horizontal axis.
7) The intercepts of the lines drawn in steps 5 and 6 with the horizontal axis represent the realistic tolerance limits for input X to achieve Green Y output.
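     A rough numerical analogue of this graphical construction is sketched below.  It substitutes a least-squares line for the median line and offsets the parallel boundaries by approximately 95% of the residual spread in place of the “all but one and a half” rule; both substitutions, and the assumption of a positive slope, are simplifications for illustration only.

```python
import random

def realistic_tolerance(x, y, y_lsl, y_usl):
    """Approximate realistic tolerance limits for input X given customer
    limits on output Y, via a fitted line and a parallel residual band."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
         / sum((xi - xbar) ** 2 for xi in x))            # slope (assumed > 0)
    a = ybar - b * xbar                                  # intercept
    resid = sorted(abs(yi - (a + b * xi)) for xi, yi in zip(x, y))
    h = resid[int(0.95 * n) - 1]                         # ~95% band half-width
    x_low = (y_lsl - a + h) / b    # lower Y limit meets lower boundary line
    x_high = (y_usl - a - h) / b   # upper Y limit meets upper boundary line
    return x_low, x_high

# synthetic data: Y driven mostly by X, plus some unexplained variation
random.seed(2)
x = [random.uniform(0, 10) for _ in range(30)]
y = [2.0 * xi + 1.0 + random.gauss(0, 0.5) for xi in x]
print(realistic_tolerance(x, y, y_lsl=5.0, y_usl=19.0))
```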
Links to additional information on tools used in the Understand step:
Tolerance Analysis
 
 [Apply]
Lot Plot:  A Lot Plot can be used to determine if a lot of parts should be accepted, sorted, or investigated further.  To construct and interpret a Lot Plot:
1) Measure 50 pieces, in 10 subgroups of 5 pieces each, from a single lot.
2) Create a histogram, with 7 – 16 divisions, of all 50 measurements.
3) From the histogram, determine the type of distribution (normal, bimodal, non-normal).
4) Calculate the average (Xbar) and range (R) of measurements in each subgroup.
5) Calculate the average of the subgroup averages (Xdbl-bar) and the average of the subgroup ranges (Rbar).
6) Calculate upper and lower lot limits (ULL, LLL) from Xdbl-bar and Rbar (one common formulation appears in the sketch following this procedure).
7) Evaluate acceptability of lot.  For normally-distributed data, use the following guidelines:
  • All sample data within specification limits → Accept.
  • ULL and LLL are within specification limits → Accept.
  • ULL or LLL is outside specification limits → Acceptance decision is made by Material Review Board.
For non-normally-distributed data, use the following guidelines:
  • ULL and LLL are within specification limits → Investigate non-normality or repeat sampling.
  • ULL and/or LLL is outside specification limits → Sort.
For bimodally-distributed data, investigate the cause of bimodality or repeat sampling.
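     A minimal sketch of steps 4 – 7 follows.  The lot-limit formulas used, ULL/LLL = Xdbl-bar ± 3(Rbar/d2) with d2 = 2.326 for subgroups of five, are a commonly cited form and an assumption here; the data are synthetic.

```python
import random

def lot_limits(subgroups, d2=2.326):
    """Estimate lot limits from subgroup averages and ranges, using the
    control-chart estimate sigma ~= Rbar / d2 (d2 = 2.326 for n = 5)."""
    xbars = [sum(s) / len(s) for s in subgroups]
    ranges = [max(s) - min(s) for s in subgroups]
    x_dbl_bar = sum(xbars) / len(xbars)
    r_bar = sum(ranges) / len(ranges)
    sigma = r_bar / d2
    return x_dbl_bar - 3 * sigma, x_dbl_bar + 3 * sigma

# illustrative lot: 10 subgroups of 5 measurements each
random.seed(3)
subgroups = [[round(random.gauss(25.0, 0.1), 3) for _ in range(5)]
             for _ in range(10)]
lll, ull = lot_limits(subgroups)
LSL, USL = 24.6, 25.4
print(f"LLL = {lll:.3f}, ULL = {ull:.3f}")
print("Accept" if LSL <= lll and ull <= USL else "Refer to MRB or sort")
```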
Links to additional information on tools used in the Apply step:
Precontrol
 
 [Leverage]
To leverage what is learned from an improvement project, use of several tools may be repeated.  For example, similar processes may have a common Red X influencing their performance.  The tolerances required to maintain acceptable output from each, however, may be very different.  Each will require its own tolerance parallelogram.  In general, leveraging knowledge gained in one project does not preclude thorough analysis in another.
FMEA:  Though not a component of the Shainin System per se, a thoughtfully-constructed and well-maintained Failure Modes and Effects Analysis is a rich source of ideas for “leverage targets.”  For example, all products or processes that share a failure mode could potentially reap significant gains by replicating the improvements validated in a single project.
 
            The selection of tools at each stage of a project will depend on the specific situation.  A problem’s complexity (i.e. the number of variables involved) or the ability to disassemble and reassemble products without damage are examples of circumstances that will guide the use of tools.  Understanding the appropriate application of each is critical to project success.
 
The Shainin System and Six Sigma
            Vehement advocates of the Shainin System or Six Sigma often frame the relationship as an either/or proposition.  This adversarial stance is counterproductive.  Each framework provides a useful problem-solving structure and a powerful set of tools.  There is nothing to suggest that they are – or should be – mutually exclusive.
            Six Sigma is steeped in statistical analysis, while the Shainin System prefers to exploit empirical investigations.  Six Sigma traces a problem from input to output (X → Y), while the Shainin System employs a Y → X approach.  Six Sigma requires training specialists, while the Shainin System aims to put the tools to use on the shop floor.
            Despite their differences, these systems share common objectives, namely, process improvement and customer satisfaction.  In the shared objectives lies the potential for a cohesive, unified approach to problem-solving that capitalizes on the strengths of both frameworks.  One such unification proposes using Shainin tools within a DMAIC project structure (Exhibit 4).  Six Sigma tools could also be used to support a FACTUAL problem-solving effort.  Both frameworks are structured around diagnostic and remedial journeys, further supporting the view that they are complements rather than alternatives.
     The hierarchical structure of “Red X Problem Solvers” is another example of a similarity to Six Sigma that also highlights a contrast.  Students of the Shainin System begin as Red X Apprentices and advance to become Journeymen.  These titles have strong “blue-collar” connotations and are familiar to most “shop floor” personnel that are encouraged to learn and apply the tools.  Like Six Sigma, the Shainin System also has a designation for those with substantial experience that have been trained to coach others – the Red X Master.
 
 
     One need not have a sanctioned title to begin learning and applying useful tools.  Readers are encouraged to consult the references and links to learn more.  Also, JayWink Solutions is available for problem-solving assistance, training development and delivery, project selection assistance, and other operations-related needs.  Contact us for an assessment of our partnership potential.
 
For a directory of “The War on Error” volumes on “The Third Degree,” see “Vol. I:  Welcome to the Army.”
 
References
[Link] “Strategies for Technical Problem Solving.”  Richard D. Shainin; Quality Engineering, 1993.
[Link] “An Overview of the Shainin System™ for Quality Improvement.”  Stefan H. Steiner, R. Jock MacKay, and John S. Ramberg;  Quality Engineering, 2008.
[Link] “Introduction to the Shainin method.”  Wikilean.
[Link] “Shainin and Six Sigma.”  Shainin The Red X Company.
[Link] “Using Shainin DOE for Six Sigma: an Indian case study.”  Anupama Prashar; Production Planning & Control, 2016.
[Link] “Developing Effective Red X Problem Solvers.”  Richard D. Shainin; Shainin The Red X Company, April 18, 2018.
[Link] “Shainin Methodology:  An Alternative or an Effective Complement to Six Sigma?”  Jan Kosina; Quality Innovation Prosperity, 2015.
[Link] “The Role of Statistical Design of Experiments in Six Sigma:  Perspectives of a Practitioner.”  T. N. Goh; Quality Engineering, 2002.
[Link] World Class Quality.  Keki R. Bhote; American Management Association, 1991.
[Link] “Training for Shainin’s approach to experimental design using a catapult.”  Jiju Antony and Alfred Ho Yuen Cheng; Journal of European Industrial Training, 2003.

 
Jody W. Phelps, MSc, PMP®, MBA
Principal Consultant
JayWink Solutions, LLC
jody@jaywink.com
]]>
<![CDATA[The War on Error – Vol. VI:  Six Sigma]]>Wed, 09 Sep 2020 14:30:00 GMThttp://jaywinksolutions.com/thethirddegree/the-war-on-error-vol-vi-six-sigma     Despite the ubiquity of corporate Six Sigma programs and the intensity of their promotion, it is not uncommon for graduates to enter industry with little exposure and less understanding of their administration or purpose.  Universities that offer Six Sigma instruction often do so as a separate certificate, unintegrated with any degree program.  Students are often unaware of the availability or the value of such a certificate.
     Upon entering industry, the tutelage of an invested and effective mentor is far from guaranteed.  This can curtail entry-level employees’ ability to contribute to company objectives, or even to understand the conversations taking place around them.  Without a structured introduction, these employees may struggle to succeed in their new workplace, while responsibility for failure is misplaced.
     This installment of “The War on Error” aims to provide an introduction sufficient to facilitate entry into a Six Sigma environment.  May it also serve as a refresher for those seeking reentry after a career change or hiatus.
Six Sigma Defined
     “Six Sigma” became a term of art in the 1980s when Motorola used the statistical concept as the foundation for a company-wide quality goal and problem-solving strategy.  Popularity of the methodology began to skyrocket in the 1990s, thanks in large part to General Electric’s very public adoption and energetic promotion.  Widespread adoption followed, leading to Six Sigma becoming the de facto standard for quality management.  It became an actual standard in 2011, with the International Organization for Standardization’s (ISO) release of “ISO 13053: Quantitative Methods in Process Improvement – Six Sigma.”
     The foundation of a Six Sigma program, and the origin of its name, is process performance monitoring.  Process output is assumed to be normally distributed with the mean (μ, mu) centered between the specification limits.  The distance from the mean to either specification limit is given in standard deviations (σ, sigma), a measure of process variation.  The goal is to manage processes such that this distance is (at least) six standard deviations, or 6σ, as shown in Exhibit 1.  The total width of the distribution between specification limits (USL – LSL) is then 12σ, resulting in approximately two defects per billion opportunities.
     Understanding that no process is perfectly stable with its output mean centered between its specification limits, the developers of the Six Sigma methodology accounted for the difference between short- and long-term performance.  Empirical evidence indicated that process means could be expected to shift up to 1.5σ from center when long-term operational data is analyzed.  This results in a 4.5σ distance from the mean to one of the specification limits and a higher reject rate.  The shift essentially eliminates defects from the further specification limit, yielding an anticipated reject rate one-half that of a “4.5σ process,” or 3.4 defects per million opportunities (DPMO).  This is the target reject rate quoted by Six Sigma practitioners.  Exhibit 2 presents the mean shift and resultant reject rate graphically.
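     The defect rates quoted above can be verified with a few lines of Python using the standard normal tail probability; this is a quick check, not part of any Six Sigma standard.

```python
from math import erfc, sqrt

def upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * erfc(z / sqrt(2))

# centered 6-sigma process: defects in both tails beyond +/-6 sigma
centered_dpb = 2 * upper_tail(6.0) * 1e9                   # ~2 per billion
# 1.5-sigma mean shift: near tail at 4.5 sigma, far tail at 7.5 sigma
shifted_dpmo = (upper_tail(4.5) + upper_tail(7.5)) * 1e6   # ~3.4 DPMO
print(f"Centered: {centered_dpb:.1f} defects per billion opportunities")
print(f"Shifted:  {shifted_dpmo:.1f} defects per million opportunities")
```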
     The true goal of any process manager is to achieve zero defects, however unrealistic this may be.  Six Sigma process control seeks to come as close to this target as is economically and technologically feasible.  It engenders vastly more aggressive objectives than “traditional” process control that typically employs μ ± 3σ specification limits.
 
     The term “Six Sigma” is an umbrella covering two key methodologies, each with a unique application and purpose.  The DMAIC (“duh-MAY-ik”) methodology is used for process improvement; because it is the most frequently employed methodology, users often use “Six Sigma” and “DMAIC” interchangeably.  The DMADV (“duh-MAD-vee”) methodology is used for product or process design.  A description of each, including the five-step process that forms its acronym and some example tools used during each, follows.
 
DMAIC – for Process Improvement
     Existing processes that underperform can be improved using the most common Six Sigma methodology, DMAIC.  The acronym is derived from the five steps that comprise this problem-solving process:  Define, Measure, Analyze, Improve, and Control.  Exhibit 3 presents a basic flowchart of the DMAIC process.  Note that the first three steps are iterative; measurement and analysis may reveal the need for redefinition of the problem situation.
     For brevity, each phase of DMAIC is presented in a summary table.  The top cell of each table contains a description of the phase activities and purpose.  The lower left cell contains typical outputs, or deliverables, of that phase.  These items will be reviewed during a “phase-gate” or similar style review and must be approved to obtain authorization to proceed to the next phase.  The lower right cell lists examples of tools commonly used during this phase.  Full descriptions of the tools will not be provided, however.  Readers should consult the references cited in this post, or other sources, for detailed information on the use of the tools mentioned, as well as other available tools.
     As shown in Exhibit 3, a review should be conducted at this point in the process to verify that the problem definition remains accurate and sufficient.  If adjustments are needed, return to the Define phase to restate the problem situation.  Modify, append, or repeat the Measure phase, as necessary, and analyze any new data collected.  Repeat this cycle until measurement and analysis support the problem definition.

     Though lessons-learned activity and replication are focused on processes other than that which was the subject of the Six Sigma project, they are included in the Control phase discussion for two key reasons:
1) They are vitally important to maximizing the return on investment by limiting the amount of redundant work required for additional processes to capitalize on the knowledge gained during the project.
2) They take place at the conclusion of a project; therefore, the Control phase discussion is the most appropriate to append with these activities.
     Some descriptions of the DMAIC methodology will include lessons learned and replication as additional steps or follow-up; others make no mention of these valuable activities.  To minimize confusion and encourage standardization of best practices, they are considered elements of the Control phase for purposes of our discussion.
 
DMADV – for Product or Process Design
     It will be evident in the following discussion that there are many parallels between DMAIC and DMADV.  The five steps that comprise DMADV are Define, Measure, Analyze, Design, and Validate.  The first three steps, it may be noted, have the same names as those in DMAIC, but their execution differs because each process has its own purpose and objectives.  Overall execution of DMADV, however, closely parallels DMAIC, as can be seen by comparing the flowchart of DMADV, presented in Exhibit 4, with that of DMAIC in Exhibit 3.
     The fundamental difference between DMAIC and DMADV is that DMADV is proactive while DMAIC is reactive.  Another way to think of this distinction is that DMAIC is concerned with problems, while DMADV is focused on opportunities.  Though other “acronymed” approaches to proactive analysis exist, DMADV is the predominant methodology.  For this reason, it is frequently used interchangeably with the umbrella term Design for Six Sigma (DFSS), as will be done here [DMADV doesn’t roll off the tongue quite so eloquently as DMAIC or DFSS (“dee-eff-ess-ess”)].
     The phases of DMADV are presented in summary tables below.  Like the DMAIC summaries, the lists are not exhaustive; additional information can be found in the references cited or other sources.
     Review results of analyses with respect to the opportunity definition.  If revisions are needed, return to the Define phase and iterate as necessary.
     Though an organization’s efforts are most effective when the inclination is toward proactive behavior, or preventive measures, DFSS is in much less common use than DMAIC.  The lingering bias toward reactive solutions is reflected in the greater quantity and quality of resources discussing DMAIC; DFSS is often treated as an afterthought, if it is mentioned at all.  This provides a significant opportunity for any organization willing to expend the effort to execute a more thorough development process prior to launch.  A proactive organization can ramp up and innovate, satisfying customers’ evolving demands, while reactive competitors struggle with problems that were designed into their operations.
 
Belts and Other Roles
     Perhaps the most visible aspect of Six Sigma programs is the use of a martial arts-inspired “belt” system.  Each color of belt is intended to signify a corresponding level of expertise in the use of Six Sigma tools for process improvement.  The four main belts in a Six Sigma program are Yellow, Green, Black, and Master Black.  Other colors are sometimes referenced, but their significance is not universally accepted; therefore, they are excluded from this discussion.  Responsibilities of the belted and other important roles are described below.
  • Yellow Belt (YB):  “Front-line” employees are often yellow belts, including production operators in manufacturing and service providers in direct contact with customers.  Yellow belts typically perform the operations being studied and collect the required data under the supervision of a green or black belt.  As project team members, yellow belts provide critical insight into the process under review, suggest relevant test conditions, and offer potential improvements in addition to their regular responsibilities.
  • Green Belt (GB):  Green belts lead projects, guiding yellow belts in improvement efforts; they are also team members for larger projects led by black belts.  Green belts are often process engineers and production supervisors, affording them knowledge of the process under review and responsibility for its performance; projects are undertaken in addition to day-to-day operational duties.  The varied role of a green belt may require data collection, analysis, problem solving and solution development, training, and more.
  • Black Belt (BB):  Black belts are responsible for delivering value to the organization through the execution of Six Sigma projects.  As such, they are typically dedicated to the Six Sigma program with no peripheral responsibilities.  A black belt acts as a project leader, coach and mentor to green belts, resource coordinator to facilitate cross-functional teamwork, and presenter to report progress and gain approval to proceed at phase-gate reviews.
  • Master Black Belt (MBB):  Master black belts provide support to the entire Six Sigma program, from training and mentoring black and green belts to overseeing company-wide multi-site improvement initiatives.  MBBs also provide support in the selection and assignment of projects that are appropriate for a black or green belt, assist in the use of advanced statistics or other tools, identify training needs, and deliver the required training.  Smaller organizations, or those with nascent programs, may rely on external resources to fill this role.
     Though the presentation of material will likely differ among certifying organizations, the definition of responsibilities and required abilities for each belt are mostly consistent.  Standard competency requirements for a number of skills are summarized in Exhibit 5.
     The belts in an organization are directly responsible for executing Six Sigma improvement projects. To be successful, they need the support of other essential roles in the program.  These are described below.
  • Project Sponsor:  A project sponsor supports improvement projects undertaken in his/her area of responsibility by providing the necessary resources and removing barriers to execution.  These responsibilities require a certain level of authority; the project sponsor is typically a “process owner,” such as a production manager.  The sponsor participates in all phase-gate reviews and provides approval for the project to proceed.  S/he monitors the project to ensure its timely completion and evaluates it for potential replication elsewhere in the organization.
  • Deployment Manager:  The deployment manager administers the Six Sigma program.  This includes managing the number of belts in the organization and coordinating their assignments with their functional managers.  The deployment manager is also responsible for any facility resources dedicated to the program.
  • Champion:  A Six Sigma champion is typically a high-ranking, influential member of the quality function in the organization.  The champion is the chief promoter of the Six Sigma initiative within the organization, establishing the deployment strategy.  The champion also defines and advocates for business objectives to be achieved with Six Sigma.
     Every member of an organization contributes to the success or failure of Six Sigma initiatives, whether or not they have been given one of the titles described above.  Each person has the ability to aid or hinder efforts made by others.  Effective communication throughout the organization is critical to the success of a new Six Sigma program.  Explaining the benefits to the organization and to individuals can turn skeptics into supporters.  The more advocates a program has, the greater its chance of success.
 
Additional Considerations
     There are three important caveats offered here.  The first is common in many contexts – launching a Six Sigma program does not ensure success. Put another way, desire does not guarantee ability.  A successful program requires the development of various disparate skills.  “Expert-level” skill in statistical analysis, for example, provides no indication of the ability to develop and implement creative and innovative solutions.
     Second, achieving six sigma performance has the potential to be a Pyrrhic victory.  That is, a misguided effort can be worse than no effort at all.  Analysis failures that lead to poorly-chosen objectives can divert resources from the most useful projects, causing financial performance to continue to decline while reports indicate improving process performance.  Many organizations have abandoned their Six Sigma programs as administration costs exceed the gains achieved.
     The third caveat is the “opposite side of the coin” from the first.  Any individual interested in improving process performance or product design need not delay for lack of a “belt.”  Certification does not guarantee success (caveat #1) and lack of certification does not suggest imminent failure.  This introductory post, other installments of “The Third Degree,” past and future, and various other resources can guide your improvement efforts and development journey.  No specialized, status-signaling attire is required.
 
     This installment of “The War on Error” series was written with two basic goals:
1) provide an introduction that will allow those without experience or formal training to understand and participate in conversations that take place in Six Sigma environments, and
2) provide a list of tools accessible to beginners to be used as an informal development plan.
Readers for which the first goal was achieved are encouraged to take full advantage of the second.  Your development is your responsibility; do not wait to be invited to the “belt club.”
 
     JayWink Solutions is available for training plan development and delivery, project selection and execution assistance, and general problem-solving.  Contact us for an assessment of how we can help your organization reach its goals.
 
     For a directory of “The War on Error” volumes on “The Third Degree,” see “Vol. I:  Welcome to the Army.”
 
References
[Link] “ISO 13053: Quantitative Methods in Process Improvement – Six Sigma – Part 1:  DMAIC Methodology.”  ISO, 2011.
[Link] “ISO 13053: Quantitative Methods in Process Improvement – Six Sigma – Part 2:  Tools and Techniques.”  ISO, 2011.
[Link] “Six Sigma,” “DMAIC,” and “Design for Six Sigma.”  Wikipedia.
[Link] “Integrating the Many Facets of Six Sigma.”  Jeroen de Mast; Quality Engineering, 2007.
[Link] “The Role of Statistical Design of Experiments in Six Sigma:  Perspectives of a Practitioner.”  T. N. Goh; Quality Engineering, 2002.
[Link] “Six Sigma Fundamentals:  DMAIC vs. DMADV.”  Six Sigma Daily, June 17, 2014.
[Link] “DMADV – Another SIX SIGMA Methodology.”  What is Six Sigma?
[Link] “Six Sigma Belts.”  Jesse Allred; Lean Challenge, February 18, 2019.
[Link] “Six Sigma Belts and Their Meaning.”  Tony Ferraro; 5S Today, August 22, 2013.
[Link] The Six Sigma Memory Jogger II.  Michael Brassard, Lynda Finn, Dana Ginn, Diane Ritter; GOAL/QPC, 2002.
[Link] The New Lean Pocket Guide XL.  Don Tapping; MCS Media, Inc., 2006.
[Link] Creating Quality.  William J. Kolarik; McGraw-Hill, Inc., 1995.

 
Jody W. Phelps, MSc, PMP®, MBA
Principal Consultant
JayWink Solutions, LLC
jody@jaywink.com
]]>
<![CDATA[The War on Error – Vol. V:  Get Some R&R - Attributes]]>Wed, 26 Aug 2020 14:30:00 GMThttp://jaywinksolutions.com/thethirddegree/the-war-on-error-vol-v-get-some-rr-attributes     While Vol. IV focused on variable gauge performance, this installment of “The War on Error” presents the study of attribute gauges.  Requiring the judgment of human appraisers adds a layer of nuance to attribute assessment.  Although we refer to attribute gauges, assessment may be made exclusively by the human senses.  Thus, analysis of attribute gauges may be less intuitive or straightforward than that of their variable counterparts.
     Conducting attribute gauge studies is similar to variable gauge R&R studies.  The key difference is in data collection – rather than a continuum of numeric values, attributes are evaluated with respect to a small number of discrete categories.  Categorization can be as simple as pass/fail; it may also involve grading a feature relative to a “stepped” scale.  The scale could contain several gradations of color, transparency, or other visual characteristic.  It could also be graded according to subjective assessments of fit or other performance characteristic.
     Before the detailed discussion of attribute gauge studies begins, we should clarify why subjective assessments are used.  The most obvious reason is that no variable measurement method or apparatus exists to evaluate a feature of interest.  However, there are some variable measurements that are replaced by attribute gauges for convenience.  Variable gauges that are prohibitively expensive, operate at insufficient rates, or are otherwise impractical in a production setting, are often supplanted by less sophisticated attribute gauges.  Sophisticated equipment may be used to create assessment “masters” or to validate subjective assessments, while simple tools are used to maintain productivity.  Periodic verification ensures that quality is not sacrificed to achieve desired production volumes.
     Although direct calculations of attribute gauge repeatability and reproducibility may not be practical without the use of a software package, the assessments described below are analogous.  The “proxy” values calculated provide sufficient insight to develop confidence in the measurement system and to identify opportunities to improve its performance.
     Like variable gauge R&R studies, evaluation of attribute gauges can take various forms.  The presentation below is not comprehensive, but an introduction to common techniques.  Both types of analysis require similar groundwork to be effective; readers may want to review the “Preparation for a GRR Study” section in Vol. IV before continuing.

Attribute Short Study
     The “short method” of attribute gauge R&R study requires two appraisers to evaluate each of (typically) 30 parts twice.  This presentation uses a binary decision – “accept” or “reject” – to demonstrate the study method, though more categories could be used.  Exhibit 1 presents one possible format of a data collection and computation form.  Use of the form is described below, referencing the number bubbles in Exhibit 1.
1:  Each appraiser’s evaluations are recorded as they are completed, in random order, in the columns labeled “Trial 1” and “Trial 2.”  Simple notation, such as “A” for “accept” and “R” for “reject,” is recommended to simplify data collection and keep the form neat and legible.
2:  Decisions for each part are compared to determine the consistency of each appraiser’s evaluations.  If an appraiser reached the same conclusion both times s/he evaluated a part, a “1” is entered in that appraiser’s consistency column in the row corresponding to that part.  If different conclusions were reached, a “0” is recorded.  Below the trial data entries, the number of consistent evaluation pairs is tallied.  The proportion of parts receiving consistent evaluations is then calculated and displayed as a percentage.
3:  The standard evaluation result is recorded for each part.  The standard can be determined via measurement equipment, “expert” evaluation, or other trusted method.  The standard must be unknown to the appraisers during the study.
4:  The first two “Agreement” columns record where each appraiser’s evaluation matches the standard (enter “1”).  A part’s evaluation can only be scored a “1” for agreeing with the standard if the appraiser reached the same conclusion in both trials.  Put another way, for A=Std = 1, A=A must equal 1; if A=A = 0, A=Std is automatically “0.”
5:  Place a “1” in the “A/B” Agreement column for each part with consistent appraiser evaluations (A=A = 1 and B=B = 1) that match.  If either appraiser is inconsistent (A=A = 0 or B=B = 0) or the two appraisers’ evaluations do not match, enter “0” in this column.  Total the column and calculate the percentage of matching evaluations.
6:  The final column records the instances when both appraisers were consistent, in agreement with each other, and in agreement with the standard.  For each part that obtained these results, a “1” is entered in this column; all others receive a “0.”  Again, total the column results and calculate the percentage of parts that meet the criteria.
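     The scoring logic above can also be expressed in a few lines of Python, which may be convenient when building a spreadsheet or script version of the form.  The decision lists below are purely illustrative, not a prescribed format.

# Minimal sketch of the short-study scoring described above.
# Data structures and sample decisions are illustrative only.
a_trials = [("A", "A"), ("A", "R"), ("R", "R"), ("A", "A"), ("R", "R")]  # appraiser A: (Trial 1, Trial 2)
b_trials = [("A", "A"), ("A", "A"), ("R", "R"), ("R", "A"), ("R", "R")]  # appraiser B
standard = ["A", "A", "R", "A", "R"]                                     # trusted reference decision per part

n = len(standard)
a_consistent = [int(t1 == t2) for t1, t2 in a_trials]                    # A=A
b_consistent = [int(t1 == t2) for t1, t2 in b_trials]                    # B=B
a_std = [c * int(t[0] == s) for c, t, s in zip(a_consistent, a_trials, standard)]   # A=Std (requires A=A = 1)
b_std = [c * int(t[0] == s) for c, t, s in zip(b_consistent, b_trials, standard)]   # B=Std (requires B=B = 1)
a_b = [ca * cb * int(ta[0] == tb[0])                                     # A=B (both consistent and matching)
       for ca, cb, ta, tb in zip(a_consistent, b_consistent, a_trials, b_trials)]
total = [ab * int(ta[0] == s) for ab, ta, s in zip(a_b, a_trials, standard)]         # A=B=Std

for label, col in [("A=A", a_consistent), ("B=B", b_consistent), ("A=Std", a_std),
                   ("B=Std", b_std), ("A=B", a_b), ("A=B=Std", total)]:
    print(f"{label}: {sum(col)}/{n} = {100 * sum(col) / n:.0f}%")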

     The percentage values calculated in Exhibit 1 are proxy values; the objectives are inverted compared to variable gauge R&R studies.  That is, higher values are desirable in attribute gauge short studies.
     The consistency values (A=A, B=B) are analogous to repeatability in variable gauge studies.  Appraisers’ results could be combined to attain a single value that more closely parallels a variable study; however, this is not a standard practice.  Caution must be exercised in its use; an explicit explanation of its calculation and interpretation must accompany any citation to prevent confusion or misuse.
     A/B Agreement (A=B) is analogous to variable gauge reproducibility; it is an indication of how well-developed the measurement system is.  Better-developed attribute systems will produce more matching results among appraisers, just as is the case with variable systems.
     The composite value in the attribute study, analogous to variable system R&R, is Total Agreement (A=B=Std).  This value reflects the measurement system’s ability to consistently obtain the “correct” result over time when multiple appraisers are employed.
     While the calculations and interpretations are quite different from a variable R&R study, the insight gained from an attribute short study is quite similar.  The results will aid the identification of improvement opportunities, whether in appraiser training, refining instructions, clarifying acceptance standards, or equipment upgrades.  The attribute short study is an excellent starting point for evaluating systems that, historically, had not received sufficient attention to instill confidence in users and customers.

Effectiveness and Error Rates
     Perhaps even shorter than the short study described above, measurement system effectiveness can be calculated to provide shallow, but rapid, insight into system performance.  Measurement system effectiveness is defined as:
Effectiveness assessments are typically accompanied by calculations of miss rates and false alarm rates.  Although all of these values are defined in terms of a measurement system, they are calculated per appraiser.
     An appraiser’s miss rate represents his/her Type II error and is calculated as follows:
The false alarm rate represents an appraiser’s Type I error; it is calculated as follows:
     Appraisers’ performance on each metric is compared to threshold values – and to each other – to assess overall measurement system performance.  One set of thresholds used for this assessment is presented in Exhibit 2.  These guidelines are not statistically derived; they are empirical results.  As such, users may choose to modify these thresholds to suit their needs.
     Identification of improvement opportunities requires review of each appraiser’s results independently and in the aggregate.  For example, low effectiveness may indicate that an appraiser requires remedial training.  However, if all appraisers demonstrate low effectiveness, a problem rooted more deeply in the measurement system may be the cause.  This type of discovery is only possible when both levels of review are conducted.  More sophisticated investigations may be required to identify specific issues and opportunities.
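     For readers who prefer a computational sketch, the short Python snippet below calculates the three metrics per appraiser.  It assumes the conventional definitions: effectiveness as the share of correct decisions, miss rate as misses per evaluation of a nonconforming part, and false alarm rate as false alarms per evaluation of a conforming part.  The decision lists are hypothetical.

# Per-appraiser effectiveness, miss rate, and false alarm rate.
# Assumed conventions: effectiveness = correct decisions / total decisions;
# miss rate = misses / evaluations of nonconforming parts;
# false alarm rate = false alarms / evaluations of conforming parts.
def appraiser_metrics(decisions, standard):
    """decisions and standard are lists of "A" (accept) / "R" (reject)."""
    correct = misses = false_alarms = bad = good = 0
    for d, s in zip(decisions, standard):
        if s == "R":                 # part is actually nonconforming
            bad += 1
            if d == "A":
                misses += 1          # Type II error: accepted a bad part
        else:                        # part is actually conforming
            good += 1
            if d == "R":
                false_alarms += 1    # Type I error: rejected a good part
        correct += int(d == s)
    return {
        "effectiveness": correct / len(standard),
        "miss_rate": misses / bad if bad else 0.0,
        "false_alarm_rate": false_alarms / good if good else 0.0,
    }

print(appraiser_metrics(["A", "R", "A", "A"], ["A", "R", "R", "A"]))
# {'effectiveness': 0.75, 'miss_rate': 0.5, 'false_alarm_rate': 0.0}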

Cohen’s Kappa
     Cohen’s Kappa, often called simply “kappa,” is a measure of agreement between two appraisers of attributes.  It accounts for agreement due to chance to assess the true consistency of evaluations among appraisers.  A data summary table is presented in Exhibit 3 for the case of two appraisers, three categories, and one trial.  The notation used is as follows:
  • A1B2 = the number of parts that appraiser A placed in category 1 and appraiser B placed in category 2.
  • (A=B)1 = the number of parts that appraisers A and B agree belong in category 1.
Total agreement between appraisers, including that due to chance, is
where n = ∑Rows = ∑Cols = the total number of evaluation comparisons.  In order to subtract the agreement due to chance from total agreement, the agreement due to chance for each categorization is calculated and summed:
where Pi is the proportion of agreements and ci is the number of agreements due to chance in the ith category.  Therefore, ∑Pi is the proportion of agreements in the entire study due to chance, or pε.  Likewise, ∑ci is the total number of agreements in the entire study that are due to chance, or nε.
     To find kappa, use
     κ = (pa − pε) ∕ (1 − pε) = (na − nε) ∕ (n − nε),
where pa and na are the proportion and number, respectively, of appraiser agreements.  To validate the kappa calculation, confirm that κ ≤ pa ≤ 1.  Also, 0 ≤ κ ≤ 1 is a typical requirement.  A kappa value of 1 indicates “perfect” agreement, or reproducibility, between appraisers, while κ = 0 indicates no agreement whatsoever beyond that due to chance.  Discussion of negative values of kappa, allowed in some software, is beyond the scope of this presentation.
     Acceptance criteria within the 0 ≤ κ ≤ 1 range vary by source.  Minimum acceptability is typically placed in the 0.70 – 0.75 range, while κ > 0.90 is desirable.  If you prefer percentage notation, κ > 90% is your ultimate goal.  Irrespective of specific threshold values, a higher value of kappa indicates a more consistent measurement system.  Note, however, that Cohen’s Kappa makes no reference to standards; therefore, evaluation of a measurement system by this method is incomplete.
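     A minimal computational sketch of kappa for the two-appraiser, three-category, single-trial case follows; the contingency-table counts are hypothetical.

# Cohen's kappa for two appraisers and three categories (single trial).
# Rows = appraiser A's category, columns = appraiser B's category; counts are hypothetical.
table = [
    [18, 2, 0],   # A placed these parts in category 1; B placed them in 1, 2, 3
    [1, 15, 3],
    [0, 2, 9],
]
n = sum(sum(row) for row in table)
p_a = sum(table[i][i] for i in range(3)) / n          # observed agreement, including chance
row_totals = [sum(row) for row in table]
col_totals = [sum(table[i][j] for i in range(3)) for j in range(3)]
p_e = sum(row_totals[i] * col_totals[i] for i in range(3)) / n**2   # agreement expected by chance
kappa = (p_a - p_e) / (1 - p_e)
assert kappa <= p_a <= 1                              # validation check cited above
print(f"p_a = {p_a:.3f}, p_e = {p_e:.3f}, kappa = {kappa:.3f}")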

Analytic Method
     Earning its shorthand title of “long method” of attribute gauge R&R study, the analytic method uses a non-fixed number of samples, known reference values, probability plots, and statistical lookup tables.  A quintessential example of this technique’s application is the validation of an accept/reject apparatus (e.g. Go/NoGo plug gauge) used in production because it is faster and more robust than a more precise instrument (e.g. bore gauge).
     Data collection begins with the precise measurement of eight samples to obtain reference values for each.  Each part is then evaluated twenty times (m = 20) with the attribute gauge; the number of times each sample is accepted is recorded.  Results for the eight samples must meet the following criteria:
  • part with the smallest reference value accepted zero times (a = 0),
  • part with the largest reference value accepted twenty times (a = 20),
  • remaining parts accepted between one and nineteen times (1 ≤ a ≤ 19).
If the requirements are not met, the process should be repeated with additional samples until all criteria have been satisfied.  When adequate data have been collected, calculate the probability of acceptance for each part according to the following formula:
     From the probability calculations, a Gauge Performance Curve (GPC) is generated; the format shown in Exhibit 4 may be convenient for presentation.  However, the preferred option, for purposes of calculation, is to create the GPC on normal probability paper, as shown in Exhibit 5.  The eight (or more) data points are plotted and a best-fit line drawn through the data.  The reference values plotted in Exhibits 4 and 5 are deviations from nominal, an acceptable alternative to the actual measurement value.
     Measurement system bias can now be determined from the GPC as follows:
where Xt is the reference value at the prescribed probability of acceptance.
     Measurement system repeatability is calculated as follows:
where 1.08 is an adjustment factor used when m = 20.
     Significance of the bias is evaluated by calculating the t-statistic,
and comparing it to t0.025,df.  For this case, df = m − 1 = 19 and t0.025,19 = 2.093, as found in the lookup table in Exhibit 6.  If t > 2.093, the measurement system exhibits significant bias; potential corrective actions should be considered.
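     The sketch below fits a Gauge Performance Curve on the probit (normal-probability) scale and reads off the reference value at any prescribed probability of acceptance, mirroring the graphical procedure described above.  The reference values and acceptance counts are hypothetical, and the simple estimate Pa ≈ a∕m stands in for the handbook’s exact small-count adjustment, which is not reproduced here.

# Sketch of a Gauge Performance Curve fit on the probit (normal-probability) scale.
# Reference values (deviations from nominal) and acceptance counts are hypothetical.
import numpy as np
from scipy.stats import norm

m = 20
ref = np.array([-0.016, -0.012, -0.008, -0.004, 0.0, 0.004, 0.008, 0.012])  # reference values
accepted = np.array([0, 2, 6, 11, 15, 18, 19, 20])                          # times accepted, of m trials

pa = accepted / m                                      # simple estimate; the handbook applies an adjustment
mask = (pa > 0) & (pa < 1)                             # the probit is undefined at 0 and 1
slope, intercept = np.polyfit(ref[mask], norm.ppf(pa[mask]), 1)   # best-fit line on probit scale

def x_at(prob):
    """Reference value at a prescribed probability of acceptance, read off the fitted GPC."""
    return (norm.ppf(prob) - intercept) / slope

print("Xt at Pa = 0.50:", round(x_at(0.50), 4))        # e.g., the reference value used in the bias calculation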
     Like the previous methods described, the analytic method does not provide R&R results in the same way that a variable gauge study does.  It does, however, provide powerful insight into attribute gauge performance.  One advantage of the long study is the predictive ability of the GPC.  The best-fit line provides an estimate of the probability of acceptance of a part with any reference value in or near the expected range of variation.  From this, a risk profile can be generated, focusing improvement efforts on projects with the greatest expected value.

     Other methods of attribute gauge performance assessment are available, including variations and extensions of those presented here.  The techniques described are appropriate to new analysts, or for measurement systems that have been subject to no previous assessment, and can serve as stepping stones to more sophisticated investigations as experience is gained and “low-hanging fruit” is harvested.

     JayWink Solutions is available to assist you and your organization with quality and operational challenges.  Contact us for an independent review of your situation and action plan proposal.

     For a directory of “The War on Error” volumes on “The Third Degree,” see “Vol. I:  Welcome to the Army.”
 
References
[Link] “Measurement Systems Analysis,” 3ed.  Automotive Industry Action Group, 2002.
[Link] “Conducting a Gage R&R.”  Jorge G. Tavera Sainz; Six Sigma Forum Magazine, February 2013.
[Link] “Introduction to the Gage R & R.”  Wikilean.
[Link] “Attribute Gage R&R.”  Samuel E. Windsor; Six Sigma Forum Magazine, August 2003.
[Link] “Cohen's Kappa.”  Real Statistics, 2020.
[Link] “Ensuring R&R.”  Scott Force; Quality Progress, January 2020.
[Link] “Measurement system analysis with attribute data.”  Keith M. Bower; KeepingTAB #35 (Minitab Newsletter), February 2002.
[Link] Creating Quality.  William J. Kolarik; McGraw-Hill, Inc., 1995.

 
Jody W. Phelps, MSc, PMP®, MBA
Principal Consultant
JayWink Solutions, LLC
jody@jaywink.com
]]>
<![CDATA[The War on Error – Vol. IV:  Get Some R&R - Variables]]>Wed, 12 Aug 2020 14:30:00 GMThttp://jaywinksolutions.com/thethirddegree/the-war-on-error-vol-iv-get-some-rr-variables     While you may have been hoping for rest and relaxation, the title actually refers to Gauge R&R – repeatability and reproducibility.  Gauge R&R, or GRR, comprises a substantial share of the effort required by measurement system analysis.  Preparation and execution of a GRR study can be resource-intensive; taking shortcuts, however, is ill-advised.  The costs of accepting an unreliable measurement system are long-term and far in excess of the short-term inconvenience caused by a properly-conducted analysis.
     The focus here is the evaluation of variable gauges.  Prerequisites of a successful GRR study will be described and methodological alternatives will be defined.  Finally, interpretation of results and acceptance criteria will be discussed.
What is Repeatability and Reproducibility?
     To effectively conduct a measurement system analysis, one must first be clear on what comprises a measurement system.  Components of a measurement system include:
  • parts being measured,
  • appraisers (individuals doing the measuring),
  • measurement equipment, including any tools, fixtures, and gauges used to complete a measurement,
  • the environment (temperature, humidity, cleanliness, organization, etc.),
  • instructions, procedures, manuals, etc.,
  • data collection tools (paper and pen, computer and software, etc.),
  • any element, physical or otherwise (e.g. managerial pressure, time constraints), that can influence the performance of measurements, collection or analysis of data.
Repeatability and reproducibility each focus on one of these components.  For a GRR study to be valid, or useful, the remaining components must be held constant.
     Repeatability is an estimate of the variation in measurements induced by the measurement equipment – the equipment variation (EV).  This is also known as “within-system variation,” as no component of the system has changed.  Repeatability measures the variation in measurements taken by the same appraiser on the same part with the same equipment in the same environment following the same procedure.  Changing any of these components during the study invalidates the results.
     Reproducibility is an estimate of the variation in measurements induced by changing one component of the measurement system – the appraiser – while holding all others constant.  This is called appraiser variation (AV) for obvious reasons.  Reproducibility is also referred to as “between-systems variation” to reflect the ability to extend analysis to include multiple sites (i.e. “systems”).  This extended analysis is significantly more complex than a single-system evaluation and is beyond the scope of this presentation.
     Repeatability and reproducibility, collectively, are called “width variation.”  This term refers to the width of the normal distribution of measurement values obtained in a GRR study.  This is also known as measurement precision.  Use of a GRR study can guide improvement efforts that provide real value to an organization.  If resources are allocated to process modifications when it is the measurement system that is truly in need of improvement, the organization increases its costs without extracting significant value from the effort.  Misplaced attention can cause a superbly-performing process to be reported as mediocre – or worse.
 
Preparation for a GRR Study
     Conducting a successful GRR study requires organization and disciplined execution.  Several physical and procedural prerequisites must be satisfied to ensure valid and actionable results are obtained.  The first step is to assign a study facilitator to be responsible for preparation, execution, and interpretation.  The facilitator should have knowledge of the measurement process, the parts to be measured, and the significance of each step to be taken in preparation for and in the execution of the study.  This person will be responsible for maintaining the structure and discipline required to conduct a reliable study.
     Next, the measurement system to be evaluated must be defined.  That is, all components of the system, as discussed in the previous section, are to be identified.  This includes the environmental conditions under which measurements will be taken, part preparation requirements, relevant internal or external standards, and so on.  It is important to be thorough and accurate in this step; the definition of the system in use significantly influences subsequent actions and the interpretation of results.
     The next few steps are based directly on the system definition; they are not strictly sequential.  The stability of the system must be verified; that is, it must be known, prior to beginning a GRR study, that the components of the system will be constant for the duration of the study.  Process adjustment, tooling replacement, or procedural change that occurs mid-study will invalidate the results.
     The parts to be used in the study must be capable of withstanding several measurement cycles. If the measurement method or handling of parts required to complete it cannot be repeated numerous times without causing damage to the parts, other methods of evaluation may be required.
     Review historical data collected by the measurement system, as defined above, to confirm a normal distribution.  If the measurement data is not normally-distributed, there may be other issues (i.e. “special causes”) that require attention before a successful GRR can be conducted.  The more the data is skewed, or otherwise non-normal, the less reliable the GRR becomes.
     Once these prerequisites have been verified, the type of study to be conducted can be chosen.  A “short” study employs the range method, while a “long” study employs graphical analysis and the average and range method or the ANOVA method.  The flowchart presented in Exhibit 1 provides an aid to method selection.  Each method will be discussed further in subsequent sections on execution of GRR studies.
     The range of variation over which measurement system performance will be evaluated must also be defined.  Though other ranges can be used, we will focus on two:  process variation and specification tolerance.  Process variation is the preferred range, as it accurately reflects the realized performance of a process and its contribution to measurement variability.  The specification tolerance is a reasonable substitute in many instances, such as screening evaluations, or when comparisons of processes are to be made.
     Other procedural prerequisites include defining how the measurement sequence will be randomized and how data will be recorded (e.g. automatically or manually, forms to be used).   When the procedural activities are complete, physical preparations can begin.  This may include organizing the area around the subject measurement equipment to facilitate containment of sample parts; the study may require more part storage than is normally used, for example.  Participation of the facilitator may also require some accommodation.
     Physical preparations of sample part storage should include provisions for identification of parts.  Part identification should be known only to the facilitator, for purposes of administration and data collection, to avoid any potential influence on appraisers.
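     A facilitator might generate blinded part codes and a unique random measurement order for each appraiser and trial with a short script such as the one below; the coding scheme and counts are illustrative only.

# Sketch: blinded part codes and a unique random measurement order per appraiser and trial.
# Only the facilitator retains the code-to-part key.
import random

random.seed(42)                                       # fixed seed so the plan is reproducible
parts = [f"P{i:02d}" for i in range(1, 11)]           # true part identities (facilitator only)
codes = random.sample(range(100, 1000), k=len(parts))
key = dict(zip(codes, parts))                         # blind code -> true part ID

appraisers = ["A", "B", "C"]
trials = 3
for appraiser in appraisers:
    for trial in range(1, trials + 1):
        order = random.sample(codes, k=len(codes))    # fresh random order for each set of measurements
        print(f"Appraiser {appraiser}, Trial {trial}: {order}")
# 'key' is used by the facilitator for data reduction and is never shared with appraisers.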
     Once part storage has been prepared, samples can be collected.  Samples should be drawn from standard production runs to ensure that they have been given no special attention that would influence the study.  The number of samples to be collected will be discussed in the sections pertaining to the different study methods.
     The final preparation steps are selection of appraisers and scheduling of measurements.  Appraisers are selected from the pool identified in the system definition and represent the range of experience or skill available.  Appraisers must be familiar with the system to be studied and perform the required measurements as part of their regular duties.
     Scheduling the measurements must allow for part “soak” time in the measurement environment, if required.  The appraisers’ schedules and other responsibilities must also be considered.  The facilitator should schedule the measurements to minimize the impact on normal operations.
     With all preparations complete, the GRR study begins in earnest.  The following three sections present the methods of conducting GRR studies.
 
Range Method
     The range method is called a “short” study because it uses fewer parts and appraisers and requires fewer calculations than other methods.  It does not evaluate repeatability and reproducibility separately; instead, an approximation of measurement system variability (R&R) results.  The range method is often used as an efficient check of measurement system stability over time.
     Range method calculations require two appraisers to measure each of five parts in random order.  The measurements and preliminary calculations are presented conceptually in Exhibit 2.
The Gauge Repeatability and Reproducibility is calculated as follows:
     GRR = R̄ ∕ d2*,
where R̄ is the average range of measurements from the data table (Exhibit 2) and d2* is found in the table in Exhibit 3.  For two appraisers (m = 2) measuring five parts (g = 5), d2* = 1.19105.
     The commonly-cited value is the percentage of measurement variation attributable to the measurement system, %GRR, calculated as follows:
     %GRR = 100% * (GRR/Process Standard Deviation),
where GRR is gauge repeatability and reproducibility, calculated above, and process standard deviation is determined from long-run process data.
     Interpretation of GRR results is typically consistent with established guidelines, as follows:
  • %GRR ≤ 10%:  measurement system is reliable.
  • 10% < %GRR ≤ 30%:  measurement system may be acceptable, depending on causes of variation and criticality of measured feature to product performance.
  • %GRR > 30%:  measurement system requires improvement.
Since the range method provides only a composite estimate of measurement system performance, additional analysis or use of another GRR study method may be required to identify specific improvement opportunities.
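     The range-method arithmetic is simple enough to capture in a few lines; the measurement values and process standard deviation below are hypothetical.

# Range-method GRR sketch: two appraisers, five parts, one measurement each.
# Measurement values and the process standard deviation are hypothetical.
appraiser_a = [2.03, 1.98, 2.10, 2.05, 1.96]
appraiser_b = [2.01, 2.00, 2.07, 2.08, 1.99]
process_sigma = 0.10                    # long-run process standard deviation

ranges = [abs(a - b) for a, b in zip(appraiser_a, appraiser_b)]
r_bar = sum(ranges) / len(ranges)       # average range
d2_star = 1.19105                       # lookup value for m = 2 appraisers, g = 5 parts

grr = r_bar / d2_star
pct_grr = 100 * grr / process_sigma
print(f"GRR = {grr:.4f}, %GRR = {pct_grr:.1f}%")   # compare %GRR to the 10% / 30% guidelines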
 
Average and Range Method
     The average and range method improves on the range method by providing independent evaluations of repeatability and reproducibility in addition to the composite %GRR value.  To perform a GRR study according to the average and range method, a total of 90 measurements are recorded.  Three appraisers, in turn, measure each of ten parts in random order.  Each appraiser repeats the set of measurements twice.
     For each set of measurements, a unique random order of parts should be used, with all measurement data hidden from appraisers.  Knowledge of other appraisers’ results, or their own previous measurement of a part, can directly or indirectly influence an appraiser’s performance; hidden measurement data prevents such influence.  For the same reason, appraisers should be instructed not to discuss their measurement results, techniques, or other aspects during the study.  After completion of the study, such discussion may be valuable input to improvement efforts; during the study, it only serves to reduce confidence in the validity of the results.
     Modifications can be made to the standard measurement sequence described above to increase the efficiency of the study.  Accommodations can be made for appraisers’ non-overlapping work schedules, for example, by recording each appraiser’s entire data set (10 parts x 3 iterations = 30 measurements) without intervening measurements by other appraisers.  Another example is an adaptation to fewer than 10 parts being available at one time.  In this situation, the previously described “round-robin” process is followed for the available parts.  When additional parts become available, the process is repeated until the data set is complete.
     A typical GRR data collection sheet is shown in Exhibit 4.  This form also includes several computations that will be used as inputs to the GRR calculations and graphical analysis.
     Graphical analysis of results is an important forerunner of numerical analysis.  It can be used to screen for anomalies in the data that indicate special-cause variation, data-collection errors, or other defect in the study.  Quickly identifying such anomalies can prevent effort from being wasted analyzing and acting on defective data.  Some example tools are introduced below, with limited discussion.  More information and additional tools can be found in the references or other sources.
     The average of each appraiser’s measurements is plotted for each part in the study on an average chart to assess between-appraiser consistency and discrimination capability of the measurement system.  A “stacked” or “unstacked” format can be used, as shown in Exhibit 5.

Exhibit 5:  Average Charts

Similarly, a range chart, displaying the range of each appraiser’s measurements for each part, in stacked or unstacked format, as shown in Exhibit 6, can be used to assess the measurement system’s consistency between appraisers.

Exhibit 6:  Range Charts

     Consistency between appraisers can also be assessed with X-Y comparison plots.  The average of each appraiser’s measurements for each part are plotted against those of each other appraiser.  Identical measurements would yield a 45° line through the origin.  An example of X-Y comparisons of three appraisers displayed in a single diagram is presented in Exhibit 7.
     A scatter plot, such as the example in Exhibit 8, can facilitate identification of outliers and patterns of performance, such as one appraiser that consistently reports higher or lower values than the others.  The scatter plot groups each appraiser’s measurements for a single part, then groups these sets per part.
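     The average and range charts described above can be generated with a short script; the nested data structure and measurement values below are hypothetical.

# Sketch of stacked average and range charts from a GRR data set.
# data[appraiser][part] holds that appraiser's repeated measurements of the part (hypothetical values).
import matplotlib.pyplot as plt

data = {
    "A": {1: [2.01, 2.03, 2.02], 2: [1.97, 1.99, 1.98], 3: [2.10, 2.08, 2.11]},
    "B": {1: [2.02, 2.04, 2.03], 2: [1.98, 1.97, 1.99], 3: [2.09, 2.12, 2.10]},
    "C": {1: [2.00, 2.02, 2.01], 2: [1.96, 1.98, 1.97], 3: [2.11, 2.09, 2.12]},
}

parts = sorted(next(iter(data.values())))
fig, (ax_avg, ax_rng) = plt.subplots(2, 1, sharex=True)
for appraiser, readings in data.items():
    averages = [sum(readings[p]) / len(readings[p]) for p in parts]
    ranges = [max(readings[p]) - min(readings[p]) for p in parts]
    ax_avg.plot(parts, averages, marker="o", label=appraiser)   # average chart: between-appraiser consistency
    ax_rng.plot(parts, ranges, marker="o", label=appraiser)     # range chart: each appraiser's spread per part
ax_avg.set_ylabel("Average")
ax_rng.set_ylabel("Range")
ax_rng.set_xlabel("Part")
ax_avg.legend(title="Appraiser")
plt.show()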
     If no data-nullifying issues are discovered in the graphical analysis, the study proceeds to numerical analysis.  Calculations are typically performed on a GRR report form, such as that shown in Exhibit 9.  Values at the top of the form are transferred directly from the data collection sheet (Exhibit 4).  Complete the calculations prescribed in the left-hand column of the GRR report form; the values obtained can then be used to complete the right-hand column.  The formulas provided on both forms result in straightforward calculations; therefore, we will focus on the significance of the results rather than their computation.
     The right-hand column of the GRR report contains the commonly-cited values used to convey the effectiveness of measurement systems.  The following discussion summarizes each and, where applicable, offers potential targets for improvement.
     Equipment variation (%EV) is referred to as gauge repeatability, our first “R.”  A universally-accepted limit on repeatability is not available; judgment in conjunction with other relevant information is necessary.  If repeatability is deemed unacceptable, or in need of improvement, potential targets include:
  • within-part variation; e.g. roundness of shaft will affect diameter measurement,
  • wear, distortion, or poor design or fabrication of fixtures or other components,
  • contamination of parts or equipment,
  • damage or distortion of parts,
  • parallax error (analog gauges),
  • overheating of electronic circuits,
  • mismatch of gauge and application.
     Appraiser variation (%AV) is our second “R,” reproducibility.  Like repeatability, it must be judged in conjunction with other information.  If AV is excessive, the measurement process must be observed closely to identify its sensitivities.  Potential sources of variation include:
  • differing techniques used by appraisers to perform undefined portions of the process; (e.g. material handling),
  • quality of training, completeness and clarity of instructions,
  • physical differences of appraisers that can affect placement of parts or operation of equipment (i.e. ergonomics); (e.g. height, strength, handedness),
  • wear, distortion, or poor design or fabrication of fixtures that allows differential location of parts,
  • mismatch of gauge and application,
  • time-based environmental fluctuations; e.g. study conducted across shifts in a facility with day/night HVAC settings.
     The composite value, repeatability and reproducibility (R&R, %GRR) is often the starting point of measurement system discussions due, in large part, to the well-documented and broadly-accepted acceptance criteria (see Range Method section above).  To lower %GRR, return to EV and AV to identify potential improvements.  One of the components may be significantly larger than the other, making it the logical place to begin the search for improvements to the measurement system.
     Part variation (%PV) is something of a counterpoint to R&R.  When %GRR is in the 10 – 30% range, where the system may be acceptable, high %PV can be cited to support acceptance of the measurement system.  If equipment and appraisers contribute zero variability to measurements, %PV = 100%.  This, of course, does not occur in real-world applications; it is an asymptotic target.
     The final calculation on the GRR report, the number of distinct categories (ndc), is an assessment of the measurement system’s ability to distinguish parts throughout the range of variation.  Formally, it is “the number of non-overlapping 97% confidence intervals that will span the expected product variation.”  The higher the ndc, the greater the discrimination capability of the system.  The calculated value is truncated to an integer and should be 5 or greater to ensure a reliable measurement system.
 
     To conclude this section, three important notes need to be added.  First, nomenclature suggests that a GRR study is complete with the calculation of %EV, %AV, and %GRR.  However, %PV and ndc are included in a “standard” GRR study to provide additional insight to a measurement system’s performance, facilitating its evaluation and acceptance or improvement.
      Second, some sources refer to %GRR as the Precision to Tolerance, or P/T, Ratio.  Different terminology, same calculation.
     The final note pertains to the evaluation of a system with respect to the specification tolerance instead of the process variation, as discussed in the Preparation for a GRR Study section.  If the specification tolerance (ST) is to be the basis for evaluation, the calculations of %EV, %AV, %GRR, and %PV on the GRR report (Exhibit 9, right-hand column) are to be made with TV replaced by ST/6.  Judgment of acceptability must also be adjusted to account for the type of analysis conducted.
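     The right-hand-column calculations can be sketched as follows, assuming the common conventions: GRR = √(EV² + AV²), TV = √(GRR² + PV²), each component expressed as a percentage of TV (or of ST∕6 for a tolerance-based study), and ndc = 1.41 × (PV ∕ GRR), truncated to an integer.  Treat these specific forms as assumptions; the report form itself appears only in the exhibit.

# Sketch of the right-hand column of a GRR report, using assumed conventional formulas.
import math

def grr_report(ev, av, pv, spec_tolerance=None):
    grr = math.sqrt(ev**2 + av**2)                    # composite repeatability and reproducibility
    tv = math.sqrt(grr**2 + pv**2)                    # total variation
    basis = spec_tolerance / 6 if spec_tolerance is not None else tv   # ST/6 replaces TV for tolerance-based studies
    return {
        "%EV": 100 * ev / basis,
        "%AV": 100 * av / basis,
        "%GRR": 100 * grr / basis,
        "%PV": 100 * pv / basis,
        "ndc": int(1.41 * pv / grr),                  # truncated; should be 5 or greater
    }

print(grr_report(ev=0.039, av=0.023, pv=0.218))       # hypothetical EV, AV, PV values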
 
Analysis of Variance Method
     The analysis of variance method (ANOVA) is more accurate and more complex than the previous methods discussed.  It adds the ability to assess interaction effects between appraisers and parts as a component of measurement variation.  A full exposition is beyond the scope of this presentation; we will instead focus on its practical application.
     The additional information available in ANOVA expands the possibilities for graphical analysis.  An interaction plot can be used to determine if appraiser-part interaction effects are significant.  Each appraiser’s measurement average for each part is plotted; data points for each appraiser are connected by a line, as shown in the example in Exhibit 10.  If the lines are parallel, no interaction effects are indicated.  If the lines are not parallel, the extent to which they are non-parallel indicates the significance of the interaction.
     To verify that gauge error is a normally-distributed random variable (an analysis assumption), a residuals plot can be used.  A residual is the difference between an appraiser’s average measurement of a part and an individual measurement of that part.  When plotted, as shown in the example in Exhibit 11, the residuals should be randomly distributed on both sides of zero.  If they are not, the cause of skewing should be investigated and corrected.
     Numerical analysis is more cumbersome in ANOVA than the other methods discussed.  Ideally, it is performed by computer to accelerate analysis and minimize errors.  Calculation formulas are summarized in the ANOVA table shown in Exhibit 12.  A brief description of each column in the table follows.
  • Source:  source, or cause, of variation.
  • DF:  degrees of freedom attributed to the source, where k = the number of appraisers, n = the number of parts, and r = the number of trials.
  • SS:  sum of squares; the deviation from the mean of the source.  Calculations are shown in Exhibit 13.
  • MS:  mean square; MS = SS/DF.
  • F:  F-ratio; F = MSAP/MSe; higher values indicate greater significance.
  • EMS:  expected mean square; the linear (additive) combination of variance components.  ANOVA differentiates four components of variation – repeatability (EV), parts (PV), appraisers (AV), and interaction of appraisers and parts (INT).  The calculations used to estimate these variance components are shown in Exhibit 14.  Negative components of variance should be set to zero for purposes of the analysis.
     Finally, calculations analogous to those in the right-hand column of the GRR report used in the average and range method (Exhibit 9) can be performed.  These calculations, shown in Exhibit 15, define measurement system variation in terms of a 5.15σ spread (“width”), or a 99% range.  This range can be expanded to 99.73% (6σ spread) by substituting 5.15 with 6 in the calculations.  The ubiquity of “six sigma” programs may make this option easier to recall and more intuitive, facilitating use of the tool for many practitioners.
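     A sketch of the variance-component and width calculations follows.  The estimators used are the standard two-way random-effects forms, assumed here to match the roles described in the exhibits; negative estimates are set to zero, and widths use the 5.15σ convention noted above.  The mean squares in the example are hypothetical.

# Sketch of ANOVA variance-component estimation for a GRR study, using standard
# two-way random-effects estimators (an assumption): k appraisers, n parts, r trials.
def anova_grr(ms_appraiser, ms_part, ms_interaction, ms_error, k, n, r, width=5.15):
    ev2 = ms_error                                               # repeatability (EV^2)
    int2 = max((ms_interaction - ms_error) / r, 0.0)             # appraiser-by-part interaction
    av2 = max((ms_appraiser - ms_interaction) / (n * r), 0.0)    # appraiser (AV^2)
    pv2 = max((ms_part - ms_interaction) / (k * r), 0.0)         # part (PV^2)
    grr2 = ev2 + av2 + int2
    tv2 = grr2 + pv2
    return {
        "EV": width * ev2**0.5,          # widths: 5.15*sigma (99%); substitute 6 for a 99.73% spread
        "AV": width * av2**0.5,
        "INT": width * int2**0.5,
        "GRR": width * grr2**0.5,
        "PV": width * pv2**0.5,
        "%GRR": 100 * (grr2 / tv2) ** 0.5,
    }

# Hypothetical mean squares for 3 appraisers, 10 parts, 3 trials:
print(anova_grr(ms_appraiser=0.048, ms_part=1.10, ms_interaction=0.006, ms_error=0.004, k=3, n=10, r=3))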
     The notes at the conclusion of the Average and Range Method section are also applicable to ANOVA.  The additional calculations are shown, with ANOVA nomenclature, in Exhibit 16.  The acceptance criteria also remain the same.  The advantage that ANOVA provides is the insight into interaction effects that can be explored to identify measurement system improvement opportunities.
     The three methods of variable gauge repeatability and reproducibility study discussed – range method, average and range method, and ANOVA – can be viewed as a progression.  As measured features become more critical to product performance and customer satisfaction, the measurement system requires greater attention; that is, more accurate and detailed analysis is required to ensure reliable performance.
     The progression, or hierarchy, of methods is also useful for those new to GRR studies, as it allows basic concepts to be learned, then built upon.  Only an introduction was feasible here, particularly with regards to ANOVA.  Consult the references listed below, and other sources on quality and statistics, for more detailed information.
 
     JayWink Solutions awaits the opportunity to assist you and your organization with your quality and operational challenges.  Feel free to contact us for a consultation.
 
     For a directory of “The War on Error” volumes on “The Third Degree,” see “Vol. I:  Welcome to the Army.”
 
References
[Link] “Statistical Engineering and Variation Reduction.”  Stefan H. Steiner and R. Jock MacKay;  Quality Engineering, 2014.
[Link] “An Overview of the Shainin System™ for Quality Improvement.”  Stefan H. Steiner, R. Jock MacKay, and John S. Ramberg;  Quality Engineering, 2008.
[Link] “Measurement Systems Analysis,” 3ed.  Automotive Industry Action Group, 2002.
[Link] “Introduction to the Gage R & R.”  Wikilean.
[Link] “Two-Way Random-Effects Analyses and Gauge R&R Studies.”  Stephen B. Vardeman and Enid S. VanValkenburg; Technometrics, August 1999
[Link] “Discussion of ‘Statistical Engineering and Variation Reduction.’”  David M. Steinberg;  Quality Engineering, 2014.
[Link] “Conducting a Gage R&R.”  Jorge G. Tavera Sainz; Six Sigma Forum Magazine, February 2013.
[Link] Basic Business Statistics for Managers.  Alan S. Donnahoe; John Wiley & Sons, Inc., 1988.
[Link] Creating Quality.  William J. Kolarik; McGraw-Hill, Inc., 1995.

 
Jody W. Phelps, MSc, PMP®, MBA
Principal Consultant
JayWink Solutions, LLC
jody@jaywink.com
]]>
<![CDATA[The War on Error – Vol. III:  A Tale of Two Journeys]]>Wed, 29 Jul 2020 14:30:00 GMThttp://jaywinksolutions.com/thethirddegree/the-war-on-error-vol-iii-a-tale-of-two-journeys     There is a “universal sequence for quality improvement,” according to the illustrious Joseph M. Juran, that defines the actions to be taken by any team to effect change.  This includes teams pursuing error- and defect-reduction initiatives, variation reduction, or quality improvement by any other description.
            Two of the seven steps of the universal sequence are “journeys” that the team must take to complete its problem-solving mission.  The “diagnostic journey” and the “remedial journey” comprise the core of the problem-solving process and, thus, warrant particular attention.
The Universal Sequence
     The universal sequence defines a problem-solving framework of which the journeys are critical elements.  Before embarking on discussions of the journeys, readers should be familiar with the framework and the journeys’ place in it.  For this purpose, a very brief introduction to the seven steps of the universal sequence follows:
(1) Provide proof of the need.  Present data and information that demonstrates the need for improvement; typically, this is translated into financial terms (e.g. scrap cost, contribution margin, etc.) for management review.
(2) Identify project(s).  Define projects in terms of specific problems that create the need and link them to the breakthrough desired.
(3) Organize for improvement.  Assign responsibility for high-level oversight of the improvement program (sponsors that authorize funds) and management of each project; assign team members to execute project tasks.
(4) Take the diagnostic journey.  Identify the root cause of the problem to be solved.
(5) Take the remedial journey.  Implement corrective action(s) to eliminate the root cause.
(6) Overcome resistance to change.  Communicate with those affected by the change, encourage their participation in the transformation process, and allow them time to adjust.
(7) Sustain the improvement.  Update or create new standards and monitoring and controlling systems to prevent regression.
     With the framework of the universal sequence now defined, it is time for a more detailed look at the diagnostic and remedial journeys.
 
The Diagnostic Journey
     The diagnostic journey takes a problem-solving team “from symptom to cause.”  The first leg of this journey consists of understanding the symptoms of a problem by analyzing two forms of evidence.  First, the team must ensure that all parties involved use shared definitions of special and common terms used to describe a situation, defect, etc.  It is imperative that the words used to communicate within the team provide clarity; inconsistent use of terminology can cause confusion and misunderstanding.  Second, autopsies (from Greek autopsia – “seeing with one’s own eyes”) should be conducted to deepen understanding of the situation.  An autopsy can also confirm that the use of words is consistent with the agreed definition, or offer an opportunity to make adjustments in the early stages of the journey.
     Understanding the symptoms allows the team to begin formulating theories about the root cause.  This can be done with a simple brainstorming session or other idea-generation and collection technique.  A broad array of perspectives is helpful (e.g. quality, operations, management, maintenance, etc.).
     Processing a large number of theories can be simplified by organizing them in a fishbone diagram, mind map, or other format.  Grouping similar or related ideas, and creating a visual representation of their interrelationships, facilitates the selection of theories to be tested and experimental design.  Visualized grouping of suspect root causes aids the design of experiments that can test multiple theories simultaneously, contributing to an efficient diagnosis.
     Once the investigation has successfully identified the root cause, the diagnostic journey is complete.  The team must transition and embark on the remedial journey.
 
The Remedial Journey
     The remedial journey takes the team “from cause to remedy;” it begins with the choice of alternatives.  A list of alternative solutions, or remedies, should be compiled for evaluation.  Remedies should be proposed considering the hierarchy (preference) of error-reduction mechanisms discussed in “Vol. II:  Poka Yoke (What Is and Is Not).”  All proposed remedies should satisfy the following three criteria:
  • it eliminates or neutralizes the root cause;
  • it optimizes cost;
  • it is acceptable to decision-makers.
The first and third criteria are straightforward enough – if it does not remove the root cause, it is not a remedy; if it is not acceptable to those in charge, it will not be approved for implementation.  Cost optimization, however, is a bit more complex.  Life-cycle costs, opportunity costs, transferred costs, and the product’s value proposition or market position all need to be considered.  See “The High Cost of Saving Money” and “4 Characteristics of an Optimal Solution” for more on these topics.
     To verify that a proposed remedy, in fact, eliminates the root cause, “proof-of-concept” testing should be conducted.  This is done on a small scale, either in a laboratory setting or in the production environment, minimizing disruption to the extent possible.  If successful on a small scale, implementation of the remedy can be ramped up in production.  Successful full-scale implementation should be accompanied by updates to instructions, expanded training, and other activities required to normalize the new process or conditions.
     Normalization activities taking place during the remedial journey may overlap with the last two steps of the universal sequence for quality improvement; this overlap may help to ensure that the sequence is completed.  The organization can fully capitalize on the effort only when the entire sequence has been completed.  Repeating the sequence is evidence of a continuous improvement culture, whether nascent or mature.
 
     The diagnostic and remedial journeys are defined as they are to emphasize three critical, related characteristics.  First, the diagnostic and remedial journeys are separate, independent endeavors.  Each requires its own process to complete successfully.  Second, both are required components of an effective quality-improvement or problem-solving project.  Finally, the remedial journey should not commence until the diagnostic journey is complete – diagnosis precedes remedy.
     These points may seem obvious or unremarkable; however, the tendency of “experts” to jump directly from symptom to remedy is too common to ignore.  Doing so fails to incorporate all available evidence, allowing bias and preconceived notions to drive decisions.  The danger of irrational decision-making – acting on willfully incomplete information – is a theme that runs throughout the “Making Decisions” series on “The Third Degree.”
     The diagnostic and remedial journeys are also described as the core of a successful problem-solving process, but not its entirety.  Each step in the universal sequence, when performed conscientiously, improves team effectiveness, increasing efficiency and probability of project success.
 
     General questions are welcome in the comments section below.  To address specific needs of your organization, please contact JayWink Solutions for a consultation.
 
     For a directory of “The War on Error” volumes on “The Third Degree,” see “Vol. I:  Welcome to the Army.”
 
References
[Link] Juran’s Quality Handbook.  Joseph M. Juran et al; McGraw-Hill.
[Link] “Quality Tree:  A Systematic Problem-Solving Model Using Total Quality Management Tools and Techniques.”  J. N. Pan and William Kolarik; Quality Engineering, 1992.
[Link] “Shainin Methodology:  An Alternative or an Effective Complement to Six Sigma?”  Jan Kosina; Quality Innovation Prosperity, 2015.
[Link] “Statistical Engineering and Variation Reduction.”  Stefan H. Steiner and R. Jock MacKay;  Quality Engineering, 2014.
[Link] “An Overview of the Shainin System™ for Quality Improvement.”  Stefan H. Steiner, R. Jock MacKay, and John S. Ramberg;  Quality Engineering, 2008.
[Link] “7 Stages of Universal Break-Through Sequence for an Organisation.”  Smriti Chand.

 
Jody W. Phelps, MSc, PMP®, MBA
Principal Consultant
JayWink Solutions, LLC
jody@jaywink.com
]]>