Friday, June 1, 2018

Root Cause Failure Analysis: Unknown knowns


As engineers in a fragile world, we need to understand the pathology of failures. A failure may be structural, material, design-related, or of any other kind; regardless of its type, category, or discipline, there must be an appropriate, systematic way of investigating its root cause.

In principle, a failure trigger can be located in one of three categories, an approach known as Failure Cause Characterization (Márquez, 2007):

1. The human cause: human errors of omission or commission that result in the physical roots of the failure.

2. The physical cause: the reason why the asset failed; the technical explanation of why things broke or failed.

3. The latent cause: the deficiencies in the management systems that allow the human errors to continue unchecked, i.e. the flaws in the systems and procedures.

In asset management, where the framework adheres to the ISO 55000 series, it is a fundamental requirement to evaluate asset performance against the criteria set out in clause 2.5.3.7:

“Asset management performance should be evaluated against whether the asset management objectives have been achieved, and if not, why not. Where applicable, any opportunities that arose from having exceeded the asset management objectives should also be examined, as well as any failure to realize them. The adequacy of the decision-making processes should be examined carefully.”

This highlights the importance of the latent cause. In asset management systems, it plays a major role in the RCFA process, for which standard investigation procedures are usually set. Likewise, asset failure reporting systems, investigation methodologies, and lessons-learned procedures are part of the basic requirements framework of ISO 9000 as well.

At a minimum, an investigation report could contain the following components:

1. Definition of the investigation team. This team could include Subject Matter Experts (SMEs), consultants, engineers, operators, technicians, etc.

2. Collection of failure data: identification of the problems, date of failure or incident, GIS information, etc.

3. Evaluation of the impact and immediate actions taken to rectify the failure.

4. Date of the Incident Report

5. Date of Investigation Report

6. Failed asset details

7. The root cause of the failure

8. Recommendations

9. All relevant attachments: photos, records, emails, previous correspondence, early warning issues, etc.

10. Final inspection dates or witnessing dates for repair works or tests, if applicable.

11. Lessons Learned
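
The report components above could be captured in a simple record so that incomplete reports are easy to spot. The following is only a minimal sketch: the class and field names (InvestigationReport, missing_items, and so on) are hypothetical and do not come from ISO 55000, BS EN 62740, or any other standard, and the root_cause_category field is an added assumption that links the report back to the three cause categories listed earlier.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import List, Optional


class FailureCause(Enum):
    """The three cause categories of Failure Cause Characterization."""
    HUMAN = "human"
    PHYSICAL = "physical"
    LATENT = "latent"


@dataclass
class InvestigationReport:
    # Field names are illustrative only; they mirror report items 1 to 11.
    team: List[str]                        # item 1
    failure_data: str                      # item 2
    impact_and_immediate_actions: str      # item 3
    incident_report_date: date             # item 4
    investigation_report_date: date        # item 5
    failed_asset_details: str              # item 6
    root_cause: str                        # item 7
    root_cause_category: FailureCause      # assumption: not in the list above
    recommendations: List[str]             # item 8
    attachments: List[str] = field(default_factory=list)      # item 9
    final_inspection_date: Optional[date] = None               # item 10, if applicable
    lessons_learned: List[str] = field(default_factory=list)  # item 11

    def missing_items(self) -> List[str]:
        """Return the names of required text or list items that are still empty."""
        required = [
            "team", "failure_data", "impact_and_immediate_actions",
            "failed_asset_details", "root_cause", "recommendations",
            "lessons_learned",
        ]
        return [name for name in required if not getattr(self, name)]


# Hypothetical example data, for illustration only.
report = InvestigationReport(
    team=["SME", "Operator"],
    failure_data="Pump seal leak reported at station A",
    impact_and_immediate_actions="Standby pump started; leak isolated",
    incident_report_date=date(2018, 5, 1),
    investigation_report_date=date(2018, 5, 20),
    failed_asset_details="Centrifugal pump, tag P-101",
    root_cause="Seal installed without the specified alignment check",
    root_cause_category=FailureCause.HUMAN,
    recommendations=["Add alignment check to the installation procedure"],
)
print(report.missing_items())
```

Attachments and the final inspection date are left optional here because the list marks them as supporting or "if applicable" items; a real reporting system may well treat them differently.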

Root cause analysis can be performed using the standard methodologies detailed in BS EN 62740, which outlines current good practice in the conduct of root cause analyses.

In the context of RCA, validation is the most critical aspect, and it must be treated with extreme care. In practice, validation is often done through a third-party consultation: an independent review that assesses whether the analysis is valid for determining future corrective actions and whether it has fulfilled all the objectives of the RCA. Alternatively, experiments or testing procedures can help validate the RCA outcome by demonstrating that the causal event, the root cause, is real and triggers the consequences the RCA describes through the mechanisms the report proposes. Beyond third-party consultation and experimental validation, statistical approaches may help validate the focus event in certain failure scenarios, that is, test the hypothesis of the root cause. This may involve numerical modeling of the failure root cause, for example a Monte Carlo simulation or another probabilistic model. In such simulations, extreme care must be taken to model the root cause in a realistic and representative manner.
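
As a rough illustration of the statistical route, a Monte Carlo check could compare how often a failure would be expected under the as-designed condition and under the hypothesized root cause. The sketch below assumes a simple load-versus-strength model with independent, normally distributed variables; that model, the function names, and all the numbers are assumptions made for illustration and are not taken from BS EN 62740 or from any real investigation.

```python
import random
from statistics import NormalDist


def estimate_failure_probability(load_mean, load_sd, strength_mean, strength_sd,
                                 n_trials=100_000, seed=0):
    """Monte Carlo estimate of P(load > strength) for independent normal variables."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    failures = 0
    for _ in range(n_trials):
        load = rng.gauss(load_mean, load_sd)
        strength = rng.gauss(strength_mean, strength_sd)
        if load > strength:
            failures += 1
    return failures / n_trials


def exact_failure_probability(load_mean, load_sd, strength_mean, strength_sd):
    """Closed-form cross-check, possible only because both variables are normal."""
    margin = NormalDist(strength_mean - load_mean,
                        (load_sd ** 2 + strength_sd ** 2) ** 0.5)
    return margin.cdf(0.0)  # probability that the safety margin is negative


# Hypothetical inputs: same load, but strength degraded under the root-cause hypothesis.
as_designed = estimate_failure_probability(100.0, 10.0, 150.0, 10.0)
degraded = estimate_failure_probability(100.0, 10.0, 120.0, 15.0)
degraded_exact = exact_failure_probability(100.0, 10.0, 120.0, 15.0)

print(f"as designed: {as_designed:.4f}")
print(f"degraded (simulated): {degraded:.4f}")
print(f"degraded (closed form): {degraded_exact:.4f}")
```

If the failure rate observed in the field is far closer to the degraded estimate than to the as-designed one, that lends some support to the hypothesized root cause, but only to the extent that the distributions chosen are realistic and representative.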

Beyond the entire technical round trip above, the limits of human knowledge always apply.

The following analogy may shed some light on how intuition and experience can ultimately take the lead in such an investigation:

Known knowns: all the knowledge consciously applicable to the case, including theories, techniques, observations, evidence, facts, etc.

Known unknowns: the limits of theories, assumptions, errors, model approximations, etc.

Unknown unknowns: third-party involvement without a trace, "perfect murder" kinds of events where no leads remain for investigation (no facts, no CCTV footage), completely blind events that might occur without notice, etc.

Unknown knowns (Zizek, 2012): human intuition, where past experiences are collected, processed, and sedimented into the deep layers of the subconscious mind, residing there without ever surfacing in conscious recall. It is the agency over which we have no control but which controls us at the unconscious level: the Unconscious!

Beyond the limits of our conscious capacities, the unconscious plays a vital role in our intuition, which always backs up our thought process in decision making. Hence, the first things that come to mind are not so irrational; they are automated, calculated outcomes produced silently while we consciously search for a solution to a situation.

So it is essential to keep the unconscious up to date by reading avidly and continuously feeding it material, then waiting until it has processed things silently and the result surfaces while we are completely away from the focus of the objectives. In a psychoanalytic context, dreams are one way the unconscious communicates with the conscious mind, so believing in dreams may shorten certain hard decisions. Sometimes daydreaming may count too.

The arena of knowledge that we possess inside us but that is consciously unknown is the category “unknown knowns”, which dominates our decisions and actions. This hidden resource can be developed day by day by consciously feeding as much information as possible into it through reading, viewing, listening, acting, doing, and even thinking!


Most expert judgments have little supporting evidence in reality or are nearly impossible to prove with facts. Experts rely on intuition developed over years of extensive engagement, experience, failures, etc.; this is the spirit of the unconscious knowns or conscious unknowns: in fact, the unknown knowns.

References


Márquez, A. C. (2007). Root Cause Failure Analysis (RCFA) for High Impact Weak Points. In A. C. Márquez, The Maintenance Management Framework: Models and Methods for Complex Systems Maintenance (pp. 127-132). London: Springer.

Zizek, S. (2012). The Limits of Hegel. In S. Zizek, Less Than Nothing: Hegel and the Shadow of Dialectical Materialism (p. 358). London: Verso.