The Space Shuttle Disasters and Quality Management
September 7, 2018
On January 28, 1986, the Space Shuttle Challenger was destroyed 73 seconds after lifting off from Cape Canaveral, Florida. Seven crew members died, a $3 billion orbital vehicle was lost, and NASA’s Space Shuttle program was suspended for 32 months.
The official cause of the disaster was the failure of an O-ring to prevent hot gases from leaking through the joint in the solid rocket motor during launch.[i] The Rogers Commission – the body tasked with investigating the disaster – found that the O-ring design had been a point of concern for several years prior to the disaster, but that those concerns had been either poorly communicated or ignored in favor of keeping the project on time and on budget.[ii]
In addition to the faulty initial design of the O-rings, the Commission determined that the unusually cold temperatures at the time of the launch (conditions in which none of the dependent systems on the Space Shuttle had ever been tested) made the rubber O-rings inflexible, so that they failed to seal the joint and allowed hot gas to escape and ignite. Commission member Richard Feynman demonstrated this failure on live television during the inquiry. NASA had observed O-rings behaving in unusual and unanticipated ways during previous flights but had decided that, as long as there was no catastrophic failure of the equipment, this was an acceptable deviation, a phenomenon referred to as “normalization of deviance.”
Feynman produced an appendix to the final report in which he wrote: “It appears that there are enormous differences of opinion as to the probability of a failure with loss of vehicle and of human life. The estimates range from roughly 1 in 100 to 1 in 100,000. The higher figures come from working engineers, and the very low figures from management. What are the causes and consequences of this lack of agreement?”[iii] According to post-disaster analysis, NASA’s management culture in the mid-1980s was strongly biased against the methods of risk assessment that would have highlighted the likelihood of a disaster.[iv]
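To make the size of that disagreement concrete, the two ends of the quoted range can be compounded over a series of launches. The short Python sketch below is illustrative only: it assumes each flight is independent with a constant per-flight risk, and the 100-flight horizon and the function name are hypothetical choices made for this example, not figures or methods taken from the report.

```python
def prob_at_least_one_loss(per_flight_risk: float, flights: int) -> float:
    """Chance of one or more vehicle losses over a series of flights.

    Assumes every flight is independent and carries the same risk,
    which is a simplification made for illustration.
    """
    return 1.0 - (1.0 - per_flight_risk) ** flights


# Hypothetical horizon chosen for illustration; not a NASA planning figure.
FLIGHTS = 100

# The two ends of the range quoted in Feynman's appendix.
estimates = {
    "working engineers (1 in 100)": 1 / 100,
    "management (1 in 100,000)": 1 / 100_000,
}

for label, risk in estimates.items():
    chance = prob_at_least_one_loss(risk, FLIGHTS)
    print(f"{label}: {chance:.2%} chance of at least one loss in {FLIGHTS} flights")
```

Under these assumptions, the engineers' figure implies that losing at least one vehicle over 100 flights is more likely than not (roughly 63 percent), while the management figure implies a chance of about 0.1 percent. The gap between the two is the kind of disagreement a functioning risk-assessment process is meant to surface and resolve.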
The Challenger disaster was a failure of NASA’s overall Quality Management System (QMS), particularly the Culture of Quality.[v] The design flaw was a known defect that was incorrectly categorized as an acceptable risk, and the management structure was replete with communication flaws that allowed managers to bypass Quality Management procedures. Together, these meant that NASA’s QMS was ill-equipped to prevent or manage a disaster of that scale.
Several of these QMS failures were cited as having a direct impact on the destruction of the Space Shuttle Columbia on February 1, 2003. The physical cause was a piece of dislodged foam striking the left wing of the vehicle during launch. This impact created a breach in the thermal protection system of the wing which, during reentry, allowed superheated air to enter the wing and subsequently led to the destruction of the vehicle. The result in this case was the loss of seven crew members, the destruction of the Space Shuttle Columbia, and ultimately the retirement of the entire Space Shuttle program.
The report[vi] from the Columbia Accident Investigation Board cited poor risk assessment, lack of managerial interest in promoting safety and Quality, overly simple presentation[vii] of complex information required for decision-making[viii], and normalization of deviance as significant contributing factors. These findings show that even an event as catastrophic as the Challenger disaster is sometimes not enough to demonstrate the importance of a QMS to organizations with deeply entrenched process failures.
[i] Report of the Presidential Commission on the Space Shuttle Challenger Accident, Chapter IV (June 6, 1986), accessed March 9, 2018, https://history.nasa.gov/rogersrep/v1ch4.htm.
[ii] Report of the Presidential Commission on the Space Shuttle Challenger Accident, Chapter VI (June 6, 1986), accessed March 9, 2018, https://history.nasa.gov/rogersrep/v1ch6.htm.
[iii] Report of the Presidential Commission on the Space Shuttle Challenger Accident, Volume 2, Appendix F (June 6, 1986), accessed March 9, 2018, https://history.nasa.gov/rogersrep/v2appf.htm.
[iv] Trudy E. Bell and Karl Esch, “The Challenger Disaster: A Case of Subjective Engineering,” IEEE Spectrum, January 28, 2016, accessed March 9, 2018, https://spectrum.ieee.org/tech-history/heroic-failures/the-space-shuttle-a-case-of-subjective-engineering.
[v] Report of the Presidential Commission on the Space Shuttle Challenger Accident, Chapter V (June 6, 1986), accessed March 9, 2018, https://history.nasa.gov/rogersrep/v1ch5.htm.
[vi] Report of Columbia Accident Investigation Board, Volume 1, Chapter 3 (August 26, 2003), accessed March 9, 2018, http://s3.amazonaws.com/akamai.netstorage/anon.nasa-global/CAIB/CAIB_lowres_chapter3.pdf.
[vii] Edward Tufte, “PowerPoint Does Rocket Science – and Better Techniques for Technical Reports,” accessed March 9, 2018, https://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=0001yB.
[viii] Report of Columbia Accident Investigation Board, Volume 1, Chapter 7 (August 26, 2003), accessed March 9, 2018, http://s3.amazonaws.com/akamai.netstorage/anon.nasa-global/CAIB/CAIB_lowres_chapter7.pdf.