Several earlier posts have made the point that important technology failures often include organizational faults in their causal background.
It is certainly true that most important accidents have multiple causes, and it is crucial to have as good an understanding as possible of the range of causal pathways that have led to air crashes, chemical plant explosions, or drug contamination incidents. But in the background we almost always find organizations and practices through which complex technical activities are designed, implemented, and regulated. Human actors, organized into patterns of cooperation, collaboration, competition, and command, are as crucial to technical processes as are power lines, cooling towers, and control systems in computers. So it is imperative that we follow the lead of researchers like Charles Perrow (The Next Catastrophe: Reducing Our Vulnerabilities to Natural, Industrial, and Terrorist Disasters), Kathleen Tierney (The Social Roots of Risk: Producing Disasters, Promoting Resilience), or Diane Vaughan (The Challenger Launch Decision: Risky Technology, Culture, and Deviance at NASA) and give close attention to the social- and organization-level failures that sometimes lead to massive technological failures.
It is useful to have a few examples in mind as we undertake to probe this question more deeply. Here are a number of important accidents and failures that have been carefully studied.
- Three Mile Island and Chernobyl nuclear disasters
- Challenger and Columbia space shuttle disasters
- Failure of United States anti-submarine warfare in 1942-43
- Flawed policy and decision-making in the US leading to escalation of the Vietnam War
- Flawed policy and decision-making in France leading to Dien Bien Phu defeat
- Failure of Nuclear Regulatory Commission to ensure reactor safety
- DC-10 design process
- Osprey design process
- Failure of federal flood insurance to appropriately guide rational land use
- FEMA failure in Katrina aftermath
- Design and manufacture of the Edsel sedan
- High rates of hospital-borne infections in some hospitals
Examples like these allow us to begin to create an inventory of organizational flaws that sometimes lead to failures and accidents:
- siloed decision-making (design division, marketing division, manufacturing division all have different priorities and interests)
- lax implementation of formal processes
- strategic bureaucratic manipulation of outcomes
- information withholding, lying
- corrupt practices, conflicts of interest and commitment
- short-term calculation of costs and benefits
- indifference to public goods
- poor evaluation of data; misinterpretation of data
- lack of high-level officials responsible for compliance and safety
These deficiencies may be analyzed in terms of a more abstract list of organizational failures:
- poor decisions given existing priorities and facts
- poor priority-setting processes
- poor information-gathering and analysis
- failure to learn from and adapt to changing circumstances
- internal capture of decision-making; corruption, conflict of interest
- vulnerability of decision-making to external pressures (external capture)
- faulty or ineffective implementation of policies, procedures, and regulations
Nancy Leveson is a leading authority on the systems-level causes of accidents and failures. A recent white paper can be found here. Here is the abstract for that paper:
New technology is making fundamental changes in the etiology of accidents and is creating a need for changes in the explanatory mechanisms used. We need better and less subjective understanding of why accidents occur and how to prevent future ones. The most effective models will go beyond assigning blame and instead help engineers to learn as much as possible about all the factors involved, including those related to social and organizational structures. This paper presents a new accident model founded on basic systems theory concepts. The use of such a model provides a theoretical foundation for the introduction of unique new types of accident analysis, hazard analysis, accident prevention strategies including new approaches to designing for safety, risk assessment techniques, and approaches to designing performance monitoring and safety metrics. (1; italics added)
Here is what Leveson has to say about the social and organizational causes of accidents:
2.1 Social and Organizational Factors
Event-based models are poor at representing systemic accident factors such as structural deficiencies in the organization, management deficiencies, and flaws in the safety culture of the company or industry. An accident model should encourage a broad view of accident mechanisms that expands the investigation from beyond the proximate events.
Ralph Miles Jr., in describing the basic concepts of systems theory, noted that:
Underlying every technology is at least one basic science, although the technology may be well developed long before the science emerges. Overlying every technical or civil system is a social system that provides purpose, goals, and decision criteria (Miles, 1973, p. 1).
Effectively preventing accidents in complex systems requires using accident models that include that social system as well as the technology and its underlying science. Without understanding the purpose, goals, and decision criteria used to construct and operate systems, it is not possible to completely understand and most effectively prevent accidents. (6)