How organizations adapt

Organizations do things; they depend upon the coordinated efforts of numerous individuals; and they exist in environments that affect their ongoing success or failure. Moreover, organizations are to some extent plastic: the practices and rules that make them up can change over time. Sometimes these changes happen as the result of deliberate design choices by individuals inside or outside the organization; a manager, for example, may alter the rules through which decisions are made about hiring new staff in order to improve the quality of work. And sometimes they happen through gradual processes that no one is specifically aware of. The question arises, then: do organizations evolve toward higher functioning in response to signals from the environments in which they operate, or is organizational change stochastic, without a gradient toward more effective functioning? Do changes within an organization add up over time to improved functioning? What kinds of social mechanisms might bring about such an outcome?

One way of addressing this topic is to consider organizations as mid-level social entities that are potentially capable of adaptation and learning. An organization has identifiable internal processes of functioning as well as a delineated boundary of activity. It has a degree of control over its functioning. And it is situated in an environment that signals differential success/failure through a variety of means (profitability, success in gaining adherents, improvement in market share, number of patents issued, …). So the environment responds favorably or unfavorably, and change occurs.

Is there anything in this specification of the structure, composition, and environmental location of an organization that suggests the possibility or likelihood of adaptation over time in the direction of improvement of some measure of organizational success? Do institutions and organizations get better as a result of their interactions with their environments and their internal structure and actors?

There are a few possible social mechanisms that would support the possibility of adaptation towards higher functioning. One is the fact that purposive agents are involved in maintaining and changing institutional practices. Those agents are capable of perceiving inefficiencies and potential gains from innovation, and are sometimes in a position to introduce appropriate innovations. This is true at various levels within an organization, from the supervisor of a custodial staff to a vice president for marketing to a CEO. If the incentives presented to these agents are aligned with the important needs of the organization, then we can expect that they will introduce innovations that enhance functioning. So one mechanism through which we might expect that organizations will get better over time is the fact that some agents within an organization have the knowledge and power necessary to enact changes that will improve performance, and they sometimes have an interest in doing so. In other words, there is a degree of intelligent intentionality within an organization that might work in favor of enhancement.

This line of thought should not be over-emphasized, however, because there are competing forces and interests within most organizations. Previous posts have focused on current organizational theory based on the idea of a “strategic action field” of insiders and outsiders who determine the activities of the organization (Fligstein and McAdam; Crozier; link, link). This framework suggests that the structure and functioning of an organization is not wholly determined by a single intelligent actor (“the founder”), but is rather the temporally extended result of interactions among actors in the pursuit of diverse aims. This heterogeneity of purposive actions by actors within an institution means that the direction of change is indeterminate; it is possible that the coalitions that form will bring about positive change, but the reverse is possible as well.

And in fact, many authors and participants have pointed out that it is often not the case that agents’ interests are aligned with the priorities and needs of the organization. Jack Knight offers a persuasive critique of the idea that organizations and institutions tend to increase in their ability to provide collective benefits in Institutions and Social Conflict. CEOs who have a financial interest in a rapid stock price increase may take steps that worsen functioning for short-term market gain; supervisors may avoid work-flow innovations because they don’t want the headache of an extended change process; vice presidents may deny information to other divisions in order to enhance appreciation of the efforts of their own division. Here is a short description from Knight’s book of the way that institutional adjustment occurs as a result of conflict among players of unequal powers:

Individual bargaining is resolved by the commitments of those who enjoy a relative advantage in substantive resources. Through a series of interactions with various members of the group, actors with similar resources establish a pattern of successful action in a particular type of interaction. As others recognize that they are interacting with one of the actors who possess these resources, they adjust their strategies to achieve their best outcome given the anticipated commitments of others. Over time rational actors continue to adjust their strategies until an equilibrium is reached. As this becomes recognized as the socially expected combination of equilibrium strategies, a self-enforcing social institution is established. (Knight, 143)

A very different possible mechanism is unit selection, where more successful innovations or firms survive and less successful innovations and firms fail. This is the premise of the evolutionary theory of the firm (Nelson and Winter, An Evolutionary Theory of Economic Change). In a competitive market, firms with low internal efficiency will have a difficult time competing on price with more efficient firms; so these low-efficiency firms will be more likely to go out of business. Here the question of “units of selection” arises: is it firms over which selection operates, or is it lower-level innovations that are the object of selection?

Geoffrey Hodgson provides a thoughtful review of this set of theories here, part of what he calls “competence-based theories of the firm”. Here is Hodgson’s diagram of the relationships that exist among several different approaches to the study of the firm.

The market mechanism does not work very well as a selection mechanism for some important categories of organizations — government agencies, legislative systems, and non-profit organizations. This is so because the criterion of selection is “profitability / efficiency within a competitive market”, and government and non-profit organizations are not importantly subject to the workings of a market.

In short, the answer to the fundamental question here is mixed. There are factors that unquestionably work to enhance effectiveness in an organization. But these factors are weak and defeasible, and the countervailing factors (internal conflict, divided interests of actors, slackness of the corporate marketplace) leave open the possibility that institutions change without evolving in a consistent direction. And the glaring dysfunctions that have afflicted many organizations, both corporate and governmental, make this conclusion even more persuasive. Perhaps what demands explanation is the rare case in which an organization achieves a high level of effectiveness and consistency in its actions, rather than the many cases that come to mind of dysfunctional organizational activity.

(The examples of organizational dysfunction that come to mind are many — the failures of regulation of the civilian nuclear industry (Perrow, The Next Catastrophe: Reducing Our Vulnerabilities to Natural, Industrial, and Terrorist Disasters); the failure of US anti-submarine warfare in World War II (Cohen, Military Misfortunes: The Anatomy of Failure in War); and the failure of chemical companies to ensure safe operations of their plants (Shrivastava, Bhopal: Anatomy of Crisis). Here is an earlier post that addresses some of these examples; link. And here are several earlier posts on the topic of institutional change and organizational behavior; link, link.)


System safety engineering and the Deepwater Horizon

The Deepwater Horizon oil rig explosion, fire, and uncontrolled release of oil into the Gulf is a disaster of unprecedented magnitude. This disaster in the Gulf of Mexico appears to be more serious in objective terms than the Challenger space shuttle disaster in 1986, both in immediate loss of life and in overall harm created. And sadly, it appears likely that the accident will reveal equally severe failures in the management of enormously hazardous processes, defects in the associated safety engineering analysis, and inadequacies in the regulatory environment within which the activity took place. The Challenger disaster fundamentally changed the ways that we thought about safety in the aerospace field. It is likely that this disaster too will force radical new thinking and new procedures concerning how to deal with the inherently dangerous processes associated with deep-ocean drilling.

Nancy Leveson is an important expert in the area of systems safety engineering, and her book, Safeware: System Safety and Computers, is a genuinely important contribution.  Leveson led the investigation of the role that software design might have played in the Challenger disaster (link).  Here is a short, readable white paper of hers on system safety engineering (link) that is highly relevant to the discussions that will need to occur about deep-ocean drilling.  The paper does a great job of laying out how safety has been analyzed in several high-hazard industries, and presents a set of basic principles for systems safety design.  She discusses aviation, the nuclear industry, military aerospace, and the chemical industry; and she points out some important differences across industries when it comes to safety engineering.  Here is an instructive description of the safety situation in military aerospace in the 1950s and 1960s:

Within 18 months after the fleet of 71 Atlas F missiles became operational, four blew up in their silos during operational testing. The missiles also had an extremely low launch success rate.  An Air Force manual describes several of these accidents: 

     An ICBM silo was destroyed because the counterweights, used to balance the silo elevator on the way up and down in the silo, were designed with consideration only to raising a fueled missile to the surface for firing. There was no consideration that, when you were not firing in anger, you had to bring the fueled missile back down to defuel. 

     The first operation with a fueled missile was nearly successful. The drive mechanism held it for all but the last five feet when gravity took over and the missile dropped back. Very suddenly, the 40-foot diameter silo was altered to about 100-foot diameter. 

     During operational tests on another silo, the decision was made to continue a test against the safety engineer’s advice when all indications were that, because of high oxygen concentrations in the silo, a catastrophe was imminent. The resulting fire destroyed a missile and caused extensive silo damage. In another accident, five people were killed when a single-point failure in a hydraulic system caused a 120-ton door to fall. 

     Launch failures were caused by reversed gyros, reversed electrical plugs, bypass of procedural steps, and by management decisions to continue, in spite of contrary indications, because of schedule pressures. (from the Air Force System Safety Handbook for Acquisition Managers, Air Force Space Division, January 1984)

Leveson’s illustrations from the history of these industries are fascinating.  But even more valuable are the principles of safety engineering that she recapitulates.  These principles seem to have many implications for deep-ocean drilling and associated technologies and systems.  Here is her definition of systems safety:

System safety uses systems theory and systems engineering approaches to prevent foreseeable accidents and to minimize the result of unforeseen ones.  Losses in general, not just human death or injury, are considered. Such losses may include destruction of property, loss of mission, and environmental harm. The primary concern of system safety is the management of hazards: their identification, evaluation, elimination, and control through analysis, design and management procedures.

Here are several fundamental principles of designing safe systems that she discusses:
  • System safety emphasizes building in safety, not adding it on to a completed design.
  • System safety deals with systems as a whole rather than with subsystems or components.
  • System safety takes a larger view of hazards than just failures.
  • System safety emphasizes analysis rather than past experience and standards.
  • System safety emphasizes qualitative rather than quantitative approaches.
  • System safety recognizes tradeoffs and conflicts.
  • System safety is more than just system engineering.

And here is an important summary observation about the complexity of safe systems:

Safety is an emergent property that arises at the system level when components are operating together. The events leading to an accident may be a complex combination of equipment failure, faulty maintenance, instrumentation and control problems, human actions, and design errors. Reliability analysis considers only the possibility of accidents related to failures; it does not investigate potential damage that could result from successful operation of the individual components.

How do these principles apply to the engineering problem of deep-ocean drilling?  Perhaps the most important implications are these: a safe system needs to be based on careful and comprehensive analysis of the hazards that are inherently involved in the process; it needs to be designed with an eye to handling those hazards safely; and it can’t be done in a piecemeal, “fly-fix-fly” fashion.

It would appear that deep-ocean drilling is characterized by too little analysis and too much confidence in the ability of engineers to “correct” inadvertent outcomes (“fly-fix-fly”). The accident that occurred in the Gulf last month can be analyzed into two parts. The first is the explosion and fire that destroyed the drilling rig and led to the tragic loss of life of 11 rig workers. The second is the uncalculated harm caused by the uncontrolled venting of perhaps a hundred thousand barrels of crude oil to date into the Gulf of Mexico, now threatening the coasts and ecologies of several states. Shockingly, there is now no high-reliability method for capping the well at a depth of over 5,000 feet; so the harm can continue to worsen for a very extended period of time.

The safety systems on the platform itself will need to be examined in detail. But the bottom line will probably look something like this: the platform is a complex system vulnerable to explosion and fire, and there was always a calculable (though presumably small) probability of catastrophic fire and loss of the ship. This is pretty analogous to the problem of safety in aircraft and other complex electro-mechanical systems. The loss of life in the incident is terrible but confined.  Planes crash and ships sink.

What elevates this accident to a globally important catastrophe is what happened next: destruction of the pipeline leading from the wellhead 5,000 feet below sea level to containers on the surface, and the failure of the shutoff valve system on the ocean floor. These two failures have resulted in the unconstrained release of a massive and uncontrollable flow of crude oil into the Gulf, with environmental harms that are likely to be greater than those of the Exxon Valdez spill.

Oil wells fail on the surface, and they are difficult to control. But there is a well-developed technology that teams of oil fire specialists like Red Adair employ to cap the flow and end the damage. We don’t have anything like this for wells drilled under water at the depth of this incident; the site of this accident is less accessible to corrective intervention than objects in space. So surface well failures conform to a sort of epsilon-delta relationship: an epsilon accident leads to a limited delta harm. This deep-ocean well failure in the Gulf is catastrophically different: a relatively small incident on the surface is resulting in an unbounded and spiraling harm.

So was this a foreseeable hazard? Of course it was. There was always a finite probability of total loss of the platform, leading to destruction of the pipeline. There was also a finite probability of failure of the massive sea-floor emergency shutoff valve. And, critically, it was certainly known that there is no high-reliability fix in the event of failure of the shutoff valve. The effort to use the dome currently being tried by BP is untested and unproven at this great depth. The alternative of drilling a second well to relieve pressure may work; but it will take weeks or months. So essentially, when we reach the end of this failure pathway, we arrive at this conclusion: catastrophic, unbounded failure. If you reach this point in the fault tree, there is almost nothing to be done. And this is a totally irrational outcome to tolerate; how could any engineer or regulatory agency have accepted the circumstances of this activity, given that one possible failure pathway would lead predictably to unbounded harms?

There is one line of thought that might have led to the conclusion that deep-ocean drilling is acceptably safe: engineers and policy makers might have optimistically overestimated the reliability of the critical components. If we estimate that the probability of failure of the platform is 1/1000, failure of the pipeline is 1/100, and failure of the emergency shutoff valve is 1/10,000 — then one might say that the probability of the nightmare scenario is vanishingly small: one in a billion. Perhaps one might reason that we can disregard scenarios with this level of likelihood. Reasoning very much like this was involved in the original safety designs of the shuttle (Safeware: System Safety and Computers). But two things are now clear. First, this disaster was not virtually impossible; it actually occurred. And second, it seems likely enough that the estimates of component failure probabilities were badly understated.
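
To make the arithmetic concrete, here is a minimal sketch (in Python, not part of the original post) of the fault-tree calculation just described: multiplying the three hypothetical component probabilities under an assumption of independence, and showing how far the result moves if each estimate is understated by a factor of ten.

```python
# A minimal sketch (not from the original post) of the fault-tree arithmetic
# described above: the joint probability of the failure pathway, assuming the
# three component failures are independent.

def pathway_probability(p_platform, p_pipeline, p_shutoff):
    """Probability that all three failures occur together, assuming independence."""
    return p_platform * p_pipeline * p_shutoff

# The hypothetical "optimistic" estimates quoted in the paragraph above.
optimistic = pathway_probability(1 / 1000, 1 / 100, 1 / 10000)  # 1e-9, "one in a billion"

# If each component probability were understated by a factor of ten, the joint
# probability of the catastrophic pathway grows by a factor of a thousand.
pessimistic = pathway_probability(1 / 100, 1 / 10, 1 / 1000)    # 1e-6

print(f"optimistic estimate:  {optimistic:.1e}")
print(f"pessimistic estimate: {pessimistic:.1e}")
```

The point of the sketch is only that the "one in a billion" conclusion is extremely sensitive to the component estimates and to the independence assumption; modest errors in either direction change the answer by orders of magnitude.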

What does this imply about deep-ocean drilling? It seems inescapable that the current state of technology does not permit us to take the risk of this kind of total systems failure. Until there is a reliable and reasonably quick technology for capping a deep-ocean well, even a small probability of this kind of failure makes use of the technology entirely unjustifiable. It makes no sense at all to play Russian roulette when the cost of failure is massive and unconstrained ecological damage.

There is another aspect of this disaster that needs to be called out, and that is the issue of regulation. Just as the nuclear industry requires close, rigorous regulation and inspection, so deep-ocean drilling must be rigorously regulated. The stakes are too high to allow the oil industry to regulate itself. And unfortunately there are clear indications of weak regulation in this industry (link).

(Here are links to a couple of earlier posts on safety and technology failure (link, link).)

Patient safety — Canada and France


Patient safety is a key issue in managing and assessing a regional or national health system. There are very sizable variations in patient safety statistics across hospitals, with significantly higher rates of infection and mortality in some institutions than others. Why is this? And what can be done in order to improve the safety performance of low-safety institutions, and to improve the overall safety performance of the hospital environment nationally?

Previous posts have made the point that safety is the net effect of a complex system within a hospital or chemical plant, including institutions, rules, practices, training, supervision, and day-to-day behavior by staff and supervisors (post, post). And experts on hospital safety agree that improvements in safety require careful analysis of patient processes in order to redesign processes so as to make infections, falls, improper medications, and unnecessary mortality less likely. Institutional design and workplace culture have to change if safety performance is to improve consistently and sustainably. (Here is a posting providing a bit more discussion of the institutions of a hospital; post.)

But here is an important question: what are the features of the social and legal environment that will make it most likely that hospital administrators will commit themselves to a thorough-going culture and management of safety? What incentives or constraints need to exist to offset the impulses of cost-cutting and status quo management that threaten to undermine patient safety? What will drive the institutional change in a health system that improving patient safety requires?

Several measures seem clear. One is state regulation of hospitals. This exists in every state; but the effectiveness of regulatory regimes varies widely across contexts. So understanding the dynamics of regulation and enforcement is a crucial step to improving hospital quality and patient safety. The oversight of rigorous hospital accreditation agencies is another important factor for improvement. For example, the Joint Commission accredits thousands of hospitals in the United States (web page) through dozens of accreditation and certification programs. Patient safety is the highest priority underlying Joint Commission standards of accreditation. So regulation and the formulation of standards are part of the answer. But a particularly important policy tool for improving safety performance is the mandatory collection and publication of safety statistics, so that potential patients can decide between hospitals on the basis of their safety performance. Publicity and transparency are crucial parts of good management behavior; and secrecy is a refuge of poor performance in areas of public concern such as safety, corruption, or rule-setting. (See an earlier post on the relationship between publicity and corruption.)

But here we have a bit of a conundrum: achieving mandatory publication of safety statistics is politically difficult, because hospitals have a business interest in keeping these data private. So there has been a lot of resistance to mandatory reporting of basic patient safety data in the US over the past twenty years. Fortunately, the public interest in having these data readily available has largely prevailed, and hospitals are now required to publish a broader and broader range of data on patient safety, including hospital-induced infection rates, ventilator-induced pneumonias, patient falls, and mortality rates. Here is a useful tool from USA Today that lets patients gather information about their hospital options and see how these compare with other hospitals regionally and nationally. This is an effective accountability mechanism that inevitably drives hospitals towards better performance.

Canada has been very active in this area. Here is a website published by the Ontario Ministry of Health and Long-Term Care. The province requires hospitals to report a number of factors that are good indicators of patient safety: several kinds of hospital-acquired infections; central-line primary bloodstream infections and ventilator-associated pneumonia; surgical-site infection prevention activity; and the hospital standardized mortality ratio. The user can explore the site and find that there are in fact wide variations across hospitals in the province. This is likely to change patient choice; but it also serves as an instant guide for regulatory agencies and local hospital administrators as they attempt to focus attention on poor management practices and institutional arrangements. (It would be helpful for the purpose of comparison if the data could be easily downloaded into a spreadsheet.)

On first principles, it seems likely that any country that has a hospital system in which the safety performance of each hospital is kept secret will also show a wide distribution of patient safety outcomes across institutions, and will have an overall safety record that is much lower than it could be. This is because secrecy gives hospital administrators the ability to conceal the risks their institutions impose on patients through bad practices. So publicity and regular publication of patient safety information seems to be a necessary precondition to maintaining a high-safety hospital system.

But here is the crucial point: many countries continue to permit secrecy when it comes to hospital safety. In particular, this seems to be true in France. The French medical and hospital system continues to display a very high degree of secrecy and opacity when it comes to patient safety. In fact, anecdotal information about French hospitals suggests a wide range of levels of hospital-acquired infection across hospitals. Hospital-acquired infections (infections nosocomiales) are an important and rising cause of patient morbidity and mortality. And there are well-known practices and technologies that substantially reduce the incidence of these infections. But the implementation of these practices requires strong commitment and dedication at the unit level; and this degree of commitment is unlikely to occur in an environment of secrecy.

In fact, I have not been able to find French counterparts to the tools that are now available for measuring patient safety in North American hospitals. But without this regular reporting, there is no mechanism through which institutions with bad safety performance can be “ratcheted” up into better practices and better safety outcomes. The impression given by the French medical system is that the doctors and the medical authorities are sacrosanct; patients are not expected to question their judgment, and the state appears not to require institutions to report and publish fundamental safety information. Patients have very little power, and the media so far seem to have paid little attention to the issues of patient safety in French hospitals. This 2007 article in Le Point seems to be a first for France in that it provides quantitative rankings of a large number of hospitals in their treatment of a number of diseases. But it does not provide the kinds of safety information — infections, falls, pneumonias — that are core measures of patient safety.

There is a French state agency, the Office National d’Indemnisation des Accidents Médicaux (ONIAM), that provides compensation to patients who can demonstrate that their injuries are the result of hospital-induced causes, including especially hospital-associated infections. But it appears that this agency is restricted to after-the-fact recognition of hospital errors rather than pro-active programs designed to reduce hospital errors. And here is a French government web site devoted to the issue of hospital infections. It announces a multi-pronged strategy for controlling the problem of infections nosocomiales, including the establishment of a national program of surveillance of the rates of these infections. So far, however, I have not been able to locate web resources that would provide hospital-level data about infection rates.

So I am offering a hypothesis that I would be very happy to see refuted: that the French medical establishment continues to be bureaucratically administered with very little public exposure of actual performance when it comes to patient safety. And without this system of publicity, it seems very likely that there are wide and tragic variations across French hospitals with regard to patient safety.

Are there French medical sociologists and public health researchers who are working on the issue of patient safety in French hospitals? Can good contemporary French sociologists like Céline Béraud, Baptiste Coulmont, and Philippe Masson offer some guidance on this topic (post)? If readers are aware of databases and patient safety research programs in France that are relevant to these topics, I would be very happy to hear about them.

Update: Baptiste Coulmont (blog) passes on this link to the Réseau d’alerte, d’investigation et de surveillance des infections nosocomiales (RAISIN) within the Institut de veille sanitaire. The site provides research reports and regional assessments of the incidence of nosocomial infections. It does not appear to provide data at the level of specific hospitals and medical centers. Baptiste refers also to work by Jean Peneff, a French medical sociologist and author of La France malade de ses médecins. Here is a link to a subsequent research report by Peneff. Thanks, Baptiste.

Safety as a social effect


Some organizations pose large safety issues for the public because of the technologies and processes they encompass. Industrial factories, chemical and nuclear plants, farms, mines, and aviation all represent sectors where safety issues are critically important because of the inherent risks of the processes they involve. However, “safety” is not primarily a technological characteristic; instead, it is an aggregate outcome that depends as much on the social organization and management of the processes involved as it does on the technologies they employ. (See an earlier posting on technology failure.)

We can define safety by relating it to the concept of “harmful incident”. A harmful incident is an occurrence that leads to injury or death of one or more persons. Safety is a relative concept, in that it involves analysis and comparison of the frequencies of harmful incidents relative to some measure of the volume of activity. If the claim is made that interstate highways are safer than county roads, this amounts to the assertion that there are fewer accidents per vehicle-mile on the former than the latter. If it is held that commercial aviation is safer than automobile transportation, this amounts to the claim that there are fewer harms per passenger-mile in air travel than auto travel. And if it is observed that the computer assembly industry is safer than the mining industry, this can be understood to mean that there are fewer harms per person-day in the one sector than the other. (We might give a parallel analysis of the concept of a healthy workplace.)
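
As a small illustration of this comparative notion of safety, here is a sketch with purely hypothetical numbers showing how incident counts must be normalized by a measure of exposure before two activities can be compared; the figures are invented and carry no empirical weight.

```python
# Hypothetical illustration of safety as a relative concept: harmful incidents
# are only comparable once they are normalized by a measure of exposure
# (vehicle-miles, passenger-miles, person-days, and so on).

def incident_rate(incidents, exposure):
    """Harmful incidents per unit of exposure."""
    return incidents / exposure

# Invented figures, purely for illustration.
interstate = incident_rate(incidents=120, exposure=1.0e9)   # per vehicle-mile
county_road = incident_rate(incidents=90, exposure=2.0e8)   # per vehicle-mile

print(f"interstate highways: {interstate:.2e} incidents per vehicle-mile")
print(f"county roads:        {county_road:.2e} incidents per vehicle-mile")
print("interstates safer" if interstate < county_road else "county roads safer")
```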

This analysis highlights two dimensions of industrial safety: the inherent capacity for creating harms associated with the technology and processes in use (heavy machinery, blasting, and uncertain tunnel stability in mining, in contrast to a computer and a red pencil in the editorial offices of a newspaper), and the processes and systems that are in place to guard against harm. The first set of factors is roughly “technological,” while the second set is social and organizational.

Variations in safety records across industries and across sites within a given industry provide an excellent tool for analyzing the effects of various institutional arrangements. It is often possible to pinpoint a crucial difference in organization — supervision, training, internal procedures, inspection protocols, etc. — that can account for a high accident rate in one factory and a low rate in an otherwise similar factory in a different state.

One of the most important findings of safety engineering is that organization and culture play critical roles in enhancing the safety characteristics of a given activity — that is to say, safety is strongly influenced by social factors that define and organize the behaviors of workers, users, or managers. (See Charles Perrow, Normal Accidents: Living with High-Risk Technologies and Nancy Leveson, Safeware: System Safety and Computers, for a couple of excellent treatments of the sociological dimensions of safety.)

This isn’t to say that only social factors can influence safety performance within an activity or industry. In fact, a central effort by safety engineers involves modifying the technology or process so as to remove the source of harm completely — what we might call “passive” safety. So, for example, if it is possible to design a nuclear reactor in such a way that a loss of coolant leads automatically to shutdown of the fission reaction, then we have designed out of the system the possibility of catastrophic meltdown and escape of radioactive material. This might be called “design for soft landings”.

However, most safety experts agree that the social and organizational characteristics of the dangerous activity are the most common causes of bad safety performance. Poor supervision and inspection of maintenance operations lead to mechanical failures, potentially harming workers or the public. A workplace culture that discourages disclosure of unsafe conditions makes the likelihood of accidental harm much greater. A communications system that permits ambiguous or unclear messages can lead to air crashes and wrong-site surgeries.

This brings us at last to the point of this posting: the observation that safety data in a variety of industries and locations permit us to probe organizational features and their effects with quite a bit of precision. This is a place where institutions and organizations make a big difference in observable outcomes; safety is a consequence of a specific combination of technology, behaviors, and organizational practices. This is a good opportunity for combining comparative and statistical research methods in support of causal inquiry, and it invites us to probe for the social mechanisms that underlie the patterns of high or low safety performance that we discover.

Consider one example. Suppose we are interested in discovering some of the determinants of safety records in deep mining operations. We might approach the question from several points of view.

  • We might select five mines with “best in class” safety records and compare them in detail with five “worst in class” mines. Are there organizational or technology features that distinguish the cases?
  • We might do the large-N version of this study: examine a sample of mines from “best in class” and “worst in class” and test whether there are observed features that explain the differences in safety records. (For example, we may find that 75% of the former group but only 10% of the latter group are subject to frequent unannounced safety inspection. This supports the notion that inspections enhance safety; a sketch of such a comparison appears after this list.)
  • We might compare national records for mine safety in, say, Poland and Britain. We might then attempt to identify the general characteristics that describe mines in the two countries and attempt to explain observed differences in safety records on the basis of these characteristics. Possible candidates might include degree of regulatory authority, capital investment per mine, workers per mine, …
  • We might form a hypothesis about a factor that should be expected to enhance safety — a company-endorsed safety education program, let’s say — and then randomly assign a group of mines to “treated” and “untreated” groups and compare safety records. (This is a quasi-experiment; see an earlier posting for a discussion of this mode of reasoning.) If we find that the treated group differs significantly in average safety performance, this supports the claim that the treatment is causally relevant to the safety outcome.
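
Here is a minimal sketch of the large-N comparison described in the second bullet above. The counts are invented, and the choice of a two-proportion z-test is my own assumption about how such a comparison might be run; it simply asks whether the difference between 75% and 10% inspection rates is larger than chance would suggest for samples of this size.

```python
# A sketch of a two-proportion z-test using only the standard library:
# do "best in class" and "worst in class" mines differ in how often they
# receive frequent unannounced safety inspections?

from math import sqrt, erf

def two_proportion_ztest(successes1, n1, successes2, n2):
    """Two-sided z-test for the difference between two independent proportions."""
    p1, p2 = successes1 / n1, successes2 / n2
    pooled = (successes1 + successes2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the normal approximation.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Hypothetical counts: 30 of 40 "best in class" mines vs. 4 of 40 "worst in
# class" mines are subject to frequent unannounced inspection (75% vs. 10%).
z, p = two_proportion_ztest(30, 40, 4, 40)
print(f"z = {z:.2f}, two-sided p = {p:.2g}")
```

A statistically significant difference of this kind would support, though not by itself establish, the causal claim that inspections enhance safety; that is where the mechanism-based reasoning discussed below comes in.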

Investigations along these lines can establish an empirical basis for judging that one or more organizational features A, B, C have consequences for safety performance. In order to be confident in these judgments, however, we need to supplement the empirical analysis with a theory of the mechanisms through which features like A, B, C influence behavior in such a way as to make accidents more or less likely.

Safety, then, seems to be a good area of investigation for researchers within the general framework of the new institutionalism, because the effects of institutional and organizational differences emerge as observable differences in the rates of accidents in comparable industrial settings. (See Mary Brinton and Victor Nee, The New Institutionalism in Sociology, for a collection of essays on this approach.)

Explaining technology failure


Technology failure is often spectacular and devastating — witness Bhopal, Three Mile Island, Chernobyl, the Challenger disaster, and the DC-10 failures of the 1970s. But in addition to being a particularly important cause of human suffering, technology failures are often very complicated social outcomes that involve a number of different kinds of factors. And this makes them interesting topics for social science study.

It is fairly common to attribute spectacular failures to a small number of causes — for example, faulty design, operator error, or a conjunction of unfortunate but singly non-fatal accidents. What sociologists who have studied technology failures have been able to add is the fact that the root causes of disastrous failures can often be traced back to deficiencies of the social organizations in which they are designed, used, or controlled (Charles Perrow, Normal Accidents: Living with High-Risk Technologies). Technology failures are commonly the result of specific social organizational defects; so technology failure is often or usually a social outcome, not simply a technical or mechanical misadventure. (Dietrich Dorner’s The Logic of Failure: Recognizing and Avoiding Error in Complex Situations is a fascinating treatment of a number of cases of failure; Eliot Cohen’s Military Misfortunes: The Anatomy of Failure in War provides an equally interesting treatment of military failures; for example, the American failure to suppress submarine attacks on merchant shipping off the US coast in the early part of World War II.)

First, a few examples. The Challenger space shuttle was destroyed as a result of O-rings in the rocket booster units that became brittle because of the low launch temperature — evidently an example of faulty design. But various observers have asked the more fundamental question: what features of the science-engineering-launch command process that was in place within NASA and between NASA and its aerospace suppliers led it to break down so profoundly (Diane Vaughan, The Challenger Launch Decision: Risky Technology, Culture, and Deviance at NASA)? What organizational defects made it possible for this extended group of talented scientists and engineers to come to the decision to launch over the specific warnings that were brought forward by the rocket provider’s team about the danger of a cold-temperature launch? Edward Tufte attributes the failure to poor scientific communication (Visual Explanations: Images and Quantities, Evidence and Narrative); Morton Thiokol engineer Roger Boisjoly attributes it to an excessively hierarchical and deferential relation between the engineers and the launch decision-makers. Either way, features of the NASA decision-making process — social-organizational features — played a critical role.

Bhopal represents another important case. Catastrophic failure of a Union Carbide pesticide plant in Bhopal, India in 1984 led to a release of a highly toxic gas. The toxic cloud passed into the densely populated city of Bhopal. Half a million people were affected, and between 16 and 30 thousand people died as a result. A chemical plant is a complex physical system. But even more, it is operated and maintained by a complex social organization, involving training, supervision, and operational assessment and oversight. In his careful case study of Bhopal, Paul Shrivastava maintains that this disaster was caused by a set of persistent and recurring organizational failures, especially in the areas of training and supervision of operators (Bhopal: Anatomy of Crisis).

Close studies of the nuclear disasters at Chernobyl and Three Mile Island have been equally fruitful in terms of shedding light on the characteristics of social, political, and business organization that have played a role in causing these great disasters. The stories are different in the two cases; but in each case, it turns out that social factors, including both organizational features internal to the nuclear plants and political features in the surrounding environment, played a role in the occurrence and eventual degree of destruction associated with the disasters.

These cases illustrate several important points. First, technology failures and disasters almost always involve a crucial social dimension — in the form of the organizations and systems through which the technology is developed, deployed, and maintained and the larger social environment within which the technology is situated. Technology systems are social systems. Second, technology failures therefore constitute an important subject matter for sociological and organizational research. Sociologists can shed light on the ways in which a complex technology might fail. And third, and most importantly, the design of safe systems — particularly systems that have the potential for creating great harms — needs to be an interdisciplinary effort. The perspectives of sociologists and organizational theorists need to be incorporated as deeply as those of industrial and systems engineers into the design of systems that will preserve a high degree of safety. This is an important realization for the high profile risky industries — aviation, chemicals, nuclear power. But it is also fundamental for other important social institutions, including especially hospitals and health systems. Safe technologies will only exist when they are embedded in safe, fault-tolerant organizations and institutions. And all of this means, in turn, that there is an urgent need for a sociology of safety.
