As systems grow increasingly complex, it becomes impossible to identify or test for every possible cause of failure, writes Guest Columnist Irving Wladawsky-Berger.
There is a continuing struggle between complexity and robustness in both evolution and human design. A kind of survival imperative, whether in biology or engineering, requires that simple, fragile systems become more robust. But the mechanisms that increase robustness will in turn make the system considerably more complex. Furthermore, that additional complexity brings its own unanticipated failure modes, which are corrected over time with further robustness mechanisms, which in turn add to the complexity of the system, and so on. This balancing act between complexity and robustness is never finished.
The classic approaches to safety assumed that accidents are caused by component failures or by human error. Making components highly reliable, by introducing fault tolerance techniques and planning for their failure, will thus help prevent accidents. Similarly, rewarding safe human behavior and punishing unsafe behavior will eliminate or significantly reduce accidents.
These assumptions no longer apply, especially for complex sociotechnical systems: systems that combine powerful digital technologies with the people and organizations that use and support them.