Pattern 9: Error and Recovery Patterns
Overview
Coordination structures include mechanisms for handling deviations from expected or desired system states. These mechanisms may detect errors immediately or after a delay, respond through predefined procedures or improvisation, and either contain effects locally or allow them to propagate across system boundaries.
Recovery approaches may focus on attribution and accountability, or on system restoration and learning. Response capabilities may be practiced and tested before they are needed, or developed during actual disruptions. The presence and characteristics of error handling structures affect both system resilience and the magnitude of disruptions.
These structural features appear where work involves complexity, interdependence, or operating conditions that make deviations possible: in stable operations, during change or high load, and when executing novel work.
Observable Manifestations
Small deviations or errors amplifying into larger system disruptions
Absence of defined procedures specifying actions when disruptions occur
Organizational responses to errors focusing on individual attribution rather than system restoration
Local corrective actions creating unanticipated problems in other system parts
Heightened urgency and stress responses when disruptions occur
Errors or deviations not being reported or communicated to relevant actors
Recovery procedures not tested or practiced before actual need
System coupling characteristics allowing rapid propagation of disruptions across boundaries
Buffer or redundant capacity unavailable when errors occur
Error responses improvised in the moment rather than following established structures
Structural Conditions
Work complexity creating possibilities for deviations from expected states
System interdependencies allowing local disruptions to affect other components
Detection mechanisms capable of identifying deviations from desired states
Communication channels through which error information can be transmitted
Authority structures enabling response activation when errors are detected
Cultural norms regarding error surfacing, attribution, and organizational learning
Reserve capacity or buffers capable of absorbing error effects
Organizational memory of past errors and recovery experiences
Boundaries
Not about individual competence or care in work execution
Not implying poor quality, carelessness, or organizational dysfunction
Not explaining why specific error and recovery structures exist in particular contexts
Not evaluating whether particular error structures are appropriate for contexts
Not addressing optimal error tolerance levels for specific situations
Not distinguishing necessary from unnecessary recovery mechanisms
Common Misattributions
Attributed to individual carelessness or incompetence when error detection mechanisms are structurally absent
Attributed to poor training when recovery protocols have not been defined or practiced
Attributed to quality control failure when system coupling creates unavoidable propagation
Attributed to blame culture when organizational structures incentivize error hiding
Attributed to lack of planning when buffer capacity is structurally unavailable
Attributed to individual panic when predefined response procedures do not exist
Attributed to coordination failure when local fixes create downstream effects in complex systems
The presence of this pattern does not imply poor quality control, careless execution, or a need for change. It describes observable error and recovery structures that exist in many functional and successful organizations. Explicit and implicit error handling approaches both persist in different organizational contexts for context-specific structural reasons.