AI & Alignment
Working as Designed: Diagnosing Alignment Failure Without Error
In production (complete)
A diagnosis of structural failure where every component works properly and yet the overall regime becomes misaligned.
Function in corpus
Central diagnostic paper for the applied half of the corpus. Translates salience and governance arguments into a single form of failure that became a major public-facing theme.
Details
The paper's key move is to detach failure from component malfunction. A system can satisfy its objectives, follow its procedures, and produce efficient outputs while still reconfiguring responsibility and consequence in ways that make the larger regime fail. This is why 'working as designed' can be an alignment failure rather than a rebuttal to one.
• Connected papers: Salience Misalignment; Consequence-Path Audits; Structural Compatibility; What the System Cannot See
Availability
This paper is listed for orientation and dependency tracking. No public PDF or Zenodo record is linked yet.