Ring 7 -- Handles Chaos

What it tests

Ring 7 answers: What breaks when multiple failure modes interact at once? Rings 1-6 each test one dimension in isolation. In production, problems compound: a frustrated caller with a heavy accent calls from a noisy street and tries to change their order mid-flow. Ring 7 stacks failure modes from Rings 1-6 to find these compound failures.

Prerequisites

None, but Ring 7 is most valuable after Rings 1-6 have been run, because it builds its compound scenarios from their results.

How it works

Ring 7 is automatically triggered after simulation batches for Rings 1-6 complete for an agent:

The system analyzes results from Rings 1-6 to identify failure patterns
It generates compound scenarios that stack multiple failure modes (e.g., accent + noise + interruption + policy edge case)
These scenarios are automatically simulated
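
The selection step above can be pictured with a minimal sketch. This is not the product's implementation: the FailureMode type, the failure_rate field, the ranking by summed failure rate, and the rule that each stack draws from distinct rings are all assumptions made for illustration, and the numbers are invented.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class FailureMode:
    """Hypothetical record of one failure pattern found in Rings 1-6."""
    name: str
    ring: int             # ring (1-6) that surfaced this failure mode
    failure_rate: float   # assumed metric: share of failed simulations


def compound_scenarios(modes, stack_size=3, top_n=3):
    """Return the top_n stacks of failure modes, ranked by summed failure rate.

    Assumption: a stack uses at most one mode per ring, so every scenario
    combines distinct dimensions.
    """
    candidates = []
    for combo in combinations(modes, stack_size):
        if len({m.ring for m in combo}) < stack_size:
            continue  # two modes from the same ring: skip this stack
        score = sum(m.failure_rate for m in combo)
        candidates.append((score, combo))
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    return [combo for _, combo in candidates[:top_n]]


# Illustrative input only; these rates are made up.
modes = [
    FailureMode("accented speech", 4, 0.30),
    FailureMode("background noise", 6, 0.25),
    FailureMode("frustrated tone", 5, 0.20),
    FailureMode("interruption", 5, 0.10),
    FailureMode("policy edge case", 2, 0.05),
]

best = compound_scenarios(modes)[0]
print(" + ".join(f"{m.name} (Ring {m.ring})" for m in best))
# accented speech (Ring 4) + background noise (Ring 6) + frustrated tone (Ring 5)
```

Ranking by summed failure rate is only one plausible heuristic; a real selector could weight interactions between modes rather than treating them as independent.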

Example compound scenarios

Accented speech (Ring 4) + background noise (Ring 6) + frustrated tone (Ring 5)
Policy edge case (Ring 2) + prompt injection attempt (Ring 3) + interruption (Ring 5)
Off-topic digression (Ring 5) + poor network (Ring 6) + flow branch change (Ring 1)

What it catches

Failures that only appear when multiple stressors interact
Pipeline cascades where STT errors compound into LLM misinterpretation
Edge cases where individually passing conditions combine into failure
Performance degradation under compound stress

Why auto-trigger?

Manually composing compound scenarios is combinatorially expensive and requires knowing in advance which failure modes to combine. Because Ring 7 triggers automatically after Rings 1-6 and uses their results, it can prioritize the combinations most likely to fail.
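
To give a rough sense of the combinatorial cost, the sketch below counts how many distinct stacks exist for a given number of failure modes. The counts of 12 and 24 modes and the stack sizes of 2 to 4 are hypothetical figures chosen for illustration, not numbers from the product.

```python
from math import comb


def stack_count(num_modes: int, max_stack: int) -> int:
    """Count distinct stacks of 2..max_stack modes chosen from num_modes."""
    return sum(comb(num_modes, k) for k in range(2, max_stack + 1))


print(stack_count(12, 4))  # 66 + 220 + 495 = 781
print(stack_count(24, 4))  # 276 + 2024 + 10626 = 12926
```

Doubling the number of failure modes multiplies the candidate stacks by more than sixteen in this example, which is why exhaustive manual composition does not scale.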
