What it tests
Ring 8 answers: Does behavior remain correct across updates, drift, and retrains? Voice agents degrade over time. Model updates change behavior. Prompt tweaks have unintended side effects. Provider changes introduce subtle differences. Ring 8 is the regression safety net: it re-runs tests to catch drift before it reaches production.
Prerequisites
None.
How it works
Ring 8 enables continuous evaluation:
- Baseline capture: After your agent passes Rings 0-7, those results become the behavioral baseline
- Re-run on change: When you update your agent (new model, prompt changes, provider switch), Ring 8 re-runs the same scenarios
- Diff analysis: Results are compared against the baseline to detect behavioral changes — both regressions (things that broke) and improvements
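The baseline-and-diff idea can be sketched in a few lines of Python. This is an illustrative sketch only, not the product's actual API: the function name `diff_against_baseline`, the per-scenario score in the range 0 to 1, the `tolerance` threshold, and the scenario names are all assumptions made for the example.

```python
# Hypothetical sketch of diff analysis against a stored baseline.
# Scores, names, and the tolerance value are illustrative assumptions.

def diff_against_baseline(
    baseline: dict[str, float],
    current: dict[str, float],
    tolerance: float = 0.05,
) -> dict[str, list[str]]:
    """Classify each baseline scenario as regressed, improved, unchanged, or missing."""
    report: dict[str, list[str]] = {
        "regressions": [],
        "improvements": [],
        "unchanged": [],
        "missing": [],
    }
    for scenario_id, base_score in baseline.items():
        if scenario_id not in current:
            # The scenario was not re-run, so no comparison is possible.
            report["missing"].append(scenario_id)
            continue
        delta = current[scenario_id] - base_score
        if delta < -tolerance:
            report["regressions"].append(scenario_id)
        elif delta > tolerance:
            report["improvements"].append(scenario_id)
        else:
            report["unchanged"].append(scenario_id)
    return report


# Baseline captured after the agent passed the earlier rings (made-up scores).
baseline = {"refund_request": 0.90, "address_change": 0.80, "cancel_order": 0.75}

# Scores from re-running the same scenarios after a change (made-up scores).
current = {"refund_request": 0.70, "address_change": 0.95, "cancel_order": 0.75}

report = diff_against_baseline(baseline, current)
print(report)
```

A tolerance band is used here so that small run-to-run variation is not reported as a behavioral change; the right threshold depends on how noisy the scenarios are.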
When to use it
- After updating the underlying LLM
- After changing prompts or system instructions
- After switching voice providers (STT/TTS)
- After modifying call flows or policies
- On a regular schedule to catch gradual drift

