Overview

The Report page provides an aggregate view of your agent’s test results across all rings and simulation runs. Use it to understand your agent’s overall reliability posture and identify areas that need improvement.

What’s included

Ring-level summary

See pass/fail rates broken down by ring. Quickly identify which dimensions of reliability your agent struggles with:
  • High pass rate on Ring 1 but low on Ring 5? Your agent handles happy paths but breaks with interruptions.
  • Ring 2 failures? Policy enforcement needs work.
  • Ring 6 failures? The STT pipeline struggles with noise.
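
If you export results and want to reproduce a ring-level breakdown yourself, the calculation might look like the sketch below. The record shape (`ring`, `passed`) and the sample data are assumptions made for illustration; they are not a documented export format.

```python
from collections import defaultdict

# Hypothetical result records; the field names are assumed, not a documented schema.
results = [
    {"ring": 1, "passed": True},
    {"ring": 1, "passed": True},
    {"ring": 2, "passed": False},
    {"ring": 2, "passed": True},
    {"ring": 5, "passed": False},
]


def pass_rate_by_ring(records):
    """Return {ring: fraction of records that passed}, ordered by ring number."""
    totals = defaultdict(int)
    passes = defaultdict(int)
    for rec in records:
        totals[rec["ring"]] += 1
        if rec["passed"]:
            passes[rec["ring"]] += 1
    return {ring: passes[ring] / totals[ring] for ring in sorted(totals)}


rates = pass_rate_by_ring(results)
for ring, rate in rates.items():
    print(f"Ring {ring}: {rate:.0%} pass")
```

In this sample, a perfect Ring 1 next to a failing Ring 5 matches the first pattern in the list above: happy paths work, interruptions do not.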

Job summary

Aggregate statistics across all simulation jobs:
  • Total jobs run
  • Pass / fail / pending breakdown
  • Average call duration
  • Average latency
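
As a rough illustration of how these statistics relate to individual jobs, the sketch below computes them from a list of job records. The field names (`status`, `duration_s`, `latency_ms`), the units, and the choice to leave pending jobs out of the averages are all assumptions; the Report page may compute its figures differently.

```python
from collections import Counter
from statistics import mean

# Hypothetical job records; field names and units are assumed for illustration.
jobs = [
    {"status": "pass", "duration_s": 42.0, "latency_ms": 610.0},
    {"status": "fail", "duration_s": 58.0, "latency_ms": 890.0},
    {"status": "pending", "duration_s": None, "latency_ms": None},
]


def average(values):
    """Mean of the non-None values, or None if there are none."""
    present = [v for v in values if v is not None]
    return mean(present) if present else None


def summarize_jobs(records):
    """Aggregate job counts and averages, skipping jobs without measurements."""
    counts = Counter(rec["status"] for rec in records)
    return {
        "total": len(records),
        "pass": counts["pass"],
        "fail": counts["fail"],
        "pending": counts["pending"],
        "avg_duration_s": average(rec["duration_s"] for rec in records),
        "avg_latency_ms": average(rec["latency_ms"] for rec in records),
    }


summary = summarize_jobs(jobs)
print(summary)
```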

Using reports for iteration

Reports help you prioritize improvements:
  1. Run Rings 0-2 first — establish basic correctness
  2. Review the report — identify failing rings
  3. Fix the root cause — update call flow, policies, or agent prompt
  4. Re-run failing rings — verify the fix
  5. Progress to higher rings — test robustness and stress

The report updates automatically as new simulation results come in.
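
The iteration loop above can be expressed as a small decision helper. This is only a sketch: the `pass_rates` mapping, the 100% pass threshold, and the returned action labels are hypothetical and not part of the product.

```python
BASELINE_RINGS = (0, 1, 2)  # Rings 0-2, run first to establish basic correctness


def next_step(pass_rates, threshold=1.0):
    """Suggest the next action from a {ring: pass_rate} mapping.

    Returns an (action, rings) tuple, where action is one of:
      "run"      - baseline rings that have no results yet
      "rerun"    - rings below the threshold, to re-run after a fix
      "progress" - nothing failing, so move on to higher rings
    """
    missing = [ring for ring in BASELINE_RINGS if ring not in pass_rates]
    if missing:
        return ("run", missing)
    failing = sorted(ring for ring, rate in pass_rates.items() if rate < threshold)
    if failing:
        return ("rerun", failing)
    return ("progress", [])


print(next_step({0: 1.0, 1: 1.0}))          # Ring 2 has not been run yet
print(next_step({0: 1.0, 1: 0.8, 2: 1.0}))  # Ring 1 needs a fix and a re-run
print(next_step({0: 1.0, 1: 1.0, 2: 1.0}))  # ready for higher rings
```

A threshold of 1.0 treats any failure as blocking. A lower value may be more practical if some scenarios are known to be flaky.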