Projects

A project is the top-level container in SuperBryn. Each project has an associated industry (Healthcare, Finance, Insurance, etc.) and contains one or more agents. Use projects to organize agents by product line, team, or use case.

Agents

An agent represents the voice AI you’re testing. Each agent has:
  • Name: Display name for the agent
  • Phone number: The number to call (E.164 format)
  • Type: inbound (agent receives calls) or outbound (agent makes calls)
  • Language & Gender: Used for voice matching in simulations
  • Provider: The platform hosting your agent (Vapi, Retell, LiveKit, etc.)
  • Call flow: A directed graph representing the conversation structure
  • Policy & Guardrails: Rules and constraints the agent must follow
  • Ring config: Which rings are enabled for testing

Call Flows

A call flow is a directed graph (JSON) that models your agent’s conversation structure. It consists of:
  • START nodes — entry points
  • PROCESS nodes — actions the agent takes (greeting, collecting info, etc.)
  • DECISION nodes — branching points based on user input
  • END nodes — call termination points
Edges between nodes are labeled with the conditions or transitions that connect them. You can build flows visually in the Flow Editor, import from JSON, or auto-generate from a text description.
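As a concrete illustration, a minimal flow with the four node types might look like the following. The field names ("nodes", "edges", "condition") are assumptions for illustration, not SuperBryn’s actual JSON schema.

```python
# Illustrative call-flow graph; node/edge field names are assumptions,
# not SuperBryn's actual JSON schema.
flow = {
    "nodes": [
        {"id": "start", "type": "START"},
        {"id": "greet", "type": "PROCESS", "label": "Greet the caller"},
        {"id": "intent", "type": "DECISION", "label": "What does the caller want?"},
        {"id": "order", "type": "PROCESS", "label": "Collect order details"},
        {"id": "done", "type": "END"},
    ],
    "edges": [
        {"from": "start", "to": "greet"},
        {"from": "greet", "to": "intent"},
        {"from": "intent", "to": "order", "condition": "caller wants to order"},
        {"from": "intent", "to": "done", "condition": "caller hangs up"},
        {"from": "order", "to": "done"},
    ],
}

def validate(flow):
    """Check that every edge references a declared node."""
    ids = {n["id"] for n in flow["nodes"]}
    return all(e["from"] in ids and e["to"] in ids for e in flow["edges"])
```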

Paths

A path is a specific traversal through the call flow graph — for example, “happy path where user places an order successfully” vs. “error path where user cancels mid-flow.” Scenarios are generated per-path, ensuring coverage across all branches.
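Per-path coverage can be pictured as enumerating every START-to-END traversal of the graph with a depth-first search. The tiny adjacency list and node names below are illustrative, not a real SuperBryn flow.

```python
# Enumerate every START-to-END traversal of a call-flow graph.
# The adjacency list below is illustrative, not a real SuperBryn flow.
edges = {
    "start": ["greet"],
    "greet": ["intent"],
    "intent": ["order", "cancel"],
    "order": ["end"],
    "cancel": ["end"],
    "end": [],
}

def all_paths(node, goal="end", seen=()):
    """Depth-first enumeration of simple paths from node to goal."""
    if node == goal:
        return [list(seen) + [node]]
    paths = []
    for nxt in edges[node]:
        if nxt not in seen:  # skip nodes already on this path (avoids cycles)
            paths.extend(all_paths(nxt, goal, seen + (node,)))
    return paths

paths = all_paths("start")
# Two paths: the "order" happy path and the "cancel" error path.
```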

The Ring System

The Ring System is SuperBryn’s core testing framework: nine concentric rings of increasing difficulty, each targeting a different dimension of voice agent quality:
  • Ring 0 (Setup & Wiring): Structural integrity and configuration
  • Ring 1 (Does the Job): Functional correctness under expected inputs
  • Ring 2 (Plays by the Rules): Policy and compliance adherence
  • Ring 3 (Hard to Trick): Adversarial robustness (jailbreaks, prompt injection)
  • Ring 4 (Speech Variations): Accents, slang, typos, ASR errors
  • Ring 5 (Handles Real People): Interruptions, emotion, timing, disfluency
  • Ring 6 (Works in Real World): Background noise, bad networks
  • Ring 7 (Handles Chaos): Multiple failure modes combined
  • Ring 8 (Stays Good Over Time): Regression and temporal stability
See The Ring System for detailed documentation on each ring.

Scenarios

A scenario is an individual test case. Each scenario contains:
  • Ring ID: Which ring this scenario tests
  • Intent: What the simulated caller wants (from the agent’s perspective)
  • User Perspective: How the caller describes their situation (second person)
  • Expected Outcome: Step-by-step expected agent behavior
  • Path: Which call flow path this scenario exercises
  • Variant Type (Ring 2): Which policy rule is being tested
  • Behavior Modifiers: Optional adjustments (language, interruption timing, noise, etc.)
Scenarios can be generated by AI or created manually.

Evaluators

An evaluator is a container that groups scenarios for a given agent. Each agent has a default evaluator. When you generate or create scenarios, they’re stored under the agent’s evaluator.

Simulations

A simulation run is a single execution of a scenario — one phone call between SuperBryn’s AI caller and your voice agent. Each run tracks:
  • Status: pending → running → completed / failed
  • Recording: Full audio of the call
  • Transcript: Turn-by-turn conversation text
  • Metrics: Latency, interruptions, silence, words per minute, speaking ratios
  • Evaluation: LLM-powered step-by-step pass/fail against the expected outcome
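Metrics like words per minute and speaking ratio can be derived from turn timestamps in the transcript. The transcript format and formulas below are illustrative assumptions, not SuperBryn’s actual implementation.

```python
# Sketch of deriving words-per-minute and speaking ratio from a
# turn-level transcript; format and formulas are illustrative assumptions.
transcript = [
    {"speaker": "agent",  "start": 0.0, "end": 4.0,  "text": "Hello, how can I help you today?"},
    {"speaker": "caller", "start": 4.5, "end": 7.0,  "text": "I want to place an order."},
    {"speaker": "agent",  "start": 7.5, "end": 11.5, "text": "Sure, what would you like?"},
]

def words_per_minute(turns, speaker):
    """Words spoken by `speaker`, normalized by their own talk time."""
    words = sum(len(t["text"].split()) for t in turns if t["speaker"] == speaker)
    minutes = sum(t["end"] - t["start"] for t in turns if t["speaker"] == speaker) / 60
    return words / minutes

def speaking_ratio(turns, speaker):
    """Fraction of total talk time attributable to `speaker`."""
    spoken = sum(t["end"] - t["start"] for t in turns if t["speaker"] == speaker)
    total = sum(t["end"] - t["start"] for t in turns)
    return spoken / total
```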

Batch Requests

When you click Run Calls, SuperBryn creates a batch request that expands your selected scenarios × runs-per-scenario into individual simulation jobs. Jobs are processed concurrently through a queue system.
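The expansion step can be sketched in a few lines: each selected scenario is crossed with the requested run count to yield individual pending jobs. The job fields are illustrative assumptions.

```python
from itertools import product

# How a batch request might expand scenarios × runs-per-scenario into
# individual simulation jobs; job fields are illustrative assumptions.
def expand_batch(scenario_ids, runs_per_scenario):
    return [
        {"scenario_id": sid, "run_index": run, "status": "pending"}
        for sid, run in product(scenario_ids, range(runs_per_scenario))
    ]

jobs = expand_batch(["s1", "s2", "s3"], 2)
# 3 scenarios × 2 runs = 6 pending simulation jobs
```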

Knowledge Base

Knowledge base documents are reference materials you upload for your agent — product documentation, FAQs, menu data, policy documents. These are injected into the scenario generation prompts so the AI generates contextually accurate test cases.

Agent Versions

SuperBryn tracks agent configuration changes through versions. When you make material changes (call flow, policies, LLM provider, etc.), you can fork a new version. This creates an immutable snapshot, enabling you to compare test results across configuration changes over time.
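The fork-a-version step can be pictured as archiving an immutable snapshot of the current config before bumping the version number, so later edits never disturb earlier results. The dict layout ("version", "config", "history") is an illustrative assumption.

```python
import copy

# Sketch of forking an agent version: archive the current config as an
# immutable snapshot, then continue editing on the new version.
# The dict layout is an illustrative assumption, not SuperBryn's schema.
def fork_version(agent):
    # Archive the current config under its version number...
    agent["history"].append(
        {"version": agent["version"], "config": copy.deepcopy(agent["config"])}
    )
    # ...then bump the version; the live config can now change freely.
    agent["version"] += 1
    return agent

agent = {"version": 1, "config": {"llm": "gpt-4o", "flow": "v1"}, "history": []}
fork_version(agent)
agent["config"]["llm"] = "claude"  # edit on v2; the v1 snapshot is untouched
```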