Playground

Try the AI on real or made-up messages without sending anything. See drafts, sources, confidence, and where coverage is thin.

By Christopher · Updated · 3 min read

The Playground is a sandbox for the AI. You paste a message, pick a model and channel, and see what the AI would draft. Nothing gets sent. No customer is involved.

It is the right place to test voice changes, KB updates, model swaps, and confidence floors before any of them affect your inbox.

What you can do

Three things:

  1. Draft a sample message. Type or paste a question. Get back the draft, the cited articles, the confidence score, the tokens, and the cost.
  2. Replay a real conversation. Pull any past conversation in (you can pick from your inbox) and re-run the AI with current settings. Useful after voice or KB changes.
  3. Find coverage gaps. Browse a list of past tickets where the AI stayed silent due to low confidence. Those are the topics your brain (the KB graph) is thin on.

Sample message flow

  1. Open AI → Playground.
  2. Pick a channel (this drives voice, length, and confidence floor).
  3. Pick a model (defaults to your channel's default).
  4. Paste the customer message.
  5. Click Generate.

You see:

  • The draft.
  • Confidence score for the reply.
  • Articles cited and the snippets the AI used.
  • Tokens in, tokens out, dollar cost.
  • Predicted action: would auto-send, would draft, would suggest, would stay silent.
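The predicted action is just the confidence score measured against the channel's floors. As a minimal sketch (the threshold names and default values here are hypothetical placeholders, not Ochre settings):

```python
def predicted_action(confidence: float,
                     auto_send_floor: float = 0.90,
                     draft_floor: float = 0.70,
                     suggest_floor: float = 0.50) -> str:
    """Map a confidence score to the action the AI would take.

    The floors are illustrative defaults; in Ochre they come from the
    channel's confidence-floor settings.
    """
    if confidence >= auto_send_floor:
        return "auto-send"
    if confidence >= draft_floor:
        return "draft"
    if confidence >= suggest_floor:
        return "suggest"
    return "stay silent"
```

Raising a floor shifts borderline tickets toward the quieter action, which is exactly what the Playground lets you preview before it touches real conversations.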

The result is identical to what would happen on a real ticket with these settings.

Replay flow

  1. Click Replay.
  2. Pick a past conversation.
  3. The AI re-runs the conversation against your current settings.
  4. Compare the new draft to what was actually sent.

Use this when:

  • You changed Voice and tone. Replay 10 past tickets to see if voice now matches.
  • You added articles to The brain: KB graph. Replay tickets that were drafted without good citations.
  • You switched models in Choosing a model. Replay 20 tickets and eyeball the diff.
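"Eyeball the diff" can be done by hand, or with a few lines of standard tooling. A rough sketch using Python's stdlib (the helper name is illustrative, not an Ochre API):

```python
import difflib

def draft_diff(sent_reply: str, replayed_draft: str) -> str:
    """Unified diff between the reply that was actually sent and the
    draft the AI produced on replay with current settings."""
    return "\n".join(difflib.unified_diff(
        sent_reply.splitlines(),
        replayed_draft.splitlines(),
        fromfile="sent",
        tofile="replayed",
        lineterm="",
    ))
```

Lines prefixed with `-` are from the original reply, `+` lines are from the replayed draft; unchanged lines appear once with a leading space.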

Coverage gaps

The coverage view lists past tickets where the AI's confidence fell below your silence floor. These are the questions your KB does not have a good answer to.

Each row shows:

  • The customer's message.
  • The reason for low confidence (no relevant articles, ambiguous topic, conflicting articles).
  • Suggested actions (write an article, merge duplicates, raise the silence floor).
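If you want to prioritize a coverage sprint, counting gaps by reason tells you where to start. A minimal sketch, assuming tickets exported as dicts with hypothetical `confidence` and `reason` fields (the real field names would come from Ochre's coverage view):

```python
from collections import Counter

def top_coverage_gaps(tickets, silence_floor=0.5, n=5):
    """Count low-confidence reasons among tickets below the silence floor.

    `tickets` is a list of dicts with illustrative keys "confidence"
    and "reason". Returns the n most common reasons with counts.
    """
    reasons = [t["reason"] for t in tickets if t["confidence"] < silence_floor]
    return Counter(reasons).most_common(n)
```

The most common reason is usually "no relevant articles", and each of those clusters is one article waiting to be written.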

This is the most useful page in Ochre for tightening AI coverage. Read it weekly for the first month.

Spend in the Playground

Playground requests cost the same as production requests. Tokens go to your provider. Spend is tracked in the receipts feed but does not count against your Spend caps and alerts.

This is intentional. The cap protects your inbox. The Playground is for tuning. They live in different buckets.
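The dollar cost shown per run is just tokens priced at your provider's rates. As an illustrative sketch (the rates are placeholders; use whatever your model provider actually charges):

```python
def request_cost(tokens_in: int, tokens_out: int,
                 price_in_per_1k: float, price_out_per_1k: float) -> float:
    """Dollar cost of one run, with prices quoted per 1,000 tokens.

    Rates are provider-specific; the values in the example below are
    made up for illustration.
    """
    return (tokens_in / 1000 * price_in_per_1k
            + tokens_out / 1000 * price_out_per_1k)
```

So a run with 1,000 tokens in and 500 tokens out, at $0.01/1k in and $0.03/1k out, costs $0.025, whether it happens in the Playground or on a live ticket.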

Who can use the Playground

Owners, Admins, and Agents. Light agents do not have access.

There is no rate limit beyond your provider's account limits.

Saving Playground sessions

Each Playground run is logged. You can pin a run, share it with the team, or attach it to an internal note on a ticket. Useful for "this is what the AI would say if we changed X".

Limitations

  • Playground does not simulate routing rules. It tests drafting only.
  • Playground does not run downstream workflows triggered by Auto-labeling.
  • Playground does not test Quality assurance review flows.

For end-to-end testing, the right move is to enable the AI in draft mode on a low-volume channel and watch for a day.

Common patterns

  • Pre-launch checklist. Run 20 sample tickets through the Playground before flipping Autopilot mode on.
  • KB review. When you publish a new article, replay the three tickets that should have been answered by it.
  • Voice review. After every voice change, replay 10 tickets to confirm the tilt.
  • Coverage sprint. Once a week, write articles for the top five coverage gaps.
