Profiles · Scoped identity per agent
Allowlists · Least-privilege tool access
Vaults · Bound secrets, not env vars
Sandboxes · Docker-first execution
Policy Gates · Human approval in the loop
Telemetry · Forensic replay & export
Six primitives. Each one composable — you can adopt them incrementally.
Profiles: every agent gets a named identity with bounded permissions.
Allowlists: the only tools the agent can call are the ones you explicitly permit.
Vaults: secrets are injected at runtime, scoped per agent, never in environment variables.
Sandboxes: code execution happens in Docker — isolated, reproducible.
Policy gates: operators define when a human must approve before the agent proceeds.
Telemetry: every call, every decision, captured for replay and forensic audit.
Act 2 · Colosseum
Policy & approvals — the human gate
// Example: require approval for any shell.exec that writes outside the sandbox
policy.Rule{
    Tool:   "shell.exec",
    Match:  func(a Args) bool { return writesOutside(a.Cwd, a.Cmd) },
    Action: policy.RequireApproval,
}
The agent pauses, a notification goes to the UI, a human approves or rejects — with a reason, captured in the audit log.
Policy is Go code you can review and test, not prompt text you have to trust. The human
approval loop is first-class in the UI — the presenter will show this in the demo.
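A minimal sketch of how a rule like the one above could be evaluated, to make the "policy is Go code you can review and test" point concrete. Everything here is an assumption for illustration, not Colosseum's actual API: the local `Rule`, `Args`, and `Action` types, the `Target` field, the `outsideSandbox` helper (a simplified stand-in for `writesOutside`, which would need to inspect the command itself), and the choice to allow calls that match no rule.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// Action is what the gate tells the step loop to do with a tool call.
type Action int

const (
	Allow Action = iota
	RequireApproval
)

// Args is a hypothetical argument bundle for a tool call.
type Args struct {
	Cwd    string
	Target string // hypothetical: the path the command intends to write
}

// Rule mirrors the shape of the slide's policy.Rule, defined locally.
type Rule struct {
	Tool   string
	Match  func(a Args) bool
	Action Action
}

// outsideSandbox reports whether target, resolved against cwd, falls outside root.
func outsideSandbox(root, cwd, target string) bool {
	if !filepath.IsAbs(target) {
		target = filepath.Join(cwd, target)
	}
	rel, err := filepath.Rel(root, filepath.Clean(target))
	if err != nil {
		return true // cannot prove it is inside, so treat it as outside
	}
	return rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

// Evaluate returns the action of the first matching rule, or Allow if none match.
func Evaluate(rules []Rule, tool string, a Args) Action {
	for _, r := range rules {
		if r.Tool == tool && r.Match(a) {
			return r.Action
		}
	}
	return Allow
}

func main() {
	rules := []Rule{{
		Tool:   "shell.exec",
		Match:  func(a Args) bool { return outsideSandbox("/sandbox", a.Cwd, a.Target) },
		Action: RequireApproval,
	}}
	escape := Args{Cwd: "/sandbox/work", Target: "../../etc/passwd"}
	local := Args{Cwd: "/sandbox/work", Target: "out.txt"}
	fmt.Println(Evaluate(rules, "shell.exec", escape) == RequireApproval)
	fmt.Println(Evaluate(rules, "shell.exec", local) == RequireApproval)
}
```

Because the rule is an ordinary function, it can be covered by table-driven unit tests like any other Go code.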
Act 2 · Colosseum
Telemetry — the run is the evidence
Runs
Steps
Tool calls
Events
Spans
Artifacts
Counter to "I can't prove what the agent did." You can. It's in SQLite. It's queryable. It's exportable.
This is the part security teams care about most. The execution graph is what turns an agent
incident from a mystery into a ticket.
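To show what "the execution graph" buys you, here is a small in-memory sketch of a step tree and a walk that lists every denied tool call. The `Step` and `ToolCall` types and their fields are assumptions made for this example; they are not Colosseum's schema, and the real data lives in SQLite rather than in Go structs.

```go
package main

import "fmt"

// ToolCall is one recorded invocation. Field names are illustrative.
type ToolCall struct {
	Tool   string
	Denied bool
}

// Step is a node in the run's step tree.
type Step struct {
	Name     string
	Calls    []ToolCall
	Children []Step
}

// Walk visits every tool call depth-first, passing the path of step names.
func Walk(s Step, path string, visit func(path string, c ToolCall)) {
	p := path + "/" + s.Name
	for _, c := range s.Calls {
		visit(p, c)
	}
	for _, child := range s.Children {
		Walk(child, p, visit)
	}
}

// DeniedCalls answers "what did the agent try that was blocked?"
func DeniedCalls(root Step) []string {
	var out []string
	Walk(root, "", func(p string, c ToolCall) {
		if c.Denied {
			out = append(out, p+": "+c.Tool)
		}
	})
	return out
}

func main() {
	run := Step{Name: "run", Children: []Step{
		{Name: "plan", Calls: []ToolCall{{Tool: "web.fetch"}}},
		{Name: "act", Calls: []ToolCall{{Tool: "shell.exec", Denied: true}}},
	}}
	for _, d := range DeniedCalls(run) {
		fmt.Println(d)
	}
}
```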
Act 2 · Colosseum
Replay & export — reproducible forensics
The export bundle is your incident write-up's attachment.
When something goes wrong, you don't ask the vendor for their side of the story. You have
the full trace, the tool responses, the policy decisions, and you can rerun them against
a fixed agent to prove the fix.
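One way to picture the rerun: serve the recorded tool responses back in order and stop at the first call that differs from the trace. The sketch below assumes a hypothetical `Recorded` entry shape and a `Replayer` type; the real export bundle format is not shown in these slides.

```go
package main

import (
	"errors"
	"fmt"
)

// Recorded is one tool call captured in an export bundle (illustrative shape).
type Recorded struct {
	Tool   string
	Input  string
	Output string
}

// ErrDiverged marks the point where a replayed agent leaves the recorded trace.
var ErrDiverged = errors.New("replay diverged from recorded trace")

// Replayer serves recorded outputs in order instead of calling real tools.
type Replayer struct {
	Trace []Recorded
	next  int
}

// Call returns the recorded output if the call matches the next trace entry.
func (r *Replayer) Call(tool, input string) (string, error) {
	if r.next >= len(r.Trace) {
		return "", fmt.Errorf("%w: trace exhausted at %s", ErrDiverged, tool)
	}
	rec := r.Trace[r.next]
	if rec.Tool != tool || rec.Input != input {
		return "", fmt.Errorf("%w: step %d recorded %s, got %s", ErrDiverged, r.next, rec.Tool, tool)
	}
	r.next++
	return rec.Output, nil
}

func main() {
	r := &Replayer{Trace: []Recorded{
		{Tool: "web.fetch", Input: "https://example.com/ticket/1", Output: "ticket body"},
	}}
	out, err := r.Call("web.fetch", "https://example.com/ticket/1")
	fmt.Println(out, err)
	_, err = r.Call("shell.exec", "rm -rf /tmp/x")
	fmt.Println(errors.Is(err, ErrDiverged))
}
```

A divergence is itself evidence: if the fixed agent no longer issues the call that caused the incident, the replay stops exactly there.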
Act 3 · So what
Live Demo
DEMO SCRIPT — keep this window open.
1. Open Agents → support-triage-bot. Show the sectioned config: identity, prompts,
tools, credentials, planning, contract.
2. Toggle planning_mode = required. Save.
3. Set output_contract = schemas/triage-v1.json. Save.
4. Kick off a run with a sample ticket. Narrate: "This plan is the first thing a human sees."
5. Approve the plan. Watch the telemetry stream into Run Detail live.
6. Hit the shell.exec approval gate. Approve with a reason. Show the approval in the audit log.
7. Show the artifacts tab — the JSON output matching the contract.
8. Open the run graph. Walk the step tree. Point at the recorded tool calls.
9. Replay the same run with a narrower allowlist. Show that the denied call is visible as an event.
If anything goes sideways: append ?demo=video to the URL and the fallback recording takes over.
Act 3 · So what
Confident deployment
Ship the harness, not the hope.
The eval industry is booming. Good. But evals are a pre-deployment discipline. Once the
agent is live, what keeps you safe is the runtime: allowlists, approvals, contracts, audit.
Act 3 · So what
Incident response, rebuilt for agents
What did it do?
Who authorized it?
Can it happen again?
Who else is affected?
The execution graph is the audit trail.
Frame this in IR language. MTTR for agent incidents today is effectively infinite because
the evidence doesn't exist. Once it does, this is just a debugging exercise.
Act 3 · So what
Colosseum vs. "just the SDK"
Capability                    Raw SDK     Hosted agent product    Colosseum
Tool allowlist per agent      -           partial                 yes
Scoped credential vaults      env vars    vendor                  yes
Human-in-loop approvals       DIY         limited                 yes
Output contracts              DIY         partial                 yes
Full telemetry & replay       -           vendor-locked           yes
Runs on your infra            yes         -                       yes
Single binary, no cluster     yes         -                       yes
Keep this even-handed. The SDK path is legitimate for prototypes. Hosted products are fine
if you accept the lock-in. Colosseum occupies the "self-hosted control plane" slot that's
otherwise empty.
Closing
Open source — all the way down
MIT licensed.
Single Go binary
Embedded UI
Bring your own provider
Bring your own tools
Ship in 5 minutes
This is the call to action for builders in the room. No SaaS signup, no waitlist. Clone,
build, run.
Closing
Thank you
Colosseum is MIT-licensed and ready to run.
Questions? Catch me after — or open an issue.
Wrap. Invite questions. If Q&A is scheduled, transition explicitly. Thank the
Secure360 program committee.
s360-26.0x509.com
Appendix
Reference slides · not presented
Act 2 · Colosseum
LLM for synthesis. Deterministic code for guarantees.
Let the model do what only it can do — reason, summarize, suggest.
Keep enforcement, gating, and audit in code you can read.
Every control in Colosseum is proactive: defined, versioned, and enforced
before the agent takes a step, not inferred from its exhaust afterwards.
This is the design rule for the whole project. We do not ask the model to enforce its own
allowlist. We do not ask a second model to watch the first one and flag misbehavior. Those
are deterministic properties; they are enforced by code that runs around the model, before
the tool call leaves the process. That's what makes this architecture rather than automation.
Act 2 · Colosseum
Agent profiles — scoped identity
name: support-triage-bot
model: claude-sonnet-4-6
provider_config: anthropic-prod
system_prompt: |
  You are a support triage agent. You may read Zendesk and
  the internal KB. You may not write to either.
allowed_tools: [web.fetch, files.read, json.parse]
credential_vault: zendesk-readonly
planning_mode: required
output_contract: schemas/triage-v1.json
One YAML file. One reviewable artifact. Version-controlled like any other deploy.
Counter to "credential sprawl per script." Every agent has a known identity. When you look
at a run, you know exactly which agent ran, on which config, with which tools and
credentials.
Act 2 · Colosseum
Tool allowlist — least privilege, per agent
shell.exec
files.read · files.write
web.fetch · browser.navigate
http.request — with URL allowlist
subagent.invoke — explicit agent-calls-agent
Counter to tool overreach. If a tool is not in the allowlist, it is not callable: enforced in the step loop, not in the prompt.
The allowlist is checked on every tool call attempt. Denied calls are recorded as events
so you can see what the agent tried to do, not just what it succeeded at.
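A rough sketch of that check, assuming a hypothetical `Gate` that sits in the step loop and appends an event for every attempt. The type names and the event kinds `tool.allowed` and `tool.denied` are invented for this example.

```go
package main

import "fmt"

// Event is one entry in the run's event log (illustrative shape).
type Event struct {
	Kind string // "tool.allowed" or "tool.denied", names assumed for this sketch
	Tool string
}

// Agent carries the per-agent allowlist.
type Agent struct {
	Name    string
	Allowed map[string]bool
}

// Gate checks each attempted tool call and records the outcome.
type Gate struct {
	Events []Event
}

// Check reports whether the call may proceed. Denials are logged, not dropped.
func (g *Gate) Check(a Agent, tool string) bool {
	if a.Allowed[tool] {
		g.Events = append(g.Events, Event{Kind: "tool.allowed", Tool: tool})
		return true
	}
	g.Events = append(g.Events, Event{Kind: "tool.denied", Tool: tool})
	return false
}

func main() {
	bot := Agent{
		Name:    "support-triage-bot",
		Allowed: map[string]bool{"web.fetch": true, "files.read": true, "json.parse": true},
	}
	g := &Gate{}
	fmt.Println(g.Check(bot, "web.fetch"))
	fmt.Println(g.Check(bot, "shell.exec"))
	fmt.Println(g.Events)
}
```

The model never sees this code path, which is the point: a prompt cannot talk its way past a map lookup.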
Act 2 · Colosseum
Credential vaults — scoped secrets, bound per agent
vaults:
  - name: zendesk-readonly
    entries:
      ZENDESK_TOKEN: ${from: doppler, key: ZD_READ_ONLY}
  - name: prod-deploy
    entries:
      AWS_SECRET_ACCESS_KEY: ${from: vault, path: secret/aws/prod}
agents:
  - name: support-triage-bot
    credential_vault: zendesk-readonly   # can't touch prod-deploy
Injected into tool calls at the last possible moment. Redacted from logs and UI by default.
Counter to env-var sprawl and audit-trail gaps. The binding is auditable; the values are
not plaintext at rest; rotation happens in one place.
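The binding and the redaction can be sketched in a few lines. This is an illustration under assumptions: `Vault`, `Bindings`, `Resolve`, and `Redact` are hypothetical names, the secret values are placeholders, and a real implementation would fetch values from the backing secret store at call time rather than hold them in a map.

```go
package main

import (
	"fmt"
	"strings"
)

// Vault holds named secrets. In this sketch the values are already resolved.
type Vault struct {
	Name    string
	Entries map[string]string
}

// Bindings maps an agent name to the single vault it may read.
type Bindings map[string]string

// Resolve returns a secret only if the agent is bound to the vault that holds it.
func Resolve(b Bindings, vaults map[string]Vault, agent, key string) (string, bool) {
	v, ok := vaults[b[agent]]
	if !ok {
		return "", false
	}
	s, ok := v.Entries[key]
	return s, ok
}

// Redact replaces every known secret value in a log line before it is stored.
func Redact(line string, vaults map[string]Vault) string {
	for _, v := range vaults {
		for _, s := range v.Entries {
			if s != "" {
				line = strings.ReplaceAll(line, s, "[REDACTED]")
			}
		}
	}
	return line
}

func main() {
	vaults := map[string]Vault{
		"zendesk-readonly": {Name: "zendesk-readonly", Entries: map[string]string{"ZENDESK_TOKEN": "zd-placeholder"}},
		"prod-deploy":      {Name: "prod-deploy", Entries: map[string]string{"AWS_SECRET_ACCESS_KEY": "aws-placeholder"}},
	}
	b := Bindings{"support-triage-bot": "zendesk-readonly"}

	_, ok := Resolve(b, vaults, "support-triage-bot", "AWS_SECRET_ACCESS_KEY")
	fmt.Println(ok)
	fmt.Println(Redact("GET /tickets Authorization: Bearer zd-placeholder", vaults))
}
```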
Act 2 · Colosseum
Environments — Docker-first sandboxes
Session lifetime bounded — no zombie containers
Counter to "It ran rm -rf on my laptop." Blast radius is whatever the container can reach, nothing more.
The shell tool is the sharpest knife in the drawer. Environments are how you keep it in a
drawer. You can still run host-mode for trusted internal agents, but the default is
sandboxed.
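For a sense of what "whatever the container can reach" means in practice, the sketch below builds a `docker run` argument list that removes the container on exit, disables networking, mounts the root filesystem read-only, and exposes a single work directory. `DockerArgs` is a hypothetical helper written for this example, not Colosseum's implementation, and it does not cover the session lifetime bound mentioned on the slide.

```go
package main

import (
	"fmt"
	"strings"
)

// DockerArgs returns arguments for `docker run` that confine a shell command.
// The only host path the container can see is hostDir, mounted at /work.
func DockerArgs(image, hostDir, cmd string) []string {
	return []string{
		"run", "--rm", // remove the container when the command exits
		"--network", "none", // no network access from inside the sandbox
		"--read-only", // root filesystem is read-only; /work stays writable
		"--volume", hostDir + ":/work",
		"--workdir", "/work",
		image, "sh", "-c", cmd,
	}
}

func main() {
	args := DockerArgs("alpine:3", "/tmp/run-123", "ls -la")
	// These args could be passed to exec.Command("docker", args...).
	fmt.Println("docker " + strings.Join(args, " "))
}
```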
Act 2 · Colosseum
Output contracts — "looks good" isn't good enough
Contract violations are a terminal error, not a warning
Counter to "The answer looked right, but it was wrong." The contract turns a qualitative check into a deterministic one.
This is what lets you call an agent from a normal piece of code and actually trust the
return value. Without a contract you're doing string parsing on hallucinated JSON.
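A simplified sketch of the idea using only the Go standard library. The `Triage` struct and its fields are a made-up stand-in for `schemas/triage-v1.json`, whose contents these slides do not show, and strict decoding plus hand-written checks is a swapped-in substitute for real JSON Schema validation.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// Triage is a hypothetical stand-in for the triage-v1 contract.
type Triage struct {
	Priority string `json:"priority"`
	Summary  string `json:"summary"`
}

// ErrContract is terminal: the caller never receives a half-valid value.
var ErrContract = errors.New("output contract violated")

// Validate decodes strictly and enforces required fields and enum values.
func Validate(raw []byte) (Triage, error) {
	var t Triage
	dec := json.NewDecoder(bytes.NewReader(raw))
	dec.DisallowUnknownFields() // fields outside the contract are errors
	if err := dec.Decode(&t); err != nil {
		return Triage{}, fmt.Errorf("%w: %v", ErrContract, err)
	}
	switch t.Priority {
	case "low", "normal", "urgent":
	default:
		return Triage{}, fmt.Errorf("%w: priority %q not allowed", ErrContract, t.Priority)
	}
	if t.Summary == "" {
		return Triage{}, fmt.Errorf("%w: summary is required", ErrContract)
	}
	return t, nil
}

func main() {
	_, err := Validate([]byte(`{"priority":"urgent","summary":"Login fails"}`))
	fmt.Println(err == nil)
	_, err = Validate([]byte(`{"priority":"kinda high","summary":"Login fails"}`))
	fmt.Println(errors.Is(err, ErrContract))
}
```

Returning the zero value alongside the error keeps callers from accidentally using a partially decoded result.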
Act 2 · Colosseum
Planning mode — force a reviewable plan
Three modes, per agent:
The plan is stored with the run. You can replay the run and show that the plan was followed.
Counter to opaque autonomy. For high-stakes agents — ones that touch prod or money — you
want "required." The plan is both a safety control and a debugging artifact.
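"The plan was followed" can be turned into a check once both the plan and the recorded tool calls are available. The sketch below assumes a plan is an ordered list of tool names and treats a run as compliant when its calls appear in plan order, with skipped steps tolerated. Both the representation and that definition are assumptions for illustration; `FirstDeviation` is a hypothetical helper.

```go
package main

import "fmt"

// FirstDeviation returns the index of the first recorded call that does not
// fit the plan in order, or -1 if every call follows the plan.
func FirstDeviation(plan, calls []string) int {
	i := 0 // next plan step that is still available
	for n, c := range calls {
		// Skip plan steps the agent chose not to execute.
		for i < len(plan) && plan[i] != c {
			i++
		}
		if i == len(plan) {
			return n // call is unplanned or out of order
		}
		i++
	}
	return -1
}

func main() {
	plan := []string{"web.fetch", "files.read", "json.parse"}
	fmt.Println(FirstDeviation(plan, []string{"web.fetch", "json.parse"}))
	fmt.Println(FirstDeviation(plan, []string{"web.fetch", "shell.exec"}))
}
```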
Closing
What's next
scale
multi-tenant
policy
sandboxes
integrations
tools
Brief; don't linger. Make clear these are directions, not dated commitments. Point folks
at the GitHub roadmap for specifics.