Your model can reason. It can plan. It can execute. The moment it acts in the real world, every team hits the same wall — and it’s not a capability problem:
“Who approved this exact action?”
Every agent platform eventually has to answer for what its agent did — to a board, a regulator, an insurer, a court. “The model did it” is not an answer. And when prompt injection turns a helpful assistant into a financial weapon, the headline names you, not the user.
All shipping autonomous action. All hitting the same wall.
A named human cryptographically assumes responsibility for the exact action — not a role, not a token, a person. This is the answer to "who owns this decision?"
A signed, offline-verifiable record of the decision: action, policy, approver, outcome. Anyone verifies it with no account and no call home (Ed25519 + Merkle).
The exact policy version that authorized the action, pinned into the receipt. The rules that applied are provable after the fact, not reconstructed.
The delegation path — who was allowed to authorize whom — bound to the action. Permission isn’t assumed; it’s carried and checked.
Four concepts. Nothing else. Formally verified (26 TLA+ theorems), Apache-2.0, no vendor lock-in.
EMILIA ships as an MCP server, so any MCP-capable client — Claude, GPT, Gemini, Cursor, Windsurf — can experiment with accountable actions in one line. No partnership meeting required.
npx -y @emilia-protocol/mcp-server
Or guard an existing OpenAI-compatible agent — OpenAI, xAI Grok, Together — so every irreversible tool call routes through EMILIA before it runs:
npm install @emilia-protocol/openai-guard
We pointed four frontier models — OpenAI’s gpt-4o-mini, xAI’s grok-4, Anthropic’s Claude, and Google’s Gemini — at six high-stakes requests (large wires, a “CFO says skip approval” injection, a payout-bank change) as an autonomous treasury agent, and scored the same model output with and without EMILIA:
The EMILIA result is deterministic — the verified engine gates every ≥$50k release and bank-destination change, every run.
Don’t take our word for it — the harness is open. Reproduce it, or point it at your own model:
BENCH_API_KEY=sk-... node bench/run.mjs