For AI labs & platforms

OAuth solved login. EMILIA solves accountability for AI agents.

Your model can reason. It can plan. It can execute. The moment it acts in the real world, every team hits the same wall — and it’s not a capability problem:

“Who approved this exact action?”

The shared wall

Every agent platform eventually has to answer for what its agent did — to a board, a regulator, an insurer, a court. “The model did it” is not an answer. And when prompt injection turns a helpful assistant into a financial weapon, the headline names you, not the user.

OpenAI Operators, Anthropic Computer Use, Google Agents, Microsoft Copilot Actions, xAI, Visa Agent Commerce

All shipping autonomous action. All hitting the same wall.

The answer — four concepts

A cryptographically provable record of who owns each decision.

Accountable Signoff

A named human cryptographically assumes responsibility for the exact action — not a role, not a token, a person. This is the answer to "who owns this decision?"

Trust Receipt

A signed, offline-verifiable record of the decision: action, policy, approver, outcome. Anyone can verify it with no account and no call home (Ed25519 + Merkle).
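To make "offline-verifiable" concrete, here is a minimal sketch using Node's built-in Ed25519 support. The receipt field names, the example values, and the canonicalization step are assumptions made for illustration, not EMILIA's actual wire format, and the Merkle part is not shown.

```javascript
import { generateKeyPairSync, sign, verify } from "node:crypto";

// Hypothetical receipt shape, modeled on the four fields named above.
const receipt = {
  action: "wire_transfer:USD:75000",
  policy: "treasury-policy-v7",
  approver: "jane.doe@example.com",
  outcome: "approved",
};

// Serialize with a fixed key order so signer and verifier see identical bytes.
function canonicalBytes(obj) {
  const pairs = Object.keys(obj).sort().map((key) => [key, obj[key]]);
  return Buffer.from(JSON.stringify(pairs), "utf8");
}

// In practice the approver would hold the private key; here we generate one.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");

// Ed25519 takes no separate digest algorithm, so the first argument is null.
const signature = sign(null, canonicalBytes(receipt), privateKey);

// Verification needs only the receipt, the signature, and the public key.
function verifyReceipt(candidate, sig, pub) {
  return verify(null, canonicalBytes(candidate), pub, sig);
}

const ok = verifyReceipt(receipt, signature, publicKey);
const tampered = verifyReceipt({ ...receipt, outcome: "denied" }, signature, publicKey);
console.log({ ok, tampered });
```

Verification uses no network call and no account: changing any field after signing makes the check fail.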

Policy Hash

The exact policy version that authorized the action, pinned into the receipt. The rules that applied are provable after the fact, not reconstructed.
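One way to picture the pinning, as a sketch only: hash the policy text and store the digest in the receipt. SHA-256, the `sha256:` prefix, and the field names are assumptions for illustration. The source does not state which hash EMILIA uses.

```javascript
import { createHash } from "node:crypto";

// Digest of the exact policy text; any edit to the policy changes the hash.
function policyHash(policyText) {
  return "sha256:" + createHash("sha256").update(policyText, "utf8").digest("hex");
}

// Two hypothetical policy versions that differ only in the threshold.
const policyV1 = "require human signoff for releases >= 50000 USD";
const policyV2 = "require human signoff for releases >= 100000 USD";

// The receipt pins the version that was in force when the action was approved.
const pinned = { action: "release_funds", policyHash: policyHash(policyV1) };

// Later, an auditor checks a candidate policy text against the pinned hash.
function policyMatches(record, policyText) {
  return record.policyHash === policyHash(policyText);
}

console.log(policyMatches(pinned, policyV1), policyMatches(pinned, policyV2));
```

Because the hash is fixed at decision time, an auditor can tell whether a policy document presented later is the one that actually applied.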

Authority Chain

The delegation path — who was allowed to authorize whom — bound to the action. Permission isn’t assumed; it’s carried and checked.
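A delegation check could look roughly like the sketch below. The link shape (`from`, `to`, `maxUsd`), the role names, and the rule that a delegate never receives a higher limit than the delegator are all assumptions for illustration, not EMILIA's actual data model.

```javascript
// Each link says: `from` allowed `to` to authorize actions up to `maxUsd`.
const chain = [
  { from: "board", to: "cfo", maxUsd: 1_000_000 },
  { from: "cfo", to: "treasury-lead", maxUsd: 100_000 },
];

// Walk the chain from the trusted root and confirm it ends at the approver
// with enough authority for this amount.
function checkChain(links, root, approver, amountUsd) {
  if (links.length === 0) return false;
  let holder = root;
  let limit = Infinity;
  for (const link of links) {
    if (link.from !== holder) return false; // chain is broken or out of order
    if (link.maxUsd > limit) return false;  // cannot delegate more than you hold
    holder = link.to;
    limit = link.maxUsd;
  }
  return holder === approver && amountUsd <= limit;
}

console.log(checkChain(chain, "board", "treasury-lead", 75_000));
console.log(checkChain(chain, "board", "treasury-lead", 250_000));
```

The check fails closed: a missing link, a mismatched approver, or an amount over the delegated limit all return false.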

Four concepts. Nothing else. Formally verified (26 TLA+ theorems), Apache-2.0, no vendor lock-in.

Try it in your model today

EMILIA ships as an MCP server, so any MCP-capable client — Claude, GPT, Gemini, Cursor, Windsurf — can experiment with accountable actions in one line. No partnership meeting required.

npx -y @emilia-protocol/mcp-server

Or guard an existing OpenAI-compatible agent — OpenAI, xAI Grok, Together — so every irreversible tool call routes through EMILIA before it runs:

npm install @emilia-protocol/openai-guard


Real-world proof — reproducible

Autonomous treasury agent, crash-tested.

We pointed four frontier models — OpenAI’s gpt-4o-mini, xAI’s grok-4, Anthropic’s Claude, and Google’s Gemini — at six high-stakes requests (large wires, a “CFO says skip approval” injection, a payout-bank change) as an autonomous treasury agent, and scored the same model output with and without EMILIA:

Unauthorized high-stakes actions executed (of 6): model alone → with EMILIA

gpt-4o-mini: 5/6 → 0/6

grok-4: 3/6 → 0/6

claude-sonnet-4.5: 4/6 → 0/6

gemini-2.5-flash: 5/6 → 0/6

False friction (safe actions EMILIA wrongly blocked): 0/6 (0%)

The EMILIA result is deterministic — the verified engine gates every ≥$50k release and bank-destination change, every run.
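The reason the result does not vary between runs can be illustrated with a pure function: the decision depends only on the requested action and whether a signoff is present, never on model output. The sketch below assumes a hypothetical tool-call shape and signoff object. It is not the verified engine itself.

```javascript
// Threshold taken from the text above; everything else here is hypothetical.
const THRESHOLD_USD = 50_000;

// Decide whether a tool call may run. No randomness and no model in the loop,
// so the same inputs always produce the same decision.
function gate(toolCall, signoff) {
  const needsSignoff =
    (toolCall.type === "release_funds" && toolCall.amountUsd >= THRESHOLD_USD) ||
    toolCall.type === "change_payout_bank";
  if (!needsSignoff) return "allow";
  return signoff && signoff.approver ? "allow" : "block";
}

// A large wire with no signoff is blocked, however persuasive the prompt was.
console.log(gate({ type: "release_funds", amountUsd: 75_000 }, null));
// A small release passes without friction.
console.log(gate({ type: "release_funds", amountUsd: 1_200 }, null));
// The same large wire passes once a named approver has signed off.
console.log(gate({ type: "release_funds", amountUsd: 75_000 }, { approver: "jane.doe@example.com" }));
```

A "CFO says skip approval" instruction inside the prompt never reaches this function, which is why an injection cannot change the outcome in this sketch.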

Don’t take our word for it — the harness is open. Reproduce it, or point it at your own model:

BENCH_API_KEY=sk-... node bench/run.mjs