Agent Performance Intelligence
Deterministic comparison of AI coding agents. No vibes — just math. Every metric is grounded in Intent-to-Outcome linkage.
Claude Code is the top-performing agent with 89% first-pass success across 54 sessions, contributing 55% of successful executions.
Cursor generates 44% of risk events but only 25% of successful executions. Risk-to-productivity ratio: 1.8x.
Claude Code leads knowledge growth with 42% of promoted patterns — building reusable institutional memory.
Claude Code has the lowest rework rate at 7%, indicating the most stable first-pass behavior.
Agent Scorecard
| Agent | First-Pass Success | Rework | Risk | Productivity | Patterns | Sessions | Attribution |
|---|---|---|---|---|---|---|---|
Claude Code TOP | 89% | 7% | 32% | 55% | 42% | 54 | DETERMINISTIC |
Cursor RISK | 76% | 14% | 44% | 25% | 33% | 29 | DETERMINISTIC |
Codex | 80% | 20% | 24% | 5% | 25% | 5 | PARTIAL |
Proof of Capability
Every metric above is tagged with its attribution quality. This isn't AI guesswork — it's auditable fact derived from Intent-to-Outcome linkage.
Agent explicitly identified. Full provenance chain from task to outcome.
Agent inferred from environment signals. Confidence above 50%.
Attribution quality below threshold. Data shown but flagged as low-confidence.