Coding agent intelligence
When AI coding agents build, what do they choose and why?
AI coding agents are the new distribution channel for dev tools. Amplifying runs Claude Code, Codex, and Cursor against real codebases and tracks what they choose, why they choose it, and how it shifts across models.
See all research
Published Research
The Tools OpenAI Agreed to Buy
OpenAI announced plans to acquire Astral (Ruff, uv). Both Claude Code and Codex agree: Astral tools capture 75% of all Python tooling picks.
What Codex Actually Chooses (vs Claude Code)
Same prompts, two flagship agents, different tool picks. Ownership-linked gaps, platform leans, and a universal build-it-yourself default.
What Claude Code Actually Chooses
We pointed Claude Code at real repos 2,430 times and watched what it chose. Custom/DIY is the #1 recommendation in 12 of 20 categories.
Why AI Product Recommendations Keep Changing
We asked Google AI Mode and ChatGPT 792 product questions. The results reveal 47% cross-platform disagreement, Shopping Graph bias, and significant output drift.
In progress
Upcoming Benchmarks
Same methodology — open-ended prompts, real repos, multiple models.
Security Defaults
Soon: Does Claude Code build secure apps by default? We audit output for OWASP Top 10 vulnerabilities, input validation, secrets management, rate limiting, CORS, and CSP headers.
Dependency Footprint
Soon: For the same task, how many packages does each model install? Total node_modules size? Pinned vs floating? Maps the dependency sprawl of AI-generated apps.
Get notified when new benchmarks drop.
Explore the research
Thousands of real agent decisions tracked across every major coding agent and model release. See which tools win by default.