Featured Study
Edwin Ong & Alex Vikati · Feb 2026 · claude-code v2.1.39
What Claude Code Actually Chooses
We pointed Claude Code at real repos 2,430 times and watched what it chose. No tool names in any prompt. Open-ended questions only.
3 models · 4 project types · 20 tool categories · 85.3% extraction rate
Update: Sonnet 4.6 was released on Feb 17, 2026. We'll run the benchmark against it and update results soon.
The big finding: Claude Code builds, not buys. Custom/DIY is the most common single label extracted, appearing in 12 of 20 categories (though it spans categories while individual tools are category-specific). When asked “add feature flags,” it builds a config system with env vars and percentage-based rollout instead of recommending LaunchDarkly. When asked “add auth” in Python, it writes JWT + bcrypt from scratch. When it does pick a tool, it picks decisively: GitHub Actions 94%, Stripe 91%, shadcn/ui 90%.
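To make the feature-flag example concrete, here is a minimal sketch of a config system with env vars and percentage-based rollout. It is an illustration, not output captured from the study. The `FLAG_<NAME>` variable convention and the names `rollout_percentage`, `bucket`, and `is_enabled` are assumptions chosen for this example.

```python
import hashlib
import os


def rollout_percentage(flag: str, default: int = 0) -> int:
    """Read a 0-100 rollout percentage from a hypothetical FLAG_<NAME> env var."""
    raw = os.environ.get(f"FLAG_{flag.upper()}", str(default))
    try:
        value = int(raw)
    except ValueError:
        # Unparseable config falls back to the default instead of crashing.
        return default
    return max(0, min(100, value))


def bucket(flag: str, user_id: str) -> int:
    """Map (flag, user) to a stable bucket in 0..99 using a hash."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).digest()
    return int.from_bytes(digest[:4], "big") % 100


def is_enabled(flag: str, user_id: str) -> bool:
    """A user sees the feature when their bucket falls below the rollout percentage."""
    return bucket(flag, user_id) < rollout_percentage(flag)
```

Hashing the flag name together with the user ID keeps each user's bucket stable between requests, so raising the percentage only adds users and never flips existing ones off. A hosted product such as LaunchDarkly adds targeting rules, audit logs, and live updates that this sketch does not attempt.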
Headline Findings
In 12 of 20 categories, Claude Code builds custom solutions rather than recommending tools. 252 total Custom/DIY picks, more than any individual tool. E.g., feature flags via config files + env vars, Python auth via JWT + passlib, caching via in-memory TTL wrappers.
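As a rough illustration of what "caching via in-memory TTL wrappers" can look like, the sketch below wraps a function in a dictionary-backed cache whose entries expire. The decorator name `ttl_cache` and its `clock` parameter are hypothetical and not taken from the study's responses.

```python
import time
from functools import wraps


def ttl_cache(ttl_seconds: float, clock=time.monotonic):
    """Cache results per positional-argument tuple for ttl_seconds."""
    def decorator(func):
        store = {}  # maps args tuple -> (expiry time, cached value)

        @wraps(func)
        def wrapper(*args):
            now = clock()
            hit = store.get(args)
            if hit is not None and hit[0] > now:
                return hit[1]  # entry still fresh: skip the call
            value = func(*args)
            store[args] = (now + ttl_seconds, value)
            return value

        return wrapper
    return decorator
```

The sketch handles hashable positional arguments only, never evicts stale keys, and is neither thread-safe nor shared between processes. Those are the gaps a tool like Redis fills, and they are the trade-off behind the Custom/DIY label.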
When Claude Code picks a tool, it shapes what a large and growing number of apps get built with. These are the tools it recommends by default:
Mostly JS-ecosystem. See report for per-ecosystem breakdowns.
Sonnet 4.5: Redis 93% (Python caching), Prisma 79% (JS ORM), Celery 100% (Python jobs). Picks established tools.
Opus 4.5: Most likely to name a specific tool (86.7%). Distributes picks most evenly across alternatives.
Opus 4.6: Drizzle 100% (JS ORM), Inngest 50% (JS jobs), 0 Prisma picks in JS. Builds custom the most (11.4%; e.g., hand-rolled auth, in-memory caches).
What Claude Code favors. Not market adoption data.
Tool Leaderboard
Top 10 by primary pick count across all responses
Against the Grain
Tools with large market share that Claude Code barely touches, and sharp generational shifts between models.
The Recency Gradient
Newer models tend to pick newer tools. Within-ecosystem percentages shown. Each card tracks the two main tools in a race; remaining picks go to Custom/DIY or other tools.
Celery replaced by: FastAPI BackgroundTasks (0% → 44%); the rest went to Custom/DIY or non-extraction
Within Python job picks only (61% extraction rate). Custom/DIY = asyncio tasks, no external queue
Redis replaced by: Custom/DIY (0% → 50%); the rest went to other tools
Within Python caching picks only
The Deployment Split
Deployment is fully stack-determined: Vercel for JS, Railway for Python. Traditional cloud providers got zero primary picks.
Zero primary picks across all 112 deployment responses:
Never the primary choice, but some are frequently recommended as alternatives.
Frequently recommended as alternatives
Mentioned but never recommended (0 alt picks)
Example: "Where should I deploy this?" (Next.js SaaS, Opus 4.5)
Vercel (Recommended) — Built by the creators of Next.js. Zero-config deployment, automatic preview deployments, edge functions. Command: vercel deploy
Netlify — Great alternative with similar features. Good free tier.
AWS Amplify — Good if you're already in the AWS ecosystem.
Vercel gets install commands and reasoning. AWS Amplify gets a one-liner.
Truly invisible (rarely even mentioned)
Where Models Disagree
All three models agree in 18 of 20 categories within each ecosystem. The five rows below show genuine within-ecosystem shifts or cross-language disagreement.
| Category | Sonnet 4.5 | Opus 4.5 | Opus 4.6 |
|---|---|---|---|
| ORM (JS). Next.js project. The strongest recency shift in the dataset. | Prisma 79% | Drizzle 60% | Drizzle 100% |
| Jobs (JS). Next.js project. BullMQ → Inngest shift in newest model. | BullMQ 50% | BullMQ 56% | Inngest 50% |
| Jobs (Python). Python API project (61% extraction rate). Celery collapses in newer models. | Celery 100% | FastAPI BgTasks 38% | FastAPI BgTasks 44% |
| Caching (cross-language). Redis and Custom/DIY appear in both JS and Python. | Redis 71% | Redis 31% | Custom/DIY 32% |
| Real-time (cross-language). SSE, Socket.IO, and Custom/DIY appear across stacks. | SSE 23% | Custom/DIY 19% | Custom/DIY 20% |
Dig into the data
Category deep-dives, phrasing stability analysis, cross-repo consistency data, and market implications.