AI benchmarking research studio

Amplifying measures the
subjective choices AI makes

AI systems make opinionated decisions every time they run — what tools to install, what products to recommend, what to build from scratch. We benchmark these choices systematically so the patterns become visible.

See all research

Research

feb-2026Featured

What Claude Code Actually Chooses

We pointed Claude Code at real repos 2,430 times and watched what it chose. Custom/DIY is the #1 recommendation in 12 of 20 categories.

2,430responses across 3 models, 4 repos, 20 categories

View study

may-2025

Why AI Product Recommendations Keep Changing

We asked Google AI Mode and ChatGPT 792 product questions. The results reveal 47% cross-platform disagreement, Shopping Graph bias, and significant output drift.

792product questions across 2 platforms

View study

in-progress

Upcoming Benchmarks

Same methodology — open-ended prompts, real repos, multiple models.

Security Defaults

Soon

Does Claude Code build secure apps by default? We audit output for OWASP Top 10 vulnerabilities, input validation, secrets management, rate limiting, CORS, and CSP headers.

Dependency Footprint

Soon

For the same task, how many packages does each model install? Total node_modules size? Pinned vs floating? Maps the dependency sprawl of AI-generated apps.

Get notified when new benchmarks drop.

Explore the research

Systematic analysis of how AI systems make subjective decisions — from developer tool choices to product recommendations.

All Research Claude Code Picks View Raw Data

Amplifying measures the subjective choices AI makes