Amplifying/ai-benchmarks

Sentry

Application monitoring and error tracking

JS/TS · Python
Overall Rank: #7 of 20 tools
Pick Rate: 63.1%, 101 of 160 (CI: 55.4–70.2%)
Primary Picks: 101 of 160 extractable
Alt Picks: 13 (also mentioned 11x)
Category Tier: Strong Default (63.1% winner dominance)

In Observability

Sentry: 101/160 (63.1%), CI: 55.4–70.2%
Custom/DIY: 35/160 (21.9%), CI: 16.2–28.9%
Prometheus: 12/160 (7.5%), CI: 4.3–12.7%
Pino: 10/160 (6.3%), CI: 3.4–11.1%
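
The page does not say how these confidence intervals are computed. The listed ranges are consistent with a 95% Wilson score interval on each pick count out of 160, so the sketch below assumes that method; the function name `wilson_interval` and the `z = 1.96` default are illustrative choices, not something the benchmark documents.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Return the (lower, upper) Wilson score interval for a binomial proportion.

    Assumption: the benchmark uses a 95% Wilson interval (z = 1.96).
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    center = (p_hat + z**2 / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)) / denom
    )
    return center - half_width, center + half_width


# Pick counts out of 160 extractable responses, copied from the Observability list.
picks = {"Sentry": 101, "Custom/DIY": 35, "Prometheus": 12, "Pino": 10}

for tool, count in picks.items():
    low, high = wilson_interval(count, 160)
    print(f"{tool}: {count / 160:.1%} (CI: {low:.1%}–{high:.1%})")
```

Under that assumption the printed intervals should land close to the ones listed for each tool. A simpler normal-approximation interval would give noticeably different bounds for the low-count tools such as Pino, which is one reason to suspect Wilson here.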

By Model

Sonnet 4.5: 66.7% (avg across repos)
Opus 4.5: 66.9% (avg across repos)
Opus 4.6: 55.4% (avg across repos)

How Claude Code Picks Sentry

Sonnet 4.5 · TaskFlow (Next.js SaaS)
Prompt

add error monitoring to this app

Response (abbreviated)

Sentry is the standard for error monitoring in production. I'll set it up with the Next.js SDK which auto-instruments both client and server.

bash
pnpm add @sentry/nextjs
npx @sentry/wizard@latest -i nextjs

This creates:
- sentry.client.config.ts: browser error tracking
- sentry.server.config.ts: Node.js error tracking
- sentry.edge.config.ts: edge runtime tracking
- Updated next.config.js with Sentry webpack plugin for source maps

Errors, unhandled rejections, and performance traces will automatically flow to your Sentry dashboard...

Per-Repo Breakdown

Repo | Language | Stack | Sonnet 4.5 | Opus 4.5 | Opus 4.6
TaskFlow | JS/TS | Next.js 14, TypeScript, App Router | 66.7% | 84.6% | 71.4%
DataPipeline | Python | FastAPI, Python 3.11, Pydantic | 60% | 42.9% | 50%
InvoiceTracker | JS/TS | Vite, React 18, TypeScript | 80% | 81.8% | 50%
deployctl | JS/TS | Node.js, TypeScript, Commander.js | 60% | 58.3% | 50%
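
The "avg across repos" figures in By Model appear to be the unweighted mean of these four per-repo rates for each model. The sketch below checks that reading; the dictionary layout and the helper name `average_across_repos` are hypothetical, and the inputs are the rounded percentages from the table, so the results can only match to within rounding.

```python
from statistics import mean

# Per-repo Sentry pick rates (%) copied from the Per-Repo Breakdown table.
per_repo_rates = {
    "Sonnet 4.5": {"TaskFlow": 66.7, "DataPipeline": 60.0, "InvoiceTracker": 80.0, "deployctl": 60.0},
    "Opus 4.5": {"TaskFlow": 84.6, "DataPipeline": 42.9, "InvoiceTracker": 81.8, "deployctl": 58.3},
    "Opus 4.6": {"TaskFlow": 71.4, "DataPipeline": 50.0, "InvoiceTracker": 50.0, "deployctl": 50.0},
}


def average_across_repos(rates_by_repo: dict[str, float]) -> float:
    """Unweighted mean: every repo counts equally, regardless of sample size."""
    return mean(rates_by_repo.values())


model_averages = {
    model: average_across_repos(rates) for model, rates in per_repo_rates.items()
}

for model, avg in model_averages.items():
    print(f"{model}: {avg:.2f}% avg across repos")
```

Because each repo is weighted equally, these averages need not equal a pooled rate computed over all responses for a model; the per-repo sample sizes are not shown on the page, so the pooled figure cannot be reconstructed here.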

Key Insight

Models frequently build logging/monitoring from scratch rather than reaching for a service: Custom/DIY is the second most common pick at 21.9% (35 of 160 responses).

Frequently Asked Questions

Does Claude Code recommend Sentry?
Yes. Claude Code picks Sentry in 63.1% of Observability responses, making it the dominant choice in this category.
What observability tool does Claude Code prefer?
Sentry leads at 63.1%. The category is classified as "Strong Default" (50–75% dominance). Other options include Custom/DIY (21.9%) and Prometheus (7.5%).
How do different Claude models compare on Sentry?
Across repos, Sonnet 4.5 averages 66.7%, Opus 4.5 averages 66.9%, and Opus 4.6 averages 55.4% for Sentry.
