Amplifying/ai-benchmarks

Prometheus

Open-source monitoring and alerting toolkit

Python
Overall Rank: Not in top 20
Pick Rate: 7.5% (12 of 160; CI: 4.3–12.7%)
Primary Picks: 12 of 160 extractable
Alt Picks: 3 (also mentioned 2x)
Category Tier: Strong Default (63.1% winner dominance)

In Observability

Sentry: 101/160 (63.1%), CI: 55.4–70.2%
Custom/DIY: 35/160 (21.9%), CI: 16.2–28.9%
Prometheus: 12/160 (7.5%), CI: 4.3–12.7%
Pino: 10/160 (6.3%), CI: 3.4–11.1%
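
The page does not say how the confidence intervals are computed. As a minimal sketch, assuming they are 95% Wilson score intervals (an assumption, not something the source states), the function below recomputes them from the pick counts. The name wilson_interval and the default z = 1.96 are illustrative choices; with these inputs the results appear to agree with the listed intervals to one decimal place.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Return the (lower, upper) Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to a 95% interval; this is an assumption about
    how the page's CIs were produced, not a documented fact.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half_width, center + half_width


# Pick counts copied from the Observability comparison (160 extractable responses).
picks = {"Sentry": 101, "Custom/DIY": 35, "Prometheus": 12, "Pino": 10}

for tool, count in picks.items():
    low, high = wilson_interval(count, 160)
    print(f"{tool}: CI {low:.1%}–{high:.1%}")
```

The Wilson interval is a common choice for small counts such as 12 of 160 because, unlike the simple normal approximation, its lower bound cannot fall below zero.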

By Model

Sonnet 4.5: 40% (avg across repos)
Opus 4.5: 35.7% (avg across repos)
Opus 4.6: 0% (avg across repos)

Per-Repo Breakdown

Repo          Stack                                    Sonnet  Opus 4.5  Opus 4.6
DataPipeline  Python (FastAPI, Python 3.11, Pydantic)  40%     35.7%

Key Insight

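Models frequently build logging/monitoring from scratch rather than reaching for a service: Custom/DIY is the second most common Observability pick at 21.9% (35 of 160), ahead of Prometheus at 7.5% (12 of 160).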

Frequently Asked Questions

Does Claude Code recommend Prometheus?
Prometheus appears in 7.5% of Observability responses. The category leader is Sentry at 63.1%.

What observability tool does Claude Code prefer?
Sentry leads at 63.1%. The category is classified as "Strong Default" (50–75% dominance). Other options include Custom/DIY (21.9%) and Prometheus (7.5%).

How do different Claude models compare on Prometheus?
Across repos, Sonnet 4.5 averages 40%, Opus 4.5 averages 35.7%, and Opus 4.6 averages 0% for Prometheus.
