Amplifying/ai-benchmarks

Playwright

End-to-end testing for modern web apps

JS/TS
Overall Rank: Not in top 20
Pick Rate: 10.5% (18 of 171; CI: 6.8–16%)
Primary Picks: 18 of 171 extractable
Alt Picks: 21 (also mentioned 12x)
Category Tier: Strong Default (59.1% winner dominance)
Vitest      101/171 (59.1%)  CI: 51.6–66.2%
pytest      44/171 (25.7%)   CI: 19.8–32.8%
Playwright  18/171 (10.5%)   CI: 6.8–16%
Jest        7/171 (4.1%)     CI: 2–8.2%
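
The page does not say how the confidence intervals are computed. The published ranges are, however, consistent with a 95% Wilson score interval on each pick count out of 171, so the following is a minimal sketch under that assumption. The function name `wilson_interval` and the default `z = 1.96` are illustrative choices, not something stated by the source.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Return the (lower, upper) Wilson score interval for a proportion.

    Assumption: a 95% interval (z = 1.96); the source does not name its method.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half_width, center + half_width


# Primary pick counts out of 171 extractable responses, copied from the list above.
picks = {"Vitest": 101, "pytest": 44, "Playwright": 18, "Jest": 7}

for tool, count in picks.items():
    low, high = wilson_interval(count, 171)
    print(f"{tool}: {count / 171:.1%} (CI: {low:.1%} to {high:.1%})")
```

Under this assumption the computed bounds agree with the listed intervals to one decimal place, which suggests (but does not prove) that the page uses the same method.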

By Model

Sonnet 4.5: 25% avg across repos
Opus 4.5: 18.4% avg across repos
Opus 4.6: 20% avg across repos

Per-Repo Breakdown

Repo            Stack                                       Sonnet 4.5  Opus 4.5  Opus 4.6
TaskFlow        JS/TS (Next.js 14, TypeScript, App Router)  28.6%       20%       20%
InvoiceTracker  JS/TS (Vite, React 18, TypeScript)          21.4%       16.7%     20%
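
The "avg across repos" figures in the By Model section appear to be the unweighted mean of each model's per-repo rates. A small sketch of that arithmetic follows, assuming equal weight per repo (the page does not confirm the weighting). The names `per_repo` and `average_across_repos` are hypothetical.

```python
from statistics import mean

# Per-repo Playwright pick rates in percent, copied from the breakdown above.
per_repo = {
    "Sonnet 4.5": {"TaskFlow": 28.6, "InvoiceTracker": 21.4},
    "Opus 4.5": {"TaskFlow": 20.0, "InvoiceTracker": 16.7},
    "Opus 4.6": {"TaskFlow": 20.0, "InvoiceTracker": 20.0},
}


def average_across_repos(rates: dict[str, float]) -> float:
    """Unweighted mean of per-repo rates (assumption: every repo counts equally)."""
    return mean(list(rates.values()))


model_averages = {model: average_across_repos(rates) for model, rates in per_repo.items()}

for model, avg in model_averages.items():
    print(f"{model}: {avg:.2f}% avg across repos")
```

Because the per-repo rates are already rounded, the recomputed means can differ from the published averages in the last digit; for Opus 4.5 the mean of 20 and 16.7 is 18.35, which the page reports as 18.4%.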

Key Insight

Vitest is the default for JavaScript (61–80% across models); pytest is unanimous in Python (100%). Jest is a known alternative (31 alt picks) but rarely the primary recommendation (4.1%).

Frequently Asked Questions

Does Claude Code recommend Playwright?
Playwright is the primary pick in 10.5% of Testing responses (18 of 171). The category leader is Vitest at 59.1%.
What testing tool does Claude Code prefer?
Vitest leads at 59.1%. The category is classified as "Strong Default" (50–75% dominance). Other options include pytest (25.7%) and Playwright (10.5%).
How do different Claude models compare on Playwright?
Across repos, Sonnet 4.5 averages 25%, Opus 4.5 averages 18.4%, and Opus 4.6 averages 20% for Playwright.
