Document Arena

View overall rankings across AI models in document analysis and long-content reasoning.

Jul 1, 2026

288,779 votes

32 models

Rank	Rank Spread	Model (Organization · License)	Score	Votes	Price (input / output)	Context
1	1-5	claude-opus-4-6-thinking Anthropic · Proprietary	1505±7	23,132	$5 / $25	1M
2	1-5	claude-opus-4-6 Anthropic · Proprietary	1505±6	35,514	$5 / $25	1M
3	1-6	claude-opus-4-7 Anthropic · Proprietary	1500±7	17,103	$5 / $25	1M
4	1-8	claude-opus-4-7-thinking Anthropic · Proprietary	1500±7	16,879	$5 / $25	1M
5	1-10	claude-fable-5 Anthropic · Proprietary	1497±12	2,545	$10 / $50	1M
6	4-11	claude-sonnet-4-6 Anthropic · Proprietary	1488±6	52,833	$3 / $15	1M
7	4-12	gpt-5.5-high OpenAI · Proprietary	1487±7	14,781	$5 / $30	1.1M
8	5-13	gpt-5.5 OpenAI · Proprietary	1481±7	15,246	$5 / $30	1.1M
9	3-14	gemini-3.5-flash Google · Proprietary	1481±16	1,333	$1.50 / $9	1M
10	6-14	claude-opus-4-8-thinking Anthropic · Proprietary	1477±8	6,376	$5 / $25	1M
11	5-16	claude-sonnet-5-thinking Anthropic · Proprietary	1476±17	1,267	$2 / $10	1M
12	7-14	gpt-5.4 OpenAI · Proprietary	1475±6	27,648	$2.50 / $15	1.1M
13	8-16	claude-opus-4-8 Anthropic · Proprietary	1468±9	6,117	$5 / $25	1M
14	9-18	claude-opus-4-5-20251101 Anthropic · Proprietary	1461±10	7,987	$5 / $25	200K
15	12-18	kimi-k2.6 Moonshot · Modified MIT	1454±8	10,427	$0.95 / $4	262.1K
16	14-21	claude-sonnet-4-5-20250929 Anthropic · Proprietary	1447±6	26,859	$3 / $15	200K
17	12-26	muse-spark Meta · Proprietary	1444±18	1,089	N/A	N/A
18	16-22	gemini-3.1-pro-preview Google · Proprietary	1441±5	42,482	$2 / $12	1M
19	14-25	qwen3.7-plus Alibaba · Proprietary	1439±13	2,051	$0.32 / $1.28	1M
20	16-24	minimax-m3 MiniMax · MiniMax Community License	1435±8	6,367	$0.60 / $2.40	N/A
21	16-26	gemini-3-pro Google · Proprietary	1433±9	10,752	$2 / $12	1M
22	17-27	kimi-k2.5-thinking Moonshot · Modified MIT	1429±7	18,646	$0.60 / $3	N/A
23	18-28	gemma-4-31b Google · Apache 2.0	1423±8	9,512	N/A	N/A
24	18-28	gemini-2.5-pro Google · Proprietary	1421±6	25,094	$1.25 / $10	1M
25	19-28	claude-haiku-4-5-20251001 Anthropic · Proprietary	1421±6	29,105	$1 / $5	200K
26	22-31	grok-4.20-beta-0309-reasoning xAI · Proprietary	1416±7	16,701	$2 / $6	2M
27	20-32	glm-5v-turbo Z.ai · Proprietary	1416±11	3,334	$1.20 / $4	202.8K
28	23-32	gemini-3-flash Google · Proprietary	1413±9	7,181	$0.50 / $3	1M
29	26-32	gpt-5.2-high OpenAI · Proprietary	1405±9	7,091	$1.75 / $14	400K
30	26-32	gpt-5.5-instant OpenAI · Proprietary	1403±8	8,536	$5 / $30	1.1M
31	27-32	gpt-5.2 OpenAI · Proprietary	1401±6	28,283	$1.75 / $14	400K
32	26-32	gpt-5.1 OpenAI · Proprietary	1401±9	8,247	$1.25 / $10	400K

Leaderboard Plots

Fraction of Model A Wins for All Non-tied A vs. B Battles
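The quantity named in this plot title can be illustrated with a minimal sketch. The record layout `(model_a, model_b, winner)`, the model names, and the function `pairwise_win_fraction` are all hypothetical; the page does not publish its data format or its exact computation.

```python
from collections import defaultdict

# Hypothetical battle records: (model_a, model_b, winner), where winner is
# "a", "b", or "tie". These rows are invented for illustration only.
battles = [
    ("model-x", "model-y", "a"),
    ("model-x", "model-y", "b"),
    ("model-x", "model-y", "a"),
    ("model-y", "model-x", "a"),
    ("model-x", "model-z", "tie"),
    ("model-x", "model-z", "a"),
]

def pairwise_win_fraction(records):
    """Return {(a, b): fraction of non-tied a-vs-b battles won by a}."""
    wins = defaultdict(int)    # wins[(a, b)] = battles a won against b
    totals = defaultdict(int)  # totals[(a, b)] = non-tied battles between a and b
    for model_a, model_b, winner in records:
        if winner == "tie":
            continue  # ties are excluded, as the plot title states
        won, lost = (model_a, model_b) if winner == "a" else (model_b, model_a)
        wins[(won, lost)] += 1
        totals[(won, lost)] += 1
        totals[(lost, won)] += 1
    return {pair: wins[pair] / n for pair, n in totals.items()}

fractions = pairwise_win_fraction(battles)
```

Each pair appears in both orders, so the two entries for a pair always sum to 1.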

Confidence Intervals on Model Strength (via Bootstrapping)
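A rough sketch of how bootstrapped intervals on model strength could be obtained, assuming a Bradley-Terry model fitted by simple iterative updates and a percentile bootstrap over battles. The strength model, the scale (strengths normalized to sum to 1 rather than the leaderboard's score scale), the function names, and the data are all assumptions; the page does not specify its method beyond "bootstrapping".

```python
import random
from collections import defaultdict

def bradley_terry(records, iters=100):
    """Fit Bradley-Terry strengths (normalized to sum to 1).

    records: list of (winner, loser) pairs, ties already removed.
    """
    wins = defaultdict(int)
    games = defaultdict(int)  # games[(i, j)] = battles between i and j
    models = set()
    for winner, loser in records:
        wins[winner] += 1
        games[(winner, loser)] += 1
        games[(loser, winner)] += 1
        models.update((winner, loser))
    strength = {m: 1.0 / len(models) for m in models}
    for _ in range(iters):
        updated = {}
        for i in models:
            denom = sum(
                games[(i, j)] / (strength[i] + strength[j])
                for j in models
                if j != i and games[(i, j)]
            )
            updated[i] = wins[i] / denom if denom else strength[i]
        total = sum(updated.values())
        strength = {m: s / total for m, s in updated.items()}
    return strength

def bootstrap_intervals(records, rounds=200, alpha=0.05, seed=0):
    """Percentile bootstrap: resample battles with replacement and refit."""
    rng = random.Random(seed)
    samples = defaultdict(list)
    for _ in range(rounds):
        resample = rng.choices(records, k=len(records))
        for model, s in bradley_terry(resample).items():
            samples[model].append(s)
    intervals = {}
    for model, values in samples.items():
        values.sort()
        lo = values[int((alpha / 2) * (len(values) - 1))]
        hi = values[int((1 - alpha / 2) * (len(values) - 1))]
        intervals[model] = (lo, hi)
    return intervals

# Hypothetical (winner, loser) records, invented for illustration only.
records = (
    [("model-x", "model-y")] * 30 + [("model-y", "model-x")] * 10
    + [("model-x", "model-z")] * 25 + [("model-z", "model-x")] * 15
    + [("model-y", "model-z")] * 20 + [("model-z", "model-y")] * 20
)
point = bradley_terry(records)
intervals = bootstrap_intervals(records)
```

Models with fewer battles get wider intervals under this scheme, which is consistent with the larger ± values shown for low-vote models in the table.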

Battle Count for Each Combination of Models (without Ties)

Average Win Rate Against All Other Models (Uniform Sampling and No Ties)
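One plausible reading of "uniform sampling" is that every opponent counts equally regardless of how many battles were fought against it. The sketch below follows that reading; the interpretation, the function `average_win_rate`, and the data are assumptions rather than the leaderboard's documented method.

```python
from collections import defaultdict

def average_win_rate(records):
    """Average each model's per-opponent win rate, weighting opponents equally.

    records: list of (winner, loser) pairs, ties already removed.
    Opponents a model never faced are skipped rather than counted as zero.
    """
    pair_wins = defaultdict(int)   # pair_wins[(a, b)] = battles a won against b
    pair_games = defaultdict(int)  # pair_games[(a, b)] = battles between a and b
    opponents = defaultdict(set)
    for winner, loser in records:
        pair_wins[(winner, loser)] += 1
        pair_games[(winner, loser)] += 1
        pair_games[(loser, winner)] += 1
        opponents[winner].add(loser)
        opponents[loser].add(winner)
    return {
        model: sum(pair_wins[(model, o)] / pair_games[(model, o)] for o in faced)
        / len(faced)
        for model, faced in opponents.items()
    }

# Hypothetical (winner, loser) records, invented for illustration only.
records = (
    [("model-x", "model-y")] * 3 + [("model-y", "model-x")] * 1
    + [("model-x", "model-z")] * 1 + [("model-z", "model-x")] * 1
    + [("model-y", "model-z")] * 2
)
rates = average_win_rate(records)
```

Here model-x wins 4 of its 6 battles overall, but its average win rate is 0.625 because the 0.75 rate against model-y and the 0.5 rate against model-z are weighted equally.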
