Compare models
Comparing 4 models. Drop the URL into a doc; it's permalinked.
| Field | google/gemini-2.5-pro | openai/o3-mini | anthropic/claude-opus-4-5 | openai/o1 |
|---|---|---|---|---|
| Provider | google | openai | anthropic | openai |
| Model ID | gemini-2.5-pro | o3-mini | claude-opus-4-5 | o1 |
| Context | 1.0M | 200K | 200K | 200K |
| Max output | 8K | 100K | 32K | 100K |
| Input / 1M | $1.25 | $1.10 | $15.00 | $15.00 |
| Output / 1M | $10.00 | $4.40 | $75.00 | $60.00 |
| Cached input / 1M | $0.31 | $0.55 | $1.50 | $7.50 |
| Avg cost / 1M | $5.63 | $2.75 | $45.00 | $37.50 |
| Speed | 130 t/s | 95 t/s | 65 t/s | 35 t/s |
| Quality index | 80.0 | 78.0 | 80.0 | 85.0 |
| MMLU | 89.2 | 86.5 | 91.4 | 92.3 |
| GPQA | 84.0 | 79.7 | 79.6 | 78.0 |
| HumanEval | 92.6 | 87.8 | 92.7 | 89.0 |
| MATH | 91.6 | 97.9 | 96.5 | 94.8 |
| SWE-bench | 63.8 | — | 72.5 | 48.9 |
| Arena Elo | 1380 | — | — | — |
| Tools | ✓ | ✓ | ✓ | ✓ |
| Vision | ✓ | — | ✓ | ✓ |
| Thinking | ✓ | ✓ | ✓ | ✓ |
| Streaming | ✓ | ✓ | ✓ | ✓ |
| JSON mode | ✓ | — | — | — |
| Structured output | ✓ | ✓ | ✓ | ✓ |
| Prompt cache | — | — | ✓ | — |
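The "Avg cost / 1M" row appears to be the simple mean of the input and output prices (e.g. $1.25 and $10.00 average to $5.63 for gemini-2.5-pro). A minimal sketch of that arithmetic, plus a per-request estimate, using the prices from the table; the function names and the 50/50 blend are illustrative, not part of any API:

```python
# Prices from the table above, in USD per 1M tokens: (input, output).
PRICES = {
    "gemini-2.5-pro": (1.25, 10.00),
    "o3-mini": (1.10, 4.40),
    "claude-opus-4-5": (15.00, 75.00),
    "o1": (15.00, 60.00),
}

def avg_cost(model: str) -> float:
    """Blended $/1M as shown in the table: mean of input and output price."""
    p_in, p_out = PRICES[model]
    return (p_in + p_out) / 2

def request_cost(model: str, tokens_in: int, tokens_out: int) -> float:
    """Estimated USD cost of one request, ignoring cached-input discounts."""
    p_in, p_out = PRICES[model]
    return tokens_in / 1e6 * p_in + tokens_out / 1e6 * p_out

# A 10K-in / 1K-out request on o3-mini:
# 10_000/1e6 * $1.10 + 1_000/1e6 * $4.40 = $0.011 + $0.0044 = $0.0154
```

Real traffic is rarely a 50/50 token split, so treat the blended figure as a rough ranking signal; the per-request formula (and the cached-input row, where supported) is what actually drives your bill.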
Same data, in your terminal: `relay models compare google/gemini-2.5-pro openai/o3-mini anthropic/claude-opus-4-5 openai/o1`