LLM Pricing Comparison
NewCompare input/output token prices across all major LLM providers. Filter by capability, context window, speed, and cost. Estimate your monthly API bill with the request volume calculator.
| Model | $/1M in ↑ | $/1M out | Cost/req | 30d cost | Context | Quality |
|---|---|---|---|---|---|---|
Googlegemini-2.0-flash fastcheap | $0.1 | $0.4 | $0.0018 | $54 | 1000K | |
Mistralmistral-small-3.1 fastcheap | $0.1 | $0.3 | $0.0016 | $48 | 128K | |
OpenAIgpt-4o-mini fastcheap | $0.15 | $0.6 | $0.0027 | $81 | 128K | |
Googlegemini-2.5-flash fastbalanced | $0.15 | $0.6 | $0.0027 | $81 | 1000K | |
Metallama-3.3-70b open-sourcefast | $0.23 | $0.4 | $0.0031 | $93 | 128K | |
DeepSeekdeepseek-v3 cheapopen-source | $0.27 | $1.1 | $0.0049 | $147 | 64K | |
Metallama-4-maverick open-sourcebalanced | $0.5 | $0.77 | $0.0065 | $196 | 128K | |
DeepSeekdeepseek-r1 reasoningcheap | $0.55 | $2.19 | $0.0099 | $296 | 64K | |
Anthropicclaude-haiku-4-5 fastcheap | $0.8 | $4 | $0.0160 | $480 | 200K | |
OpenAIo4-mini reasoningfast | $1.1 | $4.4 | $0.0198 | $594 | 200K | |
Googlegemini-2.5-pro flagshiplong-context | $1.25 | $5 | $0.0225 | $675 | 1000K | |
OpenAIgpt-4.1 flagshiplong-context | $2 | $8 | $0.0360 | $1080 | 1024K | |
Mistralmistral-large-2 flagshipbalanced | $2 | $6 | $0.0320 | $960 | 128K | |
OpenAIgpt-4o flagshipvision | $2.5 | $10 | $0.0450 | $1350 | 128K | |
Coherecommand-r-plus ragenterprise | $2.5 | $10 | $0.0450 | $1350 | 128K | |
Anthropicclaude-sonnet-4-6 balancedlong-context | $3 | $15 | $0.0600 | $1800 | 200K | |
xAIgrok-3 flagshipbalanced | $3 | $15 | $0.0600 | $1800 | 131K | |
OpenAIo3 reasoningflagship | $10 | $40 | $0.1800 | $5400 | 200K | |
Anthropicclaude-opus-4-6 flagshiplong-context | $15 | $75 | $0.3000 | $9000 | 200K |
Prices as of early 2026. Verify with provider pricing pages before production use. Cost calculations assume the token counts you entered above.
Frequently Asked Questions
Which model is the cheapest for high-volume applications?
For high-volume applications, Gemini 2.0 Flash ($0.10/1M input) and GPT-4o-mini ($0.15/1M input) are among the cheapest options with reasonable quality. DeepSeek-V3 is also very cost-effective for its quality level.
What does 'context window' mean?
The context window is the maximum number of tokens a model can process in a single request, including both input and output. Larger context windows allow you to process longer documents or maintain longer conversation histories.
How is monthly cost calculated?
Monthly cost = (input tokens × input price + output tokens × output price) × requests per day × 30 days. Prices are per 1 million tokens as listed by each provider.