Not just the cheapest—the best value. We rank 257 AI models by quality-per-dollar using live pricing data from 69 providers.
Value Score = Quality Index ÷ Price per Million Tokens
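As a quick check on the formula, here is a minimal Python sketch; the example row is Gemma 3n E4B Instruct from the table below.

```python
def value_score(quality_index: float, price_per_million_usd: float) -> float:
    """Value Score = Quality Index / price per million tokens (USD)."""
    return quality_index / price_per_million_usd

# Gemma 3n E4B Instruct via Together.ai: Quality Index 10.9 at $0.025/M
print(round(value_score(10.9, 0.025), 1))  # 436.0
```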
| Rank | Model | Maker | Value | Quality | Price/M | Speed (tok/s) | Provider |
|---|---|---|---|---|---|---|---|
| 1 | Llama 3.2 Instruct 1B | Meta | 1456.0 | 9.1 | $0.006 | 20 | DeepInfra |
| 2 | Llama 3.1 Instruct 8B | Meta | 528.9 | 11.9 | $0.022 | 91 | DeepInfra (Turbo, FP8) |
| 3 | Llama 3.2 Instruct 3B | Meta | 485.0 | 9.7 | $0.020 | 19 | DeepInfra |
| 4 | Gemma 3n E4B Instruct | Google | 436.0 | 10.9 | $0.025 | 47 | Together.ai |
| 5 | gpt-oss-120B (high) | OpenAI | 428.7 | 32.9 | $0.077 | 76 | DeepInfra |
| 6 | gpt-oss-20B (high) | OpenAI | 426.1 | 24.5 | $0.058 | 182 | DeepInfra |
| 7 | gpt-oss-20B (low) | OpenAI | 312.6 | 21.1 | $0.068 | 254 | Novita |
| 8 | MiMo-V2-Flash (Reasoning) | Xiaomi | 260.0 | 39.0 | $0.150 | 127 | Xiaomi |
| 9 | DeepSeek V3.2 (Non-reasoning) | DeepSeek | 259.6 | 31.8 | $0.122 | 118 | GMI |
| 10 | Qwen3 8B (Reasoning) | Alibaba | 251.9 | 15.3 | $0.061 | 55 | Novita (FP8) |
Top models for high-volume, cost-sensitive workloads: self-hostable models that still offer the best API value when you can't run them yourself.
- Llama 3.2 Instruct 1B (Meta via DeepInfra): $0.006/M, Value 1456.0
- Llama 3.1 Instruct 8B (Meta via DeepInfra, Turbo, FP8): $0.022/M, Value 528.9
- Llama 3.2 Instruct 3B (Meta via DeepInfra): $0.020/M, Value 485.0
- Gemma 3n E4B Instruct (Google via Together.ai): $0.025/M, Value 436.0
- gpt-oss-120B (high) (OpenAI via DeepInfra): $0.077/M, Value 428.7
Use our interactive explorer to compare pricing across all 69 providers. Filter by quality, speed, and price to find your perfect model.
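If you prefer to script the comparison, here is a minimal sketch of the same filtering logic. The rows are copied from the ranking table above, the thresholds are arbitrary examples rather than recommendations, and computed scores may differ slightly from the table because the listed prices are rounded.

```python
# (model, quality index, price $/M, speed tok/s) rows copied from the ranking table above
MODELS = [
    ("Llama 3.2 Instruct 1B", 9.1, 0.006, 20),
    ("gpt-oss-120B (high)", 32.9, 0.077, 76),
    ("gpt-oss-20B (high)", 24.5, 0.058, 182),
    ("DeepSeek V3.2 (Non-reasoning)", 31.8, 0.122, 118),
]

# Example filter: quality >= 20, price <= $0.10/M, speed >= 100 tok/s
hits = [m for m in MODELS if m[1] >= 20 and m[2] <= 0.10 and m[3] >= 100]
for name, quality, price, speed in sorted(hits, key=lambda m: m[1] / m[2], reverse=True):
    print(f"{name}: value {quality / price:.1f}, {speed} tok/s")
```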
As of January 2026, Llama 3.2 Instruct 1B offers the best value under $1/M, at roughly $0.006/M (about 0.6¢ per million tokens) with a quality index of 9.1. For the absolute lowest cost, open-source models like DeepSeek and Qwen can be self-hosted for near-zero marginal cost once the infrastructure is paid for.
Llama 3.2 Instruct 1B currently offers the best quality-per-dollar, with a value score of 1456.0 (Quality Index 9.1 at roughly $0.006/M). This means you get more "intelligence" per dollar spent than with any other model.
For most production workloads, no. Models like DeepSeek V3.2 and Gemini Flash deliver 85-95% of GPT-5's quality at 1/10th to 1/20th the cost. Use GPT-5 for: (1) complex multi-step reasoning, (2) tasks where error cost is high, (3) when you need specific OpenAI features. Use budget models for: high-volume chat, content generation, and routine tasks.
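In practice this often ends up as a simple router that sends only the hard cases to the frontier model. The sketch below is illustrative only: the model identifiers and the `looks_complex` heuristic are placeholders, not a recommended classification rule.

```python
# Illustrative routing sketch; model identifiers and the heuristic are placeholders.
FRONTIER_MODEL = "gpt-5"        # highest quality, highest cost
BUDGET_MODEL = "deepseek-v3.2"  # roughly 1/10th-1/20th the cost

def looks_complex(task: str) -> bool:
    # Placeholder heuristic: treat long or explicitly multi-step prompts as complex.
    return "step by step" in task.lower() or len(task) > 2000

def pick_model(task: str) -> str:
    return FRONTIER_MODEL if looks_complex(task) else BUDGET_MODEL

print(pick_model("Summarize this support ticket."))             # deepseek-v3.2
print(pick_model("Plan the database migration step by step."))  # gpt-5
```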
Use the API if: you have <1M tokens/day, need instant scaling, or lack GPU infrastructure. Self-host if: you have >10M tokens/day (well past the break-even point), need data privacy, or want to fine-tune. At current GPU prices, self-hosting DeepSeek V3 becomes cheaper than the API at around 5-10M tokens/day, depending on your setup.
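A back-of-the-envelope version of that break-even math, where both dollar figures are assumptions you should replace with your own numbers:

```python
# Break-even sketch. Both figures below are assumptions for illustration, not quoted prices.
def breakeven_tokens_per_day(selfhost_cost_per_day: float, api_price_per_million: float) -> float:
    """Tokens/day at which daily API spend equals the fixed self-hosting cost."""
    return selfhost_cost_per_day / api_price_per_million * 1_000_000

# e.g. $20/day of amortized GPU + ops cost vs a $2.50/M blended API price (both assumed)
print(f"{breakeven_tokens_per_day(20.0, 2.50):,.0f} tokens/day")  # 8,000,000
```

Above that volume the fixed self-hosting cost wins; below it, the API does.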
Data sources: Pricing from Artificial Analysis (live API data). Quality Index from AA Intelligence Index. Updated daily via automated pipeline. See methodology →