Not just the cheapest: the best value. We rank 325 AI models by quality-per-dollar using live pricing data from 84 providers.
Historical snapshot
This page is a dated monthly snapshot. For the live version that is better aligned to current rankings and search intent, use Best AI Models (Live) or jump to Best Open Source LLM.
Value Score = Quality Index ÷ Price per Million Tokens
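The formula can be sketched in a few lines of Python. The function name `value_score` is illustrative and not part of the site's pipeline; the inputs in the example are the published figures for Qwen3.5 0.8B (Non-reasoning).

```python
def value_score(quality_index: float, price_per_million_usd: float) -> float:
    """Return Quality Index divided by price per million tokens (USD)."""
    if price_per_million_usd <= 0:
        raise ValueError("price per million tokens must be positive")
    return quality_index / price_per_million_usd


# Qwen3.5 0.8B (Non-reasoning): Quality Index 9.9 at $0.020/M
print(round(value_score(9.9, 0.020), 1))  # 495.0
```

Recomputing from the displayed figures will not always reproduce a listed score exactly (11.8 / 0.022 is about 536.4, against a listed 524.4 for Llama 3.1 Instruct 8B), presumably because the displayed prices are rounded to three decimals.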
Value Score
524.4
Meta via DeepInfra (Turbo, FP8)
Value Score
495.0
Alibaba via DeepInfra FP8
Value Score
451.7
Alibaba via DeepInfra (FP8)
| Rank | Model | Developer | Value Score | Quality Index | Price per 1M tokens | Speed | Provider |
|---|---|---|---|---|---|---|---|
| 1 | Llama 3.1 Instruct 8B | Meta | 524.4 | 11.8 | $0.022 | 83 tok/s | DeepInfra (Turbo, FP8) |
| 2 | Qwen3.5 0.8B (Non-reasoning) | Alibaba | 495.0 | 9.9 | $0.020 | 347 tok/s | DeepInfra FP8 |
| 3 | Qwen3.5 4B (Reasoning) | Alibaba | 451.7 | 27.1 | $0.060 | 199 tok/s | DeepInfra (FP8) |
| 4 | gpt-oss-120B (high) | OpenAI | 433.9 | 33.3 | $0.077 | 38 tok/s | DeepInfra |
| 5 | gpt-oss-20B (high) | OpenAI | 426.1 | 24.5 | $0.058 | 27 tok/s | DeepInfra |
| 6 | Qwen3.5 4B (Non-reasoning) | Alibaba | 376.7 | 22.6 | $0.060 | 198 tok/s | DeepInfra FP8 |
| 7 | Qwen3.5 2B (Non-reasoning) | Alibaba | 367.5 | 14.7 | $0.040 | 346 tok/s | DeepInfra FP8 |
| 8 | Qwen3 235B A22B 2507 Instruct | Alibaba | 319.5 | 25 | $0.078 | 7 tok/s | DeepInfra |
| 9 | gpt-oss-20B (low) | OpenAI | 308.1 | 20.8 | $0.068 | 86 tok/s | Novita |
| 10 | Qwen3 235B A22B 2507 (Reasoning) | Alibaba | 295.0 | 29.5 | $0.100 | 106 tok/s | Weights & Biases |
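The table sorts by value alone, so a quality or speed floor can change which model comes out on top. Below is a minimal sketch of that kind of filtering, using the ten rows above as data. The `Row` class, the `shortlist` function, and the thresholds are illustrative assumptions, not the site's explorer logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    model: str
    value: float    # published Value Score
    quality: float  # Quality Index
    price: float    # USD per million tokens, as displayed
    speed: int      # tokens per second


# Rows copied from the ranking table.
ROWS = [
    Row("Llama 3.1 Instruct 8B", 524.4, 11.8, 0.022, 83),
    Row("Qwen3.5 0.8B (Non-reasoning)", 495.0, 9.9, 0.020, 347),
    Row("Qwen3.5 4B (Reasoning)", 451.7, 27.1, 0.060, 199),
    Row("gpt-oss-120B (high)", 433.9, 33.3, 0.077, 38),
    Row("gpt-oss-20B (high)", 426.1, 24.5, 0.058, 27),
    Row("Qwen3.5 4B (Non-reasoning)", 376.7, 22.6, 0.060, 198),
    Row("Qwen3.5 2B (Non-reasoning)", 367.5, 14.7, 0.040, 346),
    Row("Qwen3 235B A22B 2507 Instruct", 319.5, 25.0, 0.078, 7),
    Row("gpt-oss-20B (low)", 308.1, 20.8, 0.068, 86),
    Row("Qwen3 235B A22B 2507 (Reasoning)", 295.0, 29.5, 0.100, 106),
]


def shortlist(rows, min_quality=0.0, min_speed=0):
    """Keep rows that meet both floors, ordered by Value Score (best first)."""
    kept = [r for r in rows if r.quality >= min_quality and r.speed >= min_speed]
    return sorted(kept, key=lambda r: r.value, reverse=True)


# Hypothetical requirement: Quality Index of at least 20 and at least 100 tok/s.
for r in shortlist(ROWS, min_quality=20, min_speed=100):
    print(f"{r.model}: value {r.value}, quality {r.quality}, {r.speed} tok/s")
```

With these example floors, three rows remain, led by Qwen3.5 4B (Reasoning) rather than the overall rank-1 model.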
Top models for high-volume, cost-sensitive workloads.
Self-hostable models, ranked by API value for when you can't self-host.
Llama 3.1 Instruct 8B
Meta via DeepInfra (Turbo, FP8)
$0.022
Value: 524.4
Qwen3.5 0.8B (Non-reasoning)
Alibaba via DeepInfra FP8
$0.020
Value: 495.0
Qwen3.5 4B (Reasoning)
Alibaba via DeepInfra (FP8)
$0.060
Value: 451.7
gpt-oss-120B (high)
OpenAI via DeepInfra
$0.077
Value: 433.9
gpt-oss-20B (high)
OpenAI via DeepInfra
$0.058
Value: 426.1
Use our interactive explorer to compare pricing across all 84 providers. Filter by quality, speed, and price to find your perfect model.
As of January 2026, Llama 3.1 Instruct 8B offers the best value under $1/M at $0.022/M with a quality index of 11.8. For the absolute lowest cost, open-source models like DeepSeek and Qwen can be self-hosted at near-zero marginal cost once the infrastructure is paid for.
Llama 3.1 Instruct 8B currently offers the best quality-per-dollar with a value score of 524.4 (Quality Index 11.8 at $0.022/M). This means you get more "intelligence" per dollar spent than with any other ranked model.
For most production workloads, no. Models like DeepSeek V3.2 and Gemini Flash deliver 85-95% of GPT-5's quality at 1/10th to 1/20th the cost. Use GPT-5 for: (1) complex multi-step reasoning, (2) tasks where error cost is high, (3) when you need specific OpenAI features. Use budget models for: high-volume chat, content generation, and routine tasks.
Use the API if you process fewer than 1M tokens/day, need instant scaling, or lack GPU infrastructure. Self-host if you process more than 10M tokens/day (past the break-even point), need data privacy, or want to fine-tune. At current GPU prices, self-hosting DeepSeek V3 becomes cheaper than the API at around 5-10M tokens/day, depending on your setup.
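The break-even point can be estimated with simple arithmetic: self-hosting wins once the API bill for a day exceeds the fixed daily infrastructure cost plus the marginal cost of serving the same tokens yourself. The sketch below assumes a linear cost model. The function name and every dollar figure in it are placeholders chosen for illustration, not measured prices for any model or provider.

```python
import math


def break_even_millions_per_day(
    fixed_daily_usd: float,
    api_price_per_million_usd: float,
    selfhost_marginal_per_million_usd: float = 0.0,
) -> float:
    """Daily volume (in millions of tokens) above which self-hosting is cheaper.

    Assumes API cost = price * volume and
    self-host cost = fixed_daily + marginal * volume.
    """
    saving_per_million = api_price_per_million_usd - selfhost_marginal_per_million_usd
    if saving_per_million <= 0:
        # Self-hosting never catches up if each token costs as much as the API.
        return math.inf
    return fixed_daily_usd / saving_per_million


# Placeholder inputs: $6/day fixed cost, $0.80/M API price, $0.05/M marginal cost.
volume = break_even_millions_per_day(6.0, 0.80, 0.05)
print(f"Break-even at about {volume:.1f}M tokens/day")
```

Real setups add costs this model ignores (idle capacity, engineering time, redundancy), which is one reason the break-even figure is quoted as a range.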
Self-hostable models
LiveCodeBench leaders
Tool use & agents
Expert picks
Data sources: Pricing from Artificial Analysis (live API data). Quality Index from AA Intelligence Index. Updated daily via automated pipeline. See methodology.