
The definitive ranking of open-weight AI models you can self-host, fine-tune, and deploy without restrictions. This page now uses the same live dataset and open-source filtering logic as the Explore leaderboard, so the ordering matches the table view across the app.
Top three by Quality Index: Kimi K2.6 (Kimi, 53.905), MiMo-V2.5-Pro (Xiaomi, 53.829), Qwen3.6 Max Preview (Alibaba, 51.814).
| Rank | Model | Developer | Quality | Best Price | Top Speed | Max Context | Providers |
|---|---|---|---|---|---|---|---|
| 1 | Kimi K2.6 | Kimi | 53.905 | $1.44/M | 253 tok/s | 262K | 14 |
| 2 | MiMo-V2.5-Pro | Xiaomi | 53.829 | $1.20/M | 60 tok/s | 1M | 4 |
| 3 | Qwen3.6 Max Preview | Alibaba | 51.814 | $2.92/M | 36 tok/s | 256K | 1 |
| 4 | DeepSeek V4 Pro | DeepSeek | 51.509 | $1.42/M | 173 tok/s | 1M | 9 |
| 5 | GLM-5.1 (Reasoning) | Z AI | 51.408 | $1.66/M | 177 tok/s | 205K | 10 |
| 6 | Qwen3.6 Plus | Alibaba | 49.985 | $1.13/M | 53 tok/s | 1M | 1 |
| 7 | GLM-5 (Reasoning) | Z AI | 49.77 | $1.24/M | 189 tok/s | 203K | 8 |
| 8 | MiniMax-M2.7 | MiniMax | 49.615 | $0.52/M | 452 tok/s | 205K | 6 |
| 9 | MiMo-V2-Pro | Xiaomi | 49.202 | $1.50/M | 66 tok/s | 131K | 1 |
| 10 | MiMo-V2.5 | Xiaomi | 49.034 | $0.64/M | 91 tok/s | 1M | 2 |
| 11 | Kimi K2.5 (Reasoning) | Kimi | 46.813 | $0.90/M | 396 tok/s | 262K | 13 |
| 12 | DeepSeek V4 Flash | DeepSeek | 46.517 | $0.01/M | 114 tok/s | 1M | 6 |
Ollama makes it easy to run open-weight models locally. Here are the top picks by hardware tier:
RTX 3070 · M2 MacBook Air
RTX 3090/4090 · M2 Pro/Max
2× RTX 4090 · Mac Studio M2 Ultra
Tip: Use 4-bit quantization (Q4_K_M) to roughly halve VRAM requirements compared with 8-bit weights (to about a third of FP16) with minimal quality loss. For example, the weights of Llama 3.3 70B at Q4_K_M fit in ~40GB.
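The arithmetic behind that tip can be sketched as parameter count times bytes per weight. This is a rough estimate under stated assumptions, not a measurement: the helper name `estimate_weight_memory_gb` is hypothetical, the bit widths are approximate averages (Q4_K_M mixes quantization types, so about 4.8 bits per weight is an assumption), and the result covers weights only, not the KV cache or runtime overhead.

```python
def estimate_weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights only, in GB (10^9 bytes)."""
    bytes_per_weight = bits_per_weight / 8
    return params_billion * bytes_per_weight


# Assumed average bits per weight for each format (approximations).
BITS_PER_WEIGHT = {
    "FP16": 16.0,
    "Q8_0": 8.5,    # 8-bit values plus a per-block scale
    "Q4_K_M": 4.8,  # mixed 4-bit/6-bit blocks, approximate average
}

if __name__ == "__main__":
    for fmt, bits in BITS_PER_WEIGHT.items():
        size = estimate_weight_memory_gb(70, bits)
        print(f"70B at {fmt}: ~{size:.0f} GB for weights")
```

Leave headroom above the estimate for context length, since the KV cache grows with the number of tokens held in memory.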
Only models with openly available weights are included. This page uses the same grouped Explore dataset and open-source filter as the live leaderboard, then orders those model families by Artificial Analysis Quality Index. We still surface coding benchmarks here when they are available:
Comprehensive knowledge benchmark across 14 domains. Tests breadth of model capability.
Competition math: tests advanced reasoning. Best signal for math and science tasks.
Contamination-free code generation. Best signal for software development capability.
See live pricing from self-hosting providers, latency, and full benchmark scores for all open source models.
Kimi K2.6 leads the live open-source ranking with a Quality Index of 53.905. For price-sensitive deployments, DeepSeek V4 Flash is the current budget leader at $0.01/M.
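As an illustration of how those two picks fall out of the table, here is a minimal sketch that selects the quality leader and the budget leader from a few rows. The `ModelRow` type and field names are hypothetical, and the values are a snapshot copied from the table on this page rather than a live feed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRow:
    name: str
    quality: float      # Quality Index
    price_per_m: float  # best price in USD per million tokens


# Snapshot of a few rows from the table on this page (not live data).
ROWS = [
    ModelRow("Kimi K2.6", 53.905, 1.44),
    ModelRow("MiMo-V2.5-Pro", 53.829, 1.20),
    ModelRow("MiniMax-M2.7", 49.615, 0.52),
    ModelRow("DeepSeek V4 Flash", 46.517, 0.01),
]

# Highest Quality Index wins the overall ranking; lowest price wins on budget.
quality_leader = max(ROWS, key=lambda row: row.quality)
budget_leader = min(ROWS, key=lambda row: row.price_per_m)

if __name__ == "__main__":
    print(f"Quality leader: {quality_leader.name}")
    print(f"Budget leader: {budget_leader.name}")
```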
For Ollama and other self-hosted setups, start from the strongest live open-weight models here, then narrow by hardware tier. MiMo-V2.5-Pro is the best option here when context size matters most.
For most tasks, yes. The top open source models in 2026 trail proprietary leaders by only a small Quality Index margin. The main gaps remain in instruction-following polish, multimodal capability, and very long contexts.
As a rule of thumb, 8GB VRAM is enough for smaller 7B to 8B models, 24GB VRAM is a more practical floor for 30B-class models, and 40GB+ is usually required once you move into 70B territory unless you quantize aggressively. Apple Silicon is also viable for smaller and mid-sized open-weight models when unified memory is high enough.
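The rule of thumb above can be written as a small lookup. This is only a sketch: the function name `largest_model_class` is hypothetical, the thresholds are the 8GB, 24GB, and 40GB figures from the paragraph, and real requirements vary with quantization and context length.

```python
def largest_model_class(vram_gb: float) -> str:
    """Map available VRAM to the largest model class the rule of thumb supports."""
    if vram_gb >= 40:
        return "70B-class"
    if vram_gb >= 24:
        return "30B-class"
    if vram_gb >= 8:
        return "7B to 8B"
    return "below the smallest tier listed"


if __name__ == "__main__":
    for vram in (6, 8, 24, 48):
        print(f"{vram}GB VRAM: {largest_model_class(vram)}")
```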
Kimi K2.6 is the strongest coding-oriented open source model on this page. For cheap API access, DeepSeek V4 Flash is the most economical current pick. See the full coding LLM ranking for more detail.
Data sources: Rankings use the live Explore dataset and the open-source filter described on the methodology page. Explore the filtered table in our interactive leaderboard or compare models side by side.