Find the best AI model for your use case. Start with the live evergreen rankings for coding, open source, agentic workflows, and long context, then use the monthly archive pages when you want dated snapshots and historical ranking context.
Model rankings
Browse the latest ranking pages for overall models, coding, open source, Ollama, long context, and agentic workflows.
Live ranking of the best overall AI models by quality, price, speed, and context window.
Current coding leaderboard using LiveCodeBench, Terminal-Bench, and SciCode.
Top open-weight models for self-hosting, Ollama, and low-cost API use.
Best local AI models by hardware tier for self-hosting on Macs, RTX GPUs, and workstations.
Ollama-first picks for coding, chat, reasoning, and low-friction local inference.
Best long-context models for large documents, codebases, and retrieval-heavy workflows.
Rankings for tool use, multi-step execution, and autonomous agent workflows.
Top AI models for programming ranked by LiveCodeBench, Terminal-Bench, and SciCode.
Top open-weight models you can self-host, fine-tune, and deploy without restrictions. Featuring Kimi K2.5.
Best self-hosted models for local inference on consumer and workstation hardware.
Ollama-first recommendations for coding, general use, reasoning, and smaller local machines.
Cheapest AI APIs ranked by quality-per-dollar. Best value without sacrificing performance.
Top AI models for image understanding ranked by MMMU Pro and LM Arena Vision.
Top models for mathematical reasoning ranked by AIME 2025 and GPQA Diamond.
Top models for autonomous agents, tool use, and multi-step task completion.
AI models with the largest context windows for processing massive documents.
Our editorial picks combining benchmarks with real-world testing and experience.
GPQA Diamond, AIME 2025, LiveCodeBench, MMLU-Pro: real benchmarks, not synthetic tests.
Rankings refresh weekly as new models launch and benchmarks update.
We track pricing from 30+ API providers so you can find the best value.
Tokens per second and latency measured across different providers.
Data source: All rankings use the Artificial Analysis Intelligence Index, the most comprehensive independent evaluation of AI model quality, pricing, and speed.
It depends on your use case. For overall quality, Claude Opus 4.5 and GPT-5.2 lead. For cost-efficiency, DeepSeek V3.2 and Qwen3-235B offer 90%+ quality at 1/10th the price. For self-hosting, GLM-4.7 Thinking provides frontier-level performance under an MIT license.
We update rankings weekly as new models launch and benchmark data becomes available. Major updates (like new model releases from OpenAI, Anthropic, or Google) are reflected within 24-48 hours.
We use the Artificial Analysis Intelligence Index, which combines: GPQA Diamond (PhD-level reasoning), AIME 2025 (competition math), LiveCodeBench (fresh coding problems), and MMLU-Pro (general knowledge).
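To make the idea of a combined index concrete, here is a minimal sketch of how scores from several benchmarks could be folded into one number. The weights, scores, and equal-weighting default are illustrative assumptions, not the actual Artificial Analysis methodology.

```python
# Illustrative sketch only: combining benchmark scores into a single index.
# The weighting scheme here is an assumption, not the real index formula.

def composite_index(scores, weights=None):
    """Weighted average of benchmark scores (each assumed on a 0-100 scale)."""
    if weights is None:
        weights = {name: 1.0 for name in scores}  # equal weighting by default
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight

# Hypothetical scores for one model across the four benchmarks named above.
model_scores = {
    "GPQA Diamond": 70.0,   # PhD-level reasoning
    "AIME 2025": 85.0,      # competition math
    "LiveCodeBench": 75.0,  # fresh coding problems
    "MMLU-Pro": 80.0,       # general knowledge
}

print(round(composite_index(model_scores), 1))  # 77.5
```

With equal weights this is just the mean of the four scores; a real index would likely normalize each benchmark and weight them differently.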
Use our interactive tools to explore all 100+ models with custom filters for price, speed, and benchmarks.