Agentic model routing

Find the right model for agent work

Match a workload to models using Agentic Index, task-level cost, response time, benchmark signals, and context requirements. The shortlist is built from the same live Artificial Analysis data used across WhatLLM.

Agentic fit lab

Choose the model that fits the work

Ranking changes as task shape, autonomy, context, and economics change.

Task · Autonomy needed · Context load · Optimize for

Best fit

GPT-5.6 Sol (xhigh) · OpenAI
Agentic: 51.8 · Cost / task: $0.944 · Response: 53.2s · Context: 1.0M
Codebase agent · Strong autonomy · Working context

Value route

GPT-5.6 Luna (max) · OpenAI
Task cost: $0.066 · Response: 2.0 min
Very strong task economics

Fast route

Claude Opus 5 (High Effort) · Anthropic
Task cost: $1.23 · Response: 24.7s
Good balance across score and economics

Open route

Kimi K3 (max) · Kimi
Task cost: $0.723 · Response: 1.3 min
Good balance across score and economics

Frontier map

Agentic Index vs task cost: 17 high-fit models, 136 with cost/task. Highlighted point: GPT-5.6 Sol (xhigh), OpenAI (Agentic 51.8, task cost $0.944).

Shortlist

Model	Provider	License	Fit	Agentic	Task cost	Response	Context
GPT-5.6 Sol (xhigh)	OpenAI	Proprietary	92	51.8	$0.944	53.2s	1.0M
Claude Opus 5 (High Effort)	Anthropic	Proprietary	88	52.1	$1.23	24.7s	1.0M
Claude Opus 5 (Xhigh Effort)	Anthropic	Proprietary	88	54.5	$1.80	34.3s	1.0M
GPT-5.6 Sol (high)	OpenAI	Proprietary	87	48.5	$0.771	21.7s	1.0M
GPT-5.6 Sol (max)	OpenAI	Proprietary	87	54	$1.54	2.6 min	1.0M
Claude Fable 5 (Max Effort, Opus 4.8 Fallback)	Anthropic	Proprietary	85	52.8	$2.75	1.6 min	1.0M
Claude Opus 5 (max)	Anthropic	Proprietary	84	55.3	$2.03	1.5 min	1.0M
Kimi K3 (max)	Kimi	Open	83	50.1	$0.723	1.3 min	1.0M
Claude Opus 5 (Medium Effort)	Anthropic	Proprietary	80	47.1	$0.618	15.6s	1.0M
GPT-5.6 Sol (medium)	OpenAI	Proprietary	80	44.5	$0.514	13.6s	1.0M
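
To see which shortlist rows sit on the cost/capability frontier, the ten rows above can be filtered for Pareto efficiency. The sketch below is illustrative only: the `Model` type and `pareto_frontier` function are hypothetical names, not part of WhatLLM. It uses just the Agentic and Task cost columns and ignores Fit, response time, and context.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    agentic: float    # Agentic Index score (higher is better)
    task_cost: float  # USD per task (lower is better)


# Rows copied from the shortlist table above.
SHORTLIST = [
    Model("GPT-5.6 Sol (xhigh)", 51.8, 0.944),
    Model("Claude Opus 5 (High Effort)", 52.1, 1.23),
    Model("Claude Opus 5 (Xhigh Effort)", 54.5, 1.80),
    Model("GPT-5.6 Sol (high)", 48.5, 0.771),
    Model("GPT-5.6 Sol (max)", 54.0, 1.54),
    Model("Claude Fable 5 (Max Effort, Opus 4.8 Fallback)", 52.8, 2.75),
    Model("Claude Opus 5 (max)", 55.3, 2.03),
    Model("Kimi K3 (max)", 50.1, 0.723),
    Model("Claude Opus 5 (Medium Effort)", 47.1, 0.618),
    Model("GPT-5.6 Sol (medium)", 44.5, 0.514),
]


def pareto_frontier(models):
    """Return models that no cheaper-or-equal model beats on agentic score."""
    frontier = []
    best_agentic = float("-inf")
    # Walk from cheapest to most expensive; on a cost tie, try the higher
    # score first. A model stays only if it beats every cheaper model.
    for m in sorted(models, key=lambda m: (m.task_cost, -m.agentic)):
        if m.agentic > best_agentic:
            frontier.append(m)
            best_agentic = m.agentic
    return frontier


frontier = pareto_frontier(SHORTLIST)
dominated = [m.name for m in SHORTLIST if m not in frontier]
print("Dominated:", dominated)
```

On these ten rows, two models fall off the frontier. GPT-5.6 Sol (high) is beaten on both axes by Kimi K3 (max), and Claude Fable 5 is beaten on both axes by GPT-5.6 Sol (max). A dominated model can still be the right choice when response time or provider constraints matter, since this filter looks at two columns only.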

Why cost per task changes the decision

Agent loops compound price and latency

A small per-token price gap can become a large bill once an agent runs many turns, calls tools, and carries long state from turn to turn. Cost per Intelligence Index task is therefore a cleaner unit for comparing models than token price alone.
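
The compounding effect can be sketched with a rough cost model. Everything here is an assumption for illustration: the function name, the token counts, and the prices are hypothetical. It also assumes the full context is re-sent as input on every turn with no prompt caching.

```python
def agent_run_cost(
    turns: int,
    base_context: int,            # tokens in the initial prompt and state
    tool_tokens_per_turn: int,    # tool results appended each turn
    output_tokens_per_turn: int,  # tokens the model generates each turn
    input_price_per_m: float,     # USD per 1M input tokens (hypothetical)
    output_price_per_m: float,    # USD per 1M output tokens (hypothetical)
) -> float:
    """Estimate the USD cost of one agent run, re-sending context each turn."""
    total = 0.0
    context = base_context
    for _ in range(turns):
        total += context * input_price_per_m / 1_000_000
        total += output_tokens_per_turn * output_price_per_m / 1_000_000
        # The next turn carries this turn's output and tool results as state.
        context += tool_tokens_per_turn + output_tokens_per_turn
    return total


single_turn = agent_run_cost(1, 10_000, 2_000, 500, 3.0, 15.0)
twenty_turns = agent_run_cost(20, 10_000, 2_000, 500, 3.0, 15.0)
print(f"1 turn: ${single_turn:.4f}, 20 turns: ${twenty_turns:.4f}")
```

Because the context grows every turn, the 20-turn run costs more than 20 times the single-turn run under these assumptions. This is why a per-task figure is more informative than a per-token price.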

The best model depends on the failure cost

Frontier models are worth the premium when mistakes are expensive. For routine automation, a cheaper high-fit model can preserve most of the capability while cutting task cost sharply.
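
One way to make this trade-off concrete is to add the expected cost of failure to the task cost. In the sketch below, the task costs come from the table above (Claude Opus 5 (max) and GPT-5.6 Sol (medium)). The success rates are made-up placeholders, since the page does not publish per-task success rates, and the function names are hypothetical.

```python
def expected_cost(task_cost: float, success_rate: float, failure_cost: float) -> float:
    """Task cost plus the expected cost of a failed task, in USD."""
    return task_cost + (1.0 - success_rate) * failure_cost


def break_even_failure_cost(
    cost_a: float, success_a: float, cost_b: float, success_b: float
) -> float:
    """Failure cost above which the pricier, more reliable model A wins."""
    if success_a <= success_b:
        raise ValueError("model A must have the higher success rate")
    return (cost_a - cost_b) / (success_a - success_b)


# Task costs from the table; success rates are hypothetical placeholders.
frontier_cost, frontier_success = 2.03, 0.95
cheap_cost, cheap_success = 0.514, 0.85

threshold = break_even_failure_cost(
    frontier_cost, frontier_success, cheap_cost, cheap_success
)
print(f"Frontier model pays off when a failure costs more than ${threshold:.2f}")
```

Under these placeholder success rates, the cheaper model has the lower expected cost whenever a failed task costs less than about $15, and the frontier model wins above that. Real success rates would have to be measured on the workload in question.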

Model rankings

Current live rankings

Browse the latest ranking pages for overall models, coding, local hardware, open source, Ollama, long context, and agentic workflows.

Overall: Best AI Models
Live ranking of the best overall AI models by quality, price, speed, and context window.

Coding: Best LLM for Coding
Current coding leaderboard using LiveCodeBench, Terminal-Bench, and SciCode.

Open Source: Best Open Source LLM
Top open-weight models for self-hosting, Ollama, and low-cost API use.

Self-Hosted: Best Local LLM
Best local AI models by hardware tier for self-hosting on Macs, RTX GPUs, and workstations.

Local Coding: Best Local LLM for Coding
Coding-focused local models mapped to 8GB, 24GB, 64GB, and server-class hardware.

Ollama: Best Ollama Models
Ollama-first picks for coding, chat, reasoning, and low-friction local inference.

Long Context: Largest Context Window LLM
Best long-context models for large documents, codebases, and retrieval-heavy workflows.

Agents: Best Agentic Models
Rankings for tool use, multi-step execution, and autonomous agent workflows.