
Side by side
Compare any AI models on one screen using live benchmarks, pricing, output speed, and context window data. View up to 4 models at once, with a full benchmark breakdown when you need to go deeper.
Select up to 4 models to analyze quality, performance, pricing, and benchmarks.
Quick picks from the top 10 models by Quality Index:
- Highest Quality: gpt-oss-120B (low), 24.47
- Fastest Output: gpt-oss-120B (low), 49 tok/s
- Best Value: gpt-oss-120B (low), $0.10/M
Quality, Speed, Context, and Value at a glance
| Metric | gpt-oss-120B (low) |
|---|---|
| Creator | OpenAI |
| Quality Index | 24.47 |
| Price per Million Tokens | $0.10 |
| Output Speed | 49 tok/s |
| Context Window | 131K |
| Latency (TTFT) | 0.56s |
| Providers | Novita (+18 more) |
Estimated cost, based on usage of 1 million tokens per day:

| Model | Estimated Cost |
|---|---|
| gpt-oss-120B (low) | $3/month |
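The estimate above is simple arithmetic: daily token volume times the blended per-million price, times the days in a month. A minimal sketch (the function name and the 30-day month are assumptions, not part of the page's published methodology):

```python
def estimated_monthly_cost(price_per_million_usd: float,
                           tokens_per_day_millions: float,
                           days_per_month: int = 30) -> float:
    """Estimated monthly spend for a steady daily token volume."""
    return price_per_million_usd * tokens_per_day_millions * days_per_month

# 1M tokens/day at a blended $0.10 per million tokens
cost = estimated_monthly_cost(0.10, 1.0)
print(f"${cost:.2f}/month")  # roughly $3/month, matching the card above
```

Real bills differ because input and output tokens are usually priced separately; the blended figure here folds both into one rate.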
Use coding, reasoning, math, and tool-use benchmarks to see where a model is actually strong instead of relying on a single overall score. A model that leads in quality may still be wrong for your workflow if your primary constraint is latency or cost.
The same model can be cheap on one host and expensive on another, or fast on one provider and unusable on the next. If the model looks promising, move to provider comparison before you commit.
Model rankings
Browse the latest ranking pages for overall models, coding, open source, Ollama, long context, and agentic workflows.
- Overall: live ranking of the best AI models by quality, price, speed, and context window.
- Coding: current coding leaderboard using LiveCodeBench, Terminal-Bench, and SciCode.
- Open source: top open-weight models for self-hosting, Ollama, and low-cost API use.
- Local: best local AI models by hardware tier for self-hosting on Macs, RTX GPUs, and workstations.
- Ollama: Ollama-first picks for coding, chat, reasoning, and low-friction local inference.
- Long context: best long-context models for large documents, codebases, and retrieval-heavy workflows.
- Agentic: rankings for tool use, multi-step execution, and autonomous agent workflows.