Who builds DeepSeek
DeepSeek is a Chinese AI lab. It is widely reported to have been founded by Liang Wenfeng and to be closely associated with High-Flyer, a quantitative trading firm. The reason this matters is not nationalism or drama. It is incentives. A lab that can fund long training runs without selling an enterprise product can optimize for research outcomes, publish more, and move faster.
Most people talk about DeepSeek like it is only weights. In practice, the DeepSeek story is about a loop: fast iteration on training recipes, aggressive efficiency work, then rapid distribution across providers. That distribution is what shows up in your costs and latency today.
Sources for company context: Wikipedia, TechTarget.
Model evolution: a timeline you can skim
DeepSeek has released a lot of names. Some are genuine architectural jumps, others are post-training revisions, and others are packaging choices that show up as "reasoning" vs "non-reasoning" in downstream distributions. This is a selected timeline intended for orientation, not an exhaustive catalog.
| Release window | Family | What changed | Why it matters |
|---|---|---|---|
| Late 2023 | DeepSeek-Coder | Early coding-specialized releases and instruct variants. | Established the lab as serious about developer workloads. |
| Mid 2024 | DeepSeek-V2, Coder-V2 | More competitive general models and stronger coding variants. | Triggered price pressure across the China hosting ecosystem. |
| Dec 2024 | DeepSeek-V3 | Mixture of experts style scaling that moved the quality ceiling for open weights. | Created a base that later reasoning variants could build on. |
| Jan 2025 | DeepSeek-R1 | Reasoning-focused releases with heavier post-training. | Marked the shift from "good open weights" to "serious reasoning open weights". |
| 2025 | V3 revisions, R1 revisions | Iterative improvements and packaging for tool use. | Most buyers feel this as better endpoints and better defaults, not as a new paper. |
Sources for the release overview: TechTarget, Tom's Hardware, Le Monde.
Reasoning vs non-reasoning variants
In practice, "reasoning" is not a brand. It is an operating mode. Reasoning variants tend to spend more tokens thinking, which can raise cost and push up latency, but they can also be meaningfully stronger on multi-step tasks. Non-reasoning variants are often better defaults for interactive products where speed and throughput matter.
Below is a summary computed from the provider endpoints in the dataset. Classification is based on the labels that appear in model names in the AA dump, such as "Reasoning", "Thinking", and "Non-reasoning"; a minimal sketch of that grouping follows the table. It is a pragmatic grouping, not a claim about internal architecture.
| Group | Models | Endpoints | Providers | Median blended $/1M | Best blended $/1M | Median speed tok/s | Median TTFT s |
|---|---|---|---|---|---|---|---|
| Reasoning | 5 | 17 | 11 | $0.6350 | $0.3345 | 43 | 1.40 |
| Non-reasoning | 4 | 26 | 17 | $0.6725 | $0.1350 | 60 | 0.89 |
| Other | 16 | 41 | 18 | $1.450 | $0.0750 | 51 | 0.86 |
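If you want to reproduce that grouping on your own endpoint dump, the logic is just string matching on the model-name label plus per-group medians. Here is a minimal sketch; the record layout and the sample values are illustrative assumptions, not the real dataset schema.

```python
from statistics import median

# Hypothetical endpoint records; field names and values are illustrative only.
endpoints = [
    {"model": "DeepSeek V3.2 (Reasoning)", "blended_per_1m": 0.3345, "tok_per_s": 63},
    {"model": "DeepSeek V3.2 (Non-reasoning)", "blended_per_1m": 0.1350, "tok_per_s": 218},
    {"model": "DeepSeek-Coder-V2", "blended_per_1m": 0.0750, "tok_per_s": 51},
]

def group_for(model_name: str) -> str:
    """Classify by the label embedded in the model name, not by architecture."""
    name = model_name.lower()
    if "reasoning" in name or "thinking" in name:
        # "non-reasoning" also contains "reasoning", so check it first.
        return "Non-reasoning" if "non-reasoning" in name else "Reasoning"
    return "Other"

groups = {}
for ep in endpoints:
    groups.setdefault(group_for(ep["model"]), []).append(ep)

for label, eps in groups.items():
    print(label,
          "median blended $/1M:", median(ep["blended_per_1m"] for ep in eps),
          "median tok/s:", median(ep["tok_per_s"] for ep in eps))
```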
What most people get wrong about DeepSeek
Most discussions compress DeepSeek into a single point on a chart. That is a category error. You are actually making two decisions: which DeepSeek variant fits your workload, and which provider delivers the economics and latency you need. For many teams, the provider decision moves the needle more than switching between closely related variants.
This guide is built for builders. It is not a hype piece, and it is not a benchmark dump. The goal is to help you choose a default, understand when it fails, and make the provider choice with eyes open.
The headline numbers
In the current dataset, there are 25 DeepSeek models with a non-zero quality index and 84 provider endpoints across 29 hosts. DeepSeek is widely distributed, which is exactly why pricing and latency can vary so much.
The current ceiling model in this dataset
The highest quality DeepSeek model in this dataset is DeepSeek V3.2 (Reasoning) at QI 41.2, with a 128,000 token context window. Use it when quality is the priority and you can afford the endpoint tradeoffs.
The value pick
If you care about quality per dollar, the best value pick in the dataset is DeepSeek V3.2 (Non-reasoning). Its cheapest observed blended price is $0.1350 per 1M tokens on GMI. Do not read this as "always choose the cheapest". Read it as a shortlist candidate for high-volume workloads.
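To make "value" concrete, here is the back-of-envelope arithmetic using the cheapest observed blended prices in this dataset. The 500M tokens per month figure is a hypothetical workload; substitute your own volume and token mix.

```python
# Cheapest observed blended prices from the dataset ($ per 1M tokens).
ceiling = {"model": "DeepSeek V3.2 (Reasoning)", "qi": 41.2, "blended_per_1m": 0.3345}
value   = {"model": "DeepSeek V3.2 (Non-reasoning)", "qi": 31.8, "blended_per_1m": 0.1350}

monthly_tokens = 500_000_000  # hypothetical workload: 500M blended tokens per month

for pick in (ceiling, value):
    cost = monthly_tokens / 1_000_000 * pick["blended_per_1m"]
    print(f'{pick["model"]}: QI {pick["qi"]}, '
          f'~${cost:,.2f}/month, '
          f'QI per blended $/1M = {pick["qi"] / pick["blended_per_1m"]:.0f}')
```

At those prices the value pick runs roughly $67.50 per month against $167.25 for the ceiling model, which is why it earns a spot on the shortlist for high-volume workloads despite the lower quality index.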
Why DeepSeek stands out
The most important DeepSeek advantage is not a single benchmark. It is the combination of fast iteration and aggressive distribution. When a model family shows up across many providers, competition forces down blended price and increases the chance that at least one host has a strong throughput or latency profile. That is why the "where to run it" question matters so much.
The second advantage is that DeepSeek releases have consistently targeted efficiency. Efficiency is not only training cost. It is inference economics, routing, quantization friendliness, and real-world provider viability. If your workload is high-volume, this matters more than a small difference in a single benchmark.
Provider reality: same model, very different outcomes
DeepSeek endpoints can differ widely on price, throughput, and time to first token. If you are building an interactive product, time to first token often matters as much as raw throughput. If you are doing batch workloads, blended price and throughput dominate. The table below shows a sample of endpoints for the ceiling model, DeepSeek V3.2 (Reasoning); pick the model you care about and sort providers by the metric that maps to your actual constraints. A sketch of the blended-price arithmetic follows the table.
| Provider | Input $/1M | Output $/1M | Blended $/1M | Speed tok/s | TTFT s |
|---|---|---|---|---|---|
| Novita | 0.2690 | 0.4000 | 0.3345 | 32 | 1.45 |
| SiliconFlow (FP8) | 0.2700 | 0.4200 | 0.3450 | 43 | 2.20 |
| DeepSeek | 0.2800 | 0.4200 | 0.3500 | 29 | 1.38 |
| Parasail (FP8) | 0.2800 | 0.4500 | 0.3650 | 8 | 1.21 |
| Baseten | 0.3000 | 0.4500 | 0.3750 | 63 | 4.69 |
| Google Vertex | 0.5600 | 1.680 | 1.120 | 51 | 0.52 |
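The blended figures in this table are consistent with a simple 50/50 average of input and output price. Here is a minimal sketch under that assumption; swap in your own input:output ratio to see how the provider ranking shifts for your actual traffic.

```python
def blended_price(input_per_1m: float, output_per_1m: float,
                  input_share: float = 0.5) -> float:
    """Weighted blend of input and output price per 1M tokens."""
    return input_share * input_per_1m + (1 - input_share) * output_per_1m

# Rows from the table above: (provider, input $/1M, output $/1M).
rows = [
    ("Novita", 0.2690, 0.4000),
    ("Google Vertex", 0.5600, 1.6800),
]

# At a 50/50 mix these reproduce the table's blended column (0.3345 and 1.12).
for provider, inp, out in rows:
    print(provider, round(blended_price(inp, out), 4))

# A prompt-heavy workload might skew toward input tokens, e.g. 3:1 input:output.
for provider, inp, out in rows:
    print(provider, round(blended_price(inp, out, input_share=0.75), 4))
```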
A practical shortlist, with provider signals
Below is a compact table for decision-making. It lists the top DeepSeek models by quality index, then shows the cheapest observed endpoint, plus the fastest and lowest-latency endpoints where available; a sketch of that selection logic follows the table. This is the easiest way to spot models that look good on paper but lack mature provider coverage.
| Model | QI | Context | Cheapest provider | Blended $/1M | Fastest provider | Tok/s | Lowest latency provider | TTFT s |
|---|---|---|---|---|---|---|---|---|
| DeepSeek V3.2 (Reasoning) | 41.2 | 128,000 | Novita | $0.3345 | Baseten | 63 | Google Vertex | 0.52 |
| DeepSeek V3.2 Speciale | 34.1 | 128,000 | Parasail (FP8) | $0.4500 | Parasail (FP8) | 26 | Parasail (FP8) | 0.88 |
| DeepSeek V3.1 Terminus (Reasoning) | 33.4 | 128,000 | Novita (FP8) | $0.6350 | SambaNova | 172 | Novita (FP8) | 1.57 |
| DeepSeek V3.2 Exp (Reasoning) | 32.5 | 128,000 | Novita (FP8) | $0.3400 | Novita (FP8) | 37 | Novita (FP8) | 0.80 |
| DeepSeek V3.2 (Non-reasoning) | 31.8 | 128,000 | GMI | $0.1350 | Fireworks | 218 | Google Vertex | 0.51 |
| DeepSeek V3.2 Exp (Non-reasoning) | 28.1 | 128,000 | DeepInfra | $0.2650 | Novita (FP8) | 33 | Novita (FP8) | 0.78 |
| DeepSeek V3.1 Terminus (Non-reasoning) | 28.0 | 128,000 | DeepInfra (FP4) | $0.5000 | SambaNova | 271 | Fireworks | 0.52 |
| DeepSeek V3.1 (Reasoning) | 27.9 | 128,000 | GMI (FP8) | $0.6350 | Google Vertex | 293 | Google Vertex | 0.67 |
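For reference, the per-model picks in a table like this reduce to three selections over the endpoint list. A sketch under the same assumed record layout as above, using three of the DeepSeek V3.2 (Reasoning) endpoints shown earlier:

```python
# Assumed endpoint records for one model; field names are illustrative.
endpoints = [
    {"provider": "Novita", "blended_per_1m": 0.3345, "tok_per_s": 32, "ttft_s": 1.45},
    {"provider": "Baseten", "blended_per_1m": 0.3750, "tok_per_s": 63, "ttft_s": 4.69},
    {"provider": "Google Vertex", "blended_per_1m": 1.1200, "tok_per_s": 51, "ttft_s": 0.52},
]

cheapest = min(endpoints, key=lambda ep: ep["blended_per_1m"])
fastest = max(endpoints, key=lambda ep: ep["tok_per_s"])
lowest_latency = min(endpoints, key=lambda ep: ep["ttft_s"])

print("cheapest:", cheapest["provider"])              # Novita
print("fastest:", fastest["provider"])                # Baseten
print("lowest latency:", lowest_latency["provider"])  # Google Vertex
```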
When not to use DeepSeek
DeepSeek is not a universal default. If you need the most conservative behavior under uncertainty, or you are shipping into strict safety and compliance constraints, you may prefer a proprietary model with mature policy tooling and stable enterprise routing. If you are extremely latency-sensitive and cannot tolerate cold starts, pick the provider first, then the model, and validate time to first token on your own prompts.
How to use this page to make a decision
If you are starting from scratch, here is a simple workflow. Pick two models. One should be the ceiling model by quality. One should be the value pick. Then choose two providers for each: one optimized for cost, one optimized for latency. Run your own prompt suite and measure quality regressions and tail latencies. After that, the decision usually becomes obvious.
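Here is a minimal sketch of that measurement loop, assuming a provider with an OpenAI-compatible API (most DeepSeek hosts offer one). The base URL, model identifier, and prompts are placeholders, and a real harness should also score outputs against your own quality checks, not just time them.

```python
import time
from openai import OpenAI  # assumes an OpenAI-compatible endpoint

# Placeholders: point these at the provider and model you are evaluating.
client = OpenAI(base_url="https://example-provider.com/v1", api_key="YOUR_KEY")
MODEL = "deepseek-v3.2"  # hypothetical identifier; check your provider's catalog

prompts = ["Summarize this ticket: ...", "Write a SQL query that ..."]  # your own suite

ttfts, totals = [], []
for prompt in prompts:
    start = time.monotonic()
    first_token_at = None
    stream = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in stream:
        if first_token_at is None and chunk.choices and chunk.choices[0].delta.content:
            first_token_at = time.monotonic()
    totals.append(time.monotonic() - start)
    ttfts.append((first_token_at or time.monotonic()) - start)

def p95(values):
    # Nearest-rank approximation; use a larger prompt suite for meaningful tails.
    return sorted(values)[int(0.95 * (len(values) - 1))]

print(f"TTFT p95: {p95(ttfts):.2f}s, total latency p95: {p95(totals):.2f}s")
```

Run the same loop against each shortlisted provider and compare the tails, not the averages; cold starts and queueing show up in p95 long before they show up in the mean.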
Keep exploring
If you want the bigger context around why open weights are compounding, read open source vs proprietary LLMs and three forces that broke OpenAI's moat. If your workload is code-heavy, start with best coding models. If you are looking for broad open-weights coverage, use best open source models. If you are explicitly optimizing for hard reasoning tasks, start with best novel reasoning models.
Want to sanity check the provider choice in context? Jump into the compare tool and start from a real model, not a vague family label.