Weekly Briefing · April 2026

New AI Models April 2026: Anthropic Won't Ship Its Best. Open Source Will.

Anthropic confirmed Claude Mythos and locked it behind a 50-company firewall. The same day, Zhipu AI open-sourced a model that beat GPT-5.4 on coding. Google released its strongest open-weight family yet. Eight releases in seven days, and the real story isn't the models — it's who gets to use them.

By Dylan Bristot · 18 min read

Early April 2026 at a glance

8+ models shipped in 7 days
5 open-weight
744B largest MoE (GLM-5.1)
$0 to self-host GLM-5.1 (MIT license)
$25/M Mythos input tokens

The price spread across the week's most powerful models runs from free to $125 per million output tokens. The split is no longer about capability. It's about control.

Two press releases, one fracture

April 7, 2026. Two announcements, twelve hours apart.

Anthropic confirmed that Claude Mythos exists — the most capable model it has ever built — and said it will not be publicly available. Fifty organizations get gated access under a program called Project Glasswing. Their job: use Mythos defensively to scan their own infrastructure for vulnerabilities before attackers can weaponize the model's capabilities. Preview pricing: $25 per million input tokens, $125 per million output. No public API. No general availability date.

The same day, Zhipu AI released GLM-5.1 under the MIT license. A 744-billion-parameter mixture-of-experts model with 40 billion active per forward pass, 200K context window. On SWE-Bench Pro — expert-level real-world software engineering — it reportedly beat both Claude Opus 4.6 and GPT-5.4. Cost to use: whatever your electricity costs.

That contrast is the story of early April 2026. Not a benchmark race. Not a pricing war. A philosophical split that has been building for months and is now out in the open. The industry's most capable models are being built faster than anyone can agree on who should use them.

The full release list

Everything that shipped between April 1 and April 8, 2026. Not all of these are traditional text LLMs — the line between "language model" and "AI model" stopped being meaningful sometime around February.

Date | Model | Developer | Type | License | Price
Apr 1 | Gemma 4 27B | Google | Text + Image + Audio | Open (Apache 2.0) | Free (self-host)
Apr 1 | Gemma 4 26B-A4B | Google | Text + Image + Audio | Open (Apache 2.0) | Free (self-host)
Apr 1 | Gemma 4 E2B / E4B | Google | Text + Image + Audio | Open (Apache 2.0) | Free (self-host)
Apr 1 | GLM-5V-Turbo | Zhipu AI | Vision + Code | Proprietary (API) | API pricing
Apr 1 | Bonsai 8B | PrismML | Text | Open | Free (self-host)
Apr 2 | Qwen 3.6-Plus | Alibaba | Text + Agentic | Open | ~$0.28/M
Apr 2 | MAI-Transcribe-1 | Microsoft | Speech-to-Text | Proprietary | Azure pricing
Apr 2 | MAI-Voice-1 | Microsoft | Voice Generation | Proprietary | Azure pricing
Apr 2 | MAI-Image-2 | Microsoft | Image + Video Gen | Proprietary | Azure pricing
Apr 7 | GLM-5.1 | Zhipu AI | Text + Reasoning | Open (MIT) | ~$1/$3.2 per M
Apr 7 | Claude Mythos | Anthropic | Text + Reasoning + Cyber | Proprietary (gated) | ~$25/$125 per M

Data from developer announcements, LLM Stats, and PricePerToken. The two April 7 releases (GLM-5.1 and Claude Mythos) are the models that defined the week. Prices are approximate and may vary by provider.

Claude Mythos: the model Anthropic built and then locked away

The Mythos story started as a leak. On March 28, security researchers Roy Paz (LayerX Security) and Alexandre Pauwels (University of Cambridge) found roughly 3,000 unpublished assets sitting in an unprotected, publicly searchable Anthropic database. Among them: draft blog posts describing a next-generation model codenamed "Capybara" in one version and "Claude Mythos" in another. Anthropic patched the exposure the same night after Fortune reached out for comment.

Then, on April 7, Anthropic moved Mythos into controlled early access under Project Glasswing. This isn't a beta test. It's a defensive deployment. The partner list reads like a who's-who of critical infrastructure: AWS, Apple, Microsoft, Google, NVIDIA, Cisco, CrowdStrike, JPMorgan, Broadcom, Palo Alto Networks, the Linux Foundation, and roughly 40 others. Their mandate is specific: use Mythos to scan their own systems and open-source codebases for exploitable vulnerabilities before the model reaches a wider audience.

Claude Mythos — what we know

Capability: "A step change and the most capable we've ever built." Dramatically higher scores in coding, academic reasoning, and cybersecurity vs. Claude Opus 4.6. Can scan entire OS kernels and large codebases for exploitable flaws — including bugs that have gone undetected for decades.
Access: Project Glasswing only. ~50 partner organizations. No public API, no general release date.
Pricing: ~$25 / $125 per million input/output tokens (preview). Anthropic says it needs to become "more efficient" before broader rollout.
Why gated: Anthropic explicitly views the model's offensive cyber potential as too dangerous for broad release. The internal drafts warn it "presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders."

Read this in context. Anthropic spent March in a standoff with the Pentagon after refusing to let Claude be used in autonomous weapons systems. Multiple U.S. agencies began phasing out Claude models over a six-month transition. The company was labeled a "supply-chain risk" — a designation normally reserved for foreign adversaries. Now that same lab is telling the world that its own model is a cybersecurity risk and choosing to gate access on that basis.

Whether you see this as principled caution or competitive positioning depends on your priors. Either way, it sets a precedent. Mythos is the first time a major lab has publicly said: "We built something too capable to release." That framing will shape how every future frontier model gets rolled out.

GLM-5.1: the model that beat the frontier and costs nothing

While Anthropic was locking things down, Zhipu AI went the other direction.

GLM-5.1 dropped on April 7 under the MIT license — the most permissive license commonly used in open source. 744 billion total parameters, 40 billion active per forward pass via MoE routing. 200K context window. And the number that matters most: it reportedly topped both Claude Opus 4.6 and GPT-5.4 on SWE-Bench Pro, the benchmark that measures real-world software engineering with expert-level tasks.
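The 744B-total / 40B-active split is worth unpacking, because it drives both the self-hosting cost and the inference economics. All 744B weights have to be resident somewhere, but each token only exercises the routed 40B. A back-of-envelope sketch, where the bytes-per-parameter figures are generic precision assumptions rather than anything Zhipu has published:

```python
# Back-of-envelope numbers for a mixture-of-experts model like GLM-5.1:
# every weight must be stored, but only the ~40B routed parameters
# participate in each forward pass.

TOTAL_PARAMS = 744e9   # total parameters (MoE)
ACTIVE_PARAMS = 40e9   # parameters active per forward pass

def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Approximate weight storage, ignoring KV cache and activations."""
    return params * bytes_per_param / 1e9

# Storage is dominated by TOTAL params, whatever the precision.
fp16_gb = weight_memory_gb(TOTAL_PARAMS, 2.0)   # full precision
fp8_gb = weight_memory_gb(TOTAL_PARAMS, 1.0)
int4_gb = weight_memory_gb(TOTAL_PARAMS, 0.5)

# Compute per token is dominated by ACTIVE params: roughly 2 FLOPs
# per parameter per token for a forward pass.
dense_flops = 2 * TOTAL_PARAMS     # if the model were dense
moe_flops = 2 * ACTIVE_PARAMS      # with MoE routing
speedup = dense_flops / moe_flops  # compute saving per token

print(f"fp16 weights: {fp16_gb:.0f} GB")
print(f"fp8 weights:  {fp8_gb:.0f} GB")
print(f"int4 weights: {int4_gb:.0f} GB")
print(f"compute saving vs dense: {speedup:.1f}x")
```

The punchline: "free to self-host" still means a multi-GPU node to hold the weights, but each token costs roughly 18x less compute than a dense model of the same size would.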

GLM-5.1

Zhipu AI · April 7 · MIT License

744B total params (MoE)
40B active per token

Reportedly #1 on SWE-Bench Pro among publicly available models. ~$1/$3.2 per million tokens via API, or free to self-host. MIT = no restrictions on commercial use, modification, or redistribution.

Claude Mythos

Anthropic · April 7 · Gated Preview

$25/M input tokens
$125/M output tokens

Anthropic's most powerful model. Available only through Project Glasswing to ~50 critical infrastructure partners. No public benchmarks published. Capabilities described as "dramatically higher" than Opus 4.6.

Zhipu AI (Z.ai) has been climbing steadily — GLM-4.5, GLM-4.6, GLM-4.7, GLM-5, and now GLM-5.1. Each iteration has been more capable and more openly licensed. The jump to MIT is significant. Apache 2.0 (Google's choice for Gemma) requires patent grants and attribution. MIT requires almost nothing. Zhipu is saying: take it, use it, build on it, we don't care how.

For developers and companies evaluating coding assistants, the implication is direct. The strongest coding model you can run today — by one credible benchmark — is not behind an API paywall. It's on GitHub. Whether that holds as more independent evaluations come in is an open question, but the signal is hard to ignore.
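The pricing gap is easier to feel with a concrete number. Using the approximate per-million-token rates quoted above (preview pricing, and actual provider rates may differ), here is what a single large coding task costs on each model. The 200K/20K token split is an illustrative assumption, not a measured workload:

```python
# Rough cost comparison for one large coding task, at the approximate
# per-million-token prices quoted in this briefing (preview pricing).

PRICES = {  # (input $/M tokens, output $/M tokens)
    "Claude Mythos (preview)": (25.00, 125.00),
    "GLM-5.1 (API)": (1.00, 3.20),
}

def task_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the quoted per-million rates."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1e6

# Example: an agent run that reads 200K tokens of code and writes 20K.
for model in PRICES:
    cost = task_cost(model, input_tokens=200_000, output_tokens=20_000)
    print(f"{model}: ${cost:.2f}")
```

At these rates the same run costs $7.50 on Mythos and about $0.26 on GLM-5.1's API, a roughly 28x gap before you even consider self-hosting at marginal electricity cost.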

Gemma 4: Google's best open-source play

Google shipped the Gemma 4 family on April 1 under Apache 2.0. Four variants, each targeting a different deployment tier:

Variant | Architecture | Target | Modalities
Gemma 4 27B | Dense, 27B params | Single GPU / cloud | Text + Image + Audio
Gemma 4 26B-A4B | MoE, 26B total / 4B active | Cost-efficient inference | Text + Image + Audio
Gemma 4 E4B | Dense, 4B params | On-device / edge | Text + Image + Audio
Gemma 4 E2B | Dense, 2B params | Mobile / IoT | Text + Image + Audio

What sets Gemma 4 apart from previous open-weight releases is the unified multimodal design. All four variants handle text, images, and audio natively — not through separate adapters or bolted-on vision encoders, but as part of the base architecture. The 27B variant scores around 0.8 on GPQA, putting it in the range of models two to three times its size from a year ago.

The edge variants (E2B and E4B) are the ones to watch for practical impact. A 2B parameter model that runs on a phone and handles text, images, and audio is not a research project. It's a product component. If you're building mobile apps, embedded systems, or anything that needs local inference without a cloud round-trip, Gemma 4 E2B is the strongest option available under an open license.
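"Runs on a phone" is a memory claim, so it's worth checking the arithmetic. A quick weights-only estimate for the 2B variant at common quantization levels, where the precision choices are generic assumptions rather than Google's published deployment configs:

```python
# Will a 2B-parameter model fit on a phone? Weights-only estimate at
# common quantization levels (runtime overhead, KV cache, and the app
# itself come on top of these figures).

PARAMS = 2e9  # Gemma 4 E2B

BYTES_PER_PARAM = {
    "fp16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}

sizes_gb = {}
for precision, bytes_pp in BYTES_PER_PARAM.items():
    sizes_gb[precision] = PARAMS * bytes_pp / 1e9
    print(f"{precision}: ~{sizes_gb[precision]:.1f} GB of weights")
```

At int4 that's about 1 GB of weights, which leaves headroom even on a mid-range phone with 6-8 GB of RAM; at fp16 the same model needs around 4 GB, which is why edge deployments lean on quantization.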

Google calls Gemma 4 "our most capable open models to date." Given that they also have Gemini 3.1 Pro at the top of the closed leaderboard, the strategic read is straightforward: Google is competing on both sides of the open/closed divide simultaneously. That's a position no other lab currently holds.

The rest of the field

Three other releases from the first week of April deserve attention, even if they didn't dominate the headlines.

Alibaba Qwen 3.6-Plus

April 2 · Open · ~$0.28 per million tokens

The Qwen line keeps pushing into agentic territory. Qwen 3.6-Plus ships with a 1M token context window and is specifically tuned for autonomous coding workflows: frontend development, repository-level engineering, terminal agents, and GUI control. At roughly $0.28 per million tokens through Alibaba Cloud, it's priced to be disposable. The kind of model you throw at a long-running agent task and don't worry about the bill.

Zhipu GLM-5V-Turbo

April 1 · Proprietary (API)

Zhipu's first native multimodal coding model. Give it a screenshot or UI mockup, and it generates functional frontend code. Think of it as the bridge between design tools and code editors — a model that sees a design and writes the implementation. Optimized for long-horizon planning and complex coding loops. Works with agent frameworks like OpenClaw.

PrismML Bonsai 8B

April 1 · Open · GGUF format

1-bit quantized. That's 14x smaller than the full-precision version, while retaining strong performance on chat and document tasks. Available in GGUF format on Hugging Face. If you need a model that runs on a Raspberry Pi or a laptop without a dedicated GPU, Bonsai 8B is the answer. The extreme edge of what "local AI" means.
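The "14x smaller" figure is internally consistent, and the arithmetic shows why 1-bit schemes never hit a literal 16x: dividing fp16's 16 bits per parameter by 14 leaves about 1.14 effective bits, the extra fraction being the scales and metadata the quantization format has to carry. A quick check:

```python
# Sanity-checking the "14x smaller" claim for a 1-bit quantized 8B model.
# fp16 stores 16 bits per parameter; a 14x reduction implies roughly
# 1.14 effective bits per parameter, i.e. 1-bit weights plus a small
# overhead for quantization scales and metadata.

PARAMS = 8e9
FP16_BITS = 16
REDUCTION = 14

fp16_gb = PARAMS * FP16_BITS / 8 / 1e9   # full-precision weight size
quant_gb = fp16_gb / REDUCTION           # claimed quantized size
effective_bits = FP16_BITS / REDUCTION   # bits per parameter after quant

print(f"fp16:      ~{fp16_gb:.1f} GB")
print(f"1-bit:     ~{quant_gb:.2f} GB")
print(f"effective: ~{effective_bits:.2f} bits/param")
```

A model that shrinks from 16 GB to a bit over 1 GB is what makes the Raspberry Pi claim plausible in the first place.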

Microsoft MAI Foundation Models

April 2 · Proprietary · Azure / Microsoft Foundry

Three models, none of them text LLMs: MAI-Transcribe-1 (fast multilingual speech-to-text), MAI-Voice-1 (custom voice generation), and MAI-Image-2 (image and video generation). This is Microsoft building the multimodal plumbing — the layer between raw AI capability and production applications. Voice agents, transcription pipelines, media generation. Available now in Microsoft Foundry.

Zooming out: three shifts defining April

The Feb/March sprint produced the frontier: GPT-5.4, Gemini 3.1 Pro, Claude Opus 4.6, Llama 4, GLM-5. The Intelligence Index ceiling hit 57.18 and held. April's first week isn't about raising that ceiling. It's about something more consequential.

Three shifts happening right now

Shift 1

Open source isn't catching up. It caught up.

GLM-5.1 beating GPT-5.4 on SWE-Bench Pro is not an anomaly. It's the latest in a pattern that has been accelerating since late 2025. Gemma 4 is competitive with models two to three times its size. The open-weight 45-to-50 band from March — already crowded with MiniMax-M2.7, MiMo-V2-Pro, and Grok 4.20 — is now getting pushed from below by even more efficient open releases. The "open source is 6 months behind" narrative is dead. On specific tasks, it's ahead.

Shift 2

Safety gating is now a release strategy.

Mythos is the first time a major lab has publicly said "we built something too capable to release broadly." Not as a hypothetical. Not in a research paper. As an actual product decision with named partners and pricing. The Anthropic-Pentagon standoff from March — where the company refused military use and was labeled a supply-chain risk — makes this even more significant. A lab that refused to let its models be used offensively is now warning that its own model is an offensive capability. This framing will become standard for frontier releases.

Shift 3

Multimodal is the default. Text-only is the exception.

Every major release this week handles more than text. Gemma 4: text, images, and audio. GLM-5V-Turbo: vision to code. MAI models: speech, voice, image generation. Qwen 3.6-Plus: agentic workflows across terminals and GUIs. Even Mythos is described as spanning reasoning, coding, and cybersecurity — not just "a better chatbot." The pure text LLM, as a product category, is over. Everything shipping now is multimodal by default.

Where this is all headed

A useful frame: February and March were about building the frontier. April is about deciding who gets to stand on it.

The Intelligence Index ceiling (57.18) has held since Gemini 3.1 Pro Preview in February. GPT-5.4 matched it in March. Nobody has broken through. Google, OpenAI, and Anthropic all have models in the pipeline — Mythos being the most visible — but Q1 2026 ended without a clear new #1. The frontier is a plateau, at least temporarily.

What's changing fast is everything below and around the frontier.

Five things to watch in Q2 2026

1. Mythos goes wider. Anthropic will eventually open access beyond the Glasswing 50. When it does, the pricing ($25/$125 per M tokens) will need to come down significantly. The efficiency work they've described will determine whether Mythos is a product or a proof of concept.
2. Self-hosted AI becomes genuinely competitive. GLM-5.1 (MIT), Gemma 4 (Apache 2.0), and Mistral Small 4 from March (Apache 2.0, 6.5B active params) mean that any company with GPU access can now run frontier-adjacent models without a single API call. The cost calculus for build-vs-buy just shifted hard toward build.
3. Every frontier release will need a safety gating plan. Mythos made it explicit, but the pattern applies to anyone building at the frontier. OpenAI, Google, Meta — all of them will face the same question: "Is this model too capable to release broadly?" The answer will increasingly be "yes, initially."
4. Agents replace assistants. Qwen 3.6-Plus, GLM-5V-Turbo, and Google's Gemini 3.1 Flash Live (late March, real-time voice at $0.02/minute) all point the same direction: models that do things, not just answer questions. The agent era isn't coming. It shipped.
5. The real bottleneck is deployment, not intelligence. A recent HUMAN Security report found that AI bots now generate more internet traffic than humans, with automated activity growing 8x faster than human browsing. The question isn't whether AI is smart enough. It's whether infrastructure, governance, and security can keep up with how fast it's being deployed.

The ceiling will break eventually. Someone will cross 60 on the Intelligence Index, probably before summer. When they do, the others will follow within weeks. But the more defining question for Q2 isn't "which model scores highest." It's "who controls the deployment." Mythos answers one way. GLM-5.1 answers the other. Both answers are going to coexist, and the tension between them will shape the industry for years.

Practical guidance for early April 2026

If you need… | Consider | Why
Best open-source coding model | GLM-5.1 | Reportedly #1 on SWE-Bench Pro. MIT license. 744B MoE, 40B active. Free to self-host.
Multimodal on-device model | Gemma 4 E2B / E4B | 2B-4B params. Text + image + audio. Apache 2.0. Runs on phones.
Strong open multimodal model | Gemma 4 27B | GPQA ~0.8. Text + image + audio. Apache 2.0. Best-in-class for its size.
Cheapest agentic coding | Qwen 3.6-Plus | 1M context. ~$0.28/M tokens. Built for agents that write code and navigate UIs.
Vision-to-code pipeline | GLM-5V-Turbo | Screenshot or mockup in, working code out. First native multimodal coding model from Zhipu.
Absolute minimum hardware | Bonsai 8B | 1-bit quantized. 14x size reduction. GGUF. Runs on consumer hardware without GPU.
Cybersecurity vulnerability scanning | Claude Mythos | If you're one of the Glasswing 50. Otherwise, wait.

The bottom line

Early April 2026 produced the clearest expression yet of AI's central tension. On one side: Anthropic built the most powerful model it has ever created and decided the world isn't ready for it. On the other: Zhipu AI built a model that beats GPT-5.4 on real-world coding and put it on GitHub under MIT.

Google played both sides, shipping Gemma 4 as its best open-weight family while keeping Gemini 3.1 Pro at the top of the closed leaderboard. Alibaba pushed further into agent territory with Qwen 3.6-Plus. Microsoft built the multimodal plumbing. PrismML proved you can run a useful model at 1-bit precision.

The ceiling didn't move. The frontier stayed at 57. But the first week of April made one thing clear: the next phase of AI isn't about who builds the smartest model. It's about who decides how it gets used. That question doesn't have a benchmark. And it won't be settled by a leaderboard.

Data sourced from developer announcements, LLM Stats, PricePerToken, Fortune, VentureBeat, and WhatLLM.org tracking. Covers April 1–8, 2026. See our interactive model explorer for live pricing, speed, and benchmark data across 280+ models.

Frequently asked questions

What is Claude Mythos?

Claude Mythos (internally codenamed Capybara) is Anthropic's newest frontier model, described as a "step change" above Claude Opus 4.6. It excels at reasoning, coding, and cybersecurity vulnerability detection. It is currently available only through Project Glasswing — a gated early-access program limited to ~50 partner organizations. Preview pricing is $25/$125 per million input/output tokens. No public release date has been announced.

What new AI models were released in April 2026?

Key releases in the first week of April 2026: Google Gemma 4 family (four variants, Apache 2.0), Zhipu GLM-5.1 (744B MoE, MIT license) and GLM-5V-Turbo (multimodal coding), Alibaba Qwen 3.6-Plus (agentic coding, 1M context), PrismML Bonsai 8B (1-bit quantized), Microsoft MAI Foundation Models (speech, voice, image), and Anthropic Claude Mythos (gated preview).

Is GLM-5.1 better than GPT-5.4?

On SWE-Bench Pro (expert-level software engineering), GLM-5.1 reportedly outperforms both GPT-5.4 and Claude Opus 4.6. It ships under the MIT license with 744B total parameters (40B active via MoE). For coding-specific tasks, it appears to lead among publicly available models. Overall general-purpose rankings may differ.

What is Google Gemma 4?

Google Gemma 4 is a family of open-source multimodal models released under Apache 2.0 in April 2026. Four variants: 27B dense, 26B-A4B MoE, and edge-optimized E2B/E4B. All handle text, images, and audio natively. Google calls them "our most capable open models to date."

What is Project Glasswing?

Anthropic's gated early-access program for Claude Mythos. Approximately 50 organizations — AWS, Apple, Microsoft, Google, NVIDIA, Cisco, CrowdStrike, JPMorgan, the Linux Foundation, and others — receive access to use Mythos defensively: scanning their own systems for vulnerabilities before the model is released more broadly.

Cite this analysis

If you are referencing this analysis:

Bristot, D. (2026, April 8). New AI Models April 2026: Anthropic Won't Ship Its Best. Open Source Will. What LLM. https://whatllm.org/blog/new-ai-models-april-2026

Sources: Fortune, VentureBeat, LLM Stats, PricePerToken, Anthropic, Google DeepMind, Zhipu AI, Alibaba Cloud, Microsoft, PrismML announcements, April 2026