
As AI gets smarter, it seems only a matter of time before it becomes "too" smart and starts to feel scary.
When OpenAI launched ChatGPT, it marked the start of what quickly became known as the LLM war. Overnight, large language models went from research curiosities to consumer products that could write essays, debug code, and hold coherent conversations. Tech companies poured billions into scaling up models, adding features like vision, agents, and longer context windows.
Progress accelerated at a pace few had anticipated, with each new release promising bigger benchmarks and broader capabilities. Yet the race also amplified longstanding worries about safety, misuse, and unintended consequences.
Models grew powerful enough to automate complex tasks, but questions lingered about who would control them and what safeguards were truly sufficient.
Anthropic entered this fray with a different posture. The company has consistently emphasized alignment and responsible deployment. Its models, built around principles like Constitutional AI, have aimed to be helpful without being overly eager to assist in harmful activities. While still competing on raw capability, Anthropic has often chosen measured rollouts and internal red-teaming over the fastest possible commercialization.
This approach has shaped how the company handles breakthroughs that carry clear dual-use risks.
That stance is on display in 'Project Glasswing.'
Introducing Project Glasswing: an urgent initiative to help secure the world’s most critical software.
It’s powered by our newest frontier model, Claude Mythos Preview, which can find software vulnerabilities better than all but the most skilled humans. https://t.co/NQ7IfEtYk7
— Anthropic (@AnthropicAI) April 7, 2026
The initiative is a coalition of major technology and infrastructure organizations, including Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, Palo Alto Networks, and more than 40 additional entities responsible for critical software.
The idea is to have these organizations use Anthropic's most advanced AI to find and fix vulnerabilities before they can be exploited at scale.
At its core is Claude Mythos Preview, Anthropic's newest frontier model, which remains unreleased to the public.
Mythos Preview has already found thousands of high-severity vulnerabilities—including some in every major operating system and web browser.
— Anthropic (@AnthropicAI) April 7, 2026
The decision not to release Mythos Preview broadly stems directly from what the model can do.
Anthropic describes it as its most capable system yet for coding and agentic tasks. In practice, that means it can plan, reason through multi-step problems, read and modify large codebases, and operate autonomously in simulated environments. In internal testing and red-teaming, the model identified thousands of high-severity zero-day vulnerabilities across every major operating system and web browser, as well as other foundational software.
Some of these flaws had persisted for more than a decade, surviving millions of automated tests and repeated human scrutiny.
Examples, as listed by Anthropic, include a 27-year-old remote-crash vulnerability in OpenBSD and a 16-year-old issue in the widely used FFmpeg library. The model did not simply flag problems; it generated working exploit chains that demonstrated real impact.
Mythos can do this because it goes well beyond what Opus 4.6 can do.
"Last month, we wrote that 'Opus 4.6 is currently far better at identifying and fixing vulnerabilities than at exploiting them,'" said Anthropic. But training Mythos for coding apparently made it superior to Opus 4.6 across the board.
In other words, Mythos Preview operates on an entirely different level.
For example, according to Anthropic, Opus 4.6 was able to turn discovered vulnerabilities in Mozilla’s Firefox 147 JavaScript engine (all of which were patched in Firefox 148) into working JavaScript shell exploits only twice across several hundred attempts. When the same experiment was repeated using Mythos Preview, the results were dramatically different: the model successfully generated working exploits 181 times and achieved register control in an additional 29 cases.
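As a rough sanity check on the gap between the two models, the figures can be compared directly. Anthropic only says "several hundred attempts," so the 300 used below is an assumed stand-in, making the rates illustrative rather than exact:

```python
# Back-of-the-envelope comparison of the figures Anthropic reported.
# ATTEMPTS is a hypothetical placeholder for "several hundred attempts".
ATTEMPTS = 300

opus_successes = 2            # working JavaScript shell exploits from Opus 4.6
mythos_successes = 181        # working exploits from Mythos Preview
mythos_register_control = 29  # additional runs reaching register control

opus_rate = opus_successes / ATTEMPTS
mythos_rate = mythos_successes / ATTEMPTS
relative_jump = mythos_successes / opus_successes

print(f"Opus 4.6:        {opus_rate:.1%} success rate (assumed attempt count)")
print(f"Mythos Preview:  {mythos_rate:.1%} success rate (assumed attempt count)")
print(f"Relative jump:   roughly {relative_jump:.1f}x more working exploits")
```

Whatever the exact denominator, the ratio of working exploits (181 versus 2) is what makes the step-change hard to dismiss as noise.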
The comparison underscores not just incremental improvement, but a step-change in capability: one that significantly lowers the barrier between identifying a vulnerability and actively exploiting it.
Read: Leaked Anthropic Data Reveals 'Mythos' And 'Capybara': Next-Generation AI Model With Safety Concerns
We do not plan to make Mythos Preview generally available. Our goal is to deploy Mythos-class models safely at scale, but first we need safeguards that reliably block their most dangerous outputs.
We’ll begin testing those safeguards with an upcoming Claude Opus model.
— Anthropic (@AnthropicAI) April 7, 2026
These results illustrate a broader shift in cybersecurity.
For decades, discovering exploitable bugs required rare expertise, painstaking manual analysis, and significant resources. Frontier AI models have already begun lowering those barriers. Mythos Preview represents a further leap: it performs at levels that, on specialized benchmarks like SWE-bench and Terminal-Bench, substantially exceed earlier Claude versions and approach or surpass what all but the most skilled human experts can achieve.
Anthropic concluded that making such a tool generally available would hand offensive actors, from state-sponsored groups to cybercriminals, an unprecedented ability to scan, weaponize, and automate attacks against the software that underpins banking, healthcare, power grids, logistics, and national infrastructure.
The potential fallout, the company argues, is too severe to risk.
This is precisely why Mythos is so scary, and why Anthropic has chosen not to release it to the public.
So instead of open release, Project Glasswing channels the model's power toward defense.
Participating organizations receive private access through the Claude API and cloud platforms like Amazon Bedrock, Google Vertex AI, and Microsoft Foundry. They are using it on their own systems and open-source components to scan for weaknesses, simulate attacks, and generate patches.
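A minimal sketch of what this workflow might look like from a partner's side, using Anthropic's Python SDK. The model ID `claude-mythos-preview` is a hypothetical placeholder (actual access is private and gated), and the prompt wording is an assumption for illustration:

```python
def build_audit_prompt(source: str, filename: str) -> str:
    """Wrap a source file in a review prompt asking for likely weaknesses.

    Hypothetical prompt shape; real Glasswing workflows are not public.
    """
    return (
        f"Audit the following file ({filename}) for memory-safety and "
        f"injection vulnerabilities. For each finding, give the line, "
        f"the severity, and a suggested patch.\n\n```\n{source}\n```"
    )

def scan_file(source: str, filename: str) -> str:
    """Send one file to the model and return its written findings."""
    # Third-party SDK imported lazily so the prompt helper has no dependencies.
    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    response = client.messages.create(
        model="claude-mythos-preview",  # hypothetical, non-public model ID
        max_tokens=2048,
        messages=[{"role": "user", "content": build_audit_prompt(source, filename)}],
    )
    return response.content[0].text
```

In practice a partner would run something like `scan_file(open("parser.c").read(), "parser.c")` across a whole repository, then route the findings into its normal triage and patching pipeline.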
Anthropic is providing up to $100 million in usage credits to support this work and an additional $4 million in direct donations to open-source security efforts, including the OpenSSF and Apache Software Foundation.
The goal is not only to harden immediate targets but to surface patterns and best practices that can be shared industry-wide. Within 90 days, Anthropic plans to publish a public report detailing what was learned, which vulnerabilities were addressed, and any recommended changes to vulnerability disclosure, patching processes, and secure development lifecycles.
Project Glasswing is just a starting point.
No organization can solve these cybersecurity problems alone: industry, open source, researchers, and governments all have essential roles to play.
— Anthropic (@AnthropicAI) April 7, 2026
Early feedback from partners underscores both the model’s effectiveness and the urgency.
Security leads at Cisco, AWS, Microsoft, CrowdStrike, Google, Palo Alto Networks, and the Linux Foundation have described the tool as crossing a threshold that demands new defensive strategies. They note that the window between discovery and exploitation has narrowed dramatically; what once took teams weeks or months could soon be automated in hours.
By giving defenders early access, the project aims to close that gap before similar capabilities reach adversarial hands. Participation is limited to organizations with defensive mandates, and access is governed by strict terms focused on security research and remediation.
Project Glasswing is framed as a starting point rather than a complete solution.
The initiative acknowledges that securing the internet’s foundational software will require sustained collaboration among AI developers, platform owners, open-source maintainers, researchers, and governments. AI capabilities are advancing rapidly, and the project's partners expect the work to continue for months or years. Anthropic has indicated openness to expanding the coalition and has begun discussions with U.S. officials about the national-security implications.
In the longer term, the company hopes the effort will inform broader standards for responsible AI deployment in high-stakes domains.
The Claude Mythos Preview system card is available here: https://t.co/TMtIy8xHiP
— Anthropic (@AnthropicAI) April 7, 2026
In many ways, Glasswing reflects the tension at the heart of the current AI era.
Models are reaching levels of competence that can reshape entire fields, sometimes in ways that benefit society, sometimes in ways that threaten it. The choice to withhold a powerful tool from general use while directing it toward collective defense is a deliberate one, consistent with Anthropic's broader emphasis on caution.
What comes next could define how the industry handles capabilities that blur the line between breakthrough and risk. If systems like Mythos become widely accessible without robust safeguards, the balance could quickly tip toward exploitation. But if efforts like Project Glasswing succeed, they may offer a blueprint for how frontier AI can be deployed responsibly, prioritizing resilience over speed, and coordination over competition.
Ultimately, Glasswing is less about a single model and more about a shift in mindset.
As AI systems grow more powerful, the question is no longer just what they can do, but who gets to use them, and under what constraints. The answer to that question may shape not only the future of cybersecurity, but the trajectory of AI itself.