Anthropic's Mythos: AI Model Too Dangerous to Release

By Artūras Malašauskas Apr 21, 2026 3 min read Share:

Anthropic is testing 'Mythos,' its most powerful AI model, with limited access for cybersecurity testing amid concerns over its unprecedented vulnerability-finding capabilities.

AI company Anthropic is developing and testing a new model described as "the most capable we've built to date," according to a spokesperson, following a data leak that revealed the model's existence. The model, code-named Mythos, represents a "step change" in AI performance and is currently being trialed by "early access customers" through a cybersecurity-focused initiative called Project Glasswing, per Fortune's reporting.

The model's existence was inadvertently exposed when Anthropic left draft documentation in an unsecured public data cache, which included a blog post describing Mythos as "by far the most powerful AI model we've ever developed." The draft also referenced a new model tier called Capybara, described as "larger and more intelligent than our Opus models—which were, until now, our most powerful." Anthropic later acknowledged a "human error" in its content management system configuration, calling the leaked material "early drafts of content considered for publication."

Anthropic's Mythos model has demonstrated unprecedented cybersecurity capabilities during internal testing. According to a 245-page technical document released by the company, Mythos outperformed its predecessor Claude Opus 4.6 by 31 percentage points on the USAMO 2026 Mathematical Olympiad, a proof-based competition. Independent testing by the U.K.'s AI Security Institute (AISI) found the model succeeded in expert-level hacking tasks 73% of the time—up from zero success rates for prior models—though testers noted the AI faced simplified defenses not representative of real-world systems.

Unlike Anthropic's publicly available models, Mythos is not being released broadly. Instead, the company has limited access to a consortium of major technology firms including Microsoft, Google, Apple, Amazon Web Services, JPMorgan Chase, and Nvidia through Project Glasswing. The initiative aims to use the model to identify and patch security vulnerabilities before they can be exploited, with Anthropic stating it can "autonomously infiltrate computer systems around the world" to find zero-day vulnerabilities in operating systems and browsers.

The decision to withhold Mythos marks a significant shift in AI development strategy. As Scientific American notes, Anthropic's approach—refusing to release a model deemed "too dangerous for the public"—echoes OpenAI's temporary withholding of GPT-2 in 2019. The company has instead released a less capable model, Claude Opus 4.7, with "safeguards that automatically detect and block requests indicating prohibited or high-risk cybersecurity uses," while emphasizing that Mythos represents "the biggest single leap in model performance in three years" on the Epoch Capabilities Index.

Cybersecurity experts remain divided on the model's implications. While Peter Swire, a cybersecurity professor at Georgia Tech, notes that "a large fraction of cybersecurity professors believe this is pretty much what was expected," others like Ciaran Martin of Oxford's Blavatnik School acknowledge it's "a big deal" but "unlikely to prove to be the end of the world." The U.K.'s National Cyber Security Center has reportedly intensified AI risk testing following Mythos's announcement, with German banks and the Bank of England also consulting on potential threats.

Anthropic's strategy reflects a broader industry shift from "demand scarcity" to "supply scarcity," where companies are spending hundreds of thousands of dollars monthly on AI agents while hyperscalers struggle to meet demand. The company has not committed to a public release timeline for Mythos but states its goal is to "learn how it could eventually deploy Mythos-class models at scale" through Project Glasswing's defensive cybersecurity applications.

For now, Mythos remains a tightly controlled experiment, with Anthropic positioning it as a tool to "secure the world's most critical software" rather than a commercial product. The model's capabilities—particularly its ability to autonomously identify and exploit vulnerabilities in widely used software—highlight the growing tension between AI's accelerating capabilities and the industry's capacity to manage associated risks.

Artūras Malašauskas is an AI Systems Integrator with 20+ years of production-grade web engineering experience. He has designed, shipped, and scaled enterprise Python/PHP systems for logistics, SaaS, and public-sector clients. For the past year, he has focused exclusively on AI integrations: deploying open-source LLMs, building generative media pipelines (image, audio, video), and engineering multi-agent workflows for real production environments. His standard: reproducibility, security, cost-efficient inference—no vaporware. He documents and evaluates emerging AI tooling, separating verified capabilities from marketing noise. Technical editor at: muza-ai.eu, ai-verslas.lt, ai-naujinos.lt Connect on LinkedIn

Anthropic's Mythos: AI Model Too Dangerous to Release

Comments