AI models whose security reports were once mostly hallucinations are increasingly effective at spotting real flaws in widely used software, a shift that could both harden defenses and empower attackers.
Anthropic this week unveiled Mythos Preview, a model the company says “could reshape cybersecurity,” claiming it found high-severity vulnerabilities across major operating systems and web browsers and suggested ways to exploit them. Anthropic is restricting Mythos Preview to about 50 selected companies and organizations through a collaboration called Project Glasswing, saying the risk of misuse is too great to release this model publicly. The company plans to release other related models and ultimately wants users to deploy Mythos-class models safely at scale.
Security professionals say the immediate worry is for cybersecurity teams rather than everyday users. Daniel Blackford, VP of Threat Research at Proofpoint, notes ordinary people remain most at risk from routine mistakes like password theft. Still, the new capabilities could speed up both defensive and offensive work: better bug finding helps developers fix problems but can also lower the barrier for attackers to build exploits.
Open-source infrastructure maintainers are already experimenting with these models. Jim Zemlin, CEO of the Linux Foundation—which hosts the Linux kernel used in Android, supercomputers, and many servers—said kernel maintainers involved in Project Glasswing have started testing Mythos to see how to use it effectively. He argued the model could materially ease the burden on maintainers who are often overworked.
The shift toward more useful AI-driven security research became visible in early 2026, according to Daniel Stenberg, lead developer of cURL, a 30-year-old open-source data transfer tool used in cars, medical devices and countless internet services. In 2025, cURL's security reporting was overwhelmed by low-quality, likely AI-generated submissions: Stenberg received 185 reports that year yet found fewer genuine security issues than before, and he stopped paying bug bounties because so many reports were bogus and time-consuming to triage. He describes AI-generated reports as verbose and elaborate, running to hundreds of lines where a human would need far fewer words.
A 2025 HackerOne survey found nearly 60% of respondents were using, learning about, or studying how to audit AI and machine-learning systems. By 2026 the quality of submissions had changed: the volume remained high, but legitimate findings increased. Stenberg estimates about one in ten reports now flag genuine security vulnerabilities, and many others identify real non-security bugs. In the first three months of 2026, his team fixed more vulnerabilities than in either of the previous two years. He also uses AI himself: with a single click, it has surfaced more than 100 issues in code that had already passed human review and static analysis.
Linux kernel maintainers have reported similar improvements, and researchers such as Nicholas Carlini have demonstrated finding kernel vulnerabilities with earlier Anthropic models. Alex Stamos, chief security officer at Corridor, said large language models (LLMs) “have now bypassed human capability for bug finding.” The release of stronger commercial models late in 2025 and into 2026 appears to have driven that leap in quality.
Finding bugs is only the first stage in an attack chain. Stamos and other experts stress that discovering a flaw is not the same as weaponizing it. Turning a bug into a reliable exploit takes additional work and judgment; many leading foundation models from labs like Anthropic, OpenAI and DeepMind include guardrails intended to prevent generating exploit code.
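The gap between detection and exploitation is easy to see in miniature. The sketch below is a deliberately naive illustration, not a depiction of how Mythos or any real tool works: a pattern match can flag a classic unbounded-copy bug in C source, but the finding is just a location and a reason; building a working exploit from it would require understanding memory layout, mitigations, and the surrounding program.

```python
import re

# Hypothetical C snippet containing a textbook flaw: strcpy() copies an
# attacker-controlled string into a 16-byte stack buffer with no bounds check.
C_SOURCE = """
void greet(const char *name) {
    char buf[16];
    strcpy(buf, name);   /* unbounded copy into a fixed-size buffer */
}
"""

# A toy "detector": grep for unbounded copies. Real static analyzers and
# the AI models discussed above do far deeper reasoning, but the output is
# the same kind of thing: a line number and a warning, not a weapon.
findings = [
    (lineno, line.strip())
    for lineno, line in enumerate(C_SOURCE.splitlines(), start=1)
    if re.search(r"\bstrcpy\s*\(", line)
]

for lineno, line in findings:
    print(f"line {lineno}: possible buffer overflow: {line}")
```

Everything past this report, from confirming the bug is reachable to defeating stack protections, is the additional work and judgment the experts describe, and it is where model guardrails are designed to intervene.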
The greater risk, experts warn, is when open-weight models—publicly accessible models whose internal parameters are available—catch up to closed, proprietary models. Those could be copied, modified to remove safeguards, and used to do both discovery and exploit generation. The most advanced open-weight models are reportedly less than a year behind the top closed models, raising concerns that the defensive advantage could erode if dangerous capabilities become widely available.
There is debate about whether AI can reliably fix vulnerabilities once found. Stenberg believes AI is better at detection than at making the judgment calls and contextual decisions required to implement safe, correct fixes. The process of agreeing on the problem, deciding how to remediate in the context of broader software design, and validating changes involves human coordination that AI does not yet replace. Others, including the security platform HackerOne, are building more agentic AI tools aimed at both finding and patching security issues.
The appearance of those capabilities also touches politics and procurement. The U.S. Department of Defense labeled Anthropic a “supply chain risk” after the company urged governments not to use its tech for autonomous weapons and mass surveillance; that designation would restrict government and contractor use of Anthropic products and is under legal dispute. Some security experts say Anthropic’s decision to limit Mythos Preview’s release gives developers and national security teams time to strengthen defenses.
For now, the cybersecurity community faces a mix of opportunity and strain: AI is producing higher-quality reports and surfacing previously missed issues, but maintainers—often understaffed—must process more findings and make careful decisions about fixes. The coming months and years will test whether the defensive benefits of AI can outpace the risks of misuse as more powerful models proliferate.