Parsing Agentic Offensive Security's Existential Threat

Recorded: May 11, 2026, 1:16 p.m.

Original

Summarized

Some fear frontier LLMs like Claude Mythos and OpenAI's GPT-5.5 will lead to cybersecurity annihilation.
Ari Herbert-Voss notes this could be an opportunity.

Tara Seals, Managing Editor, News, Dark Reading
April 27, 2026

BLACK HAT ASIA – Singapore – The emergence of large language models (LLMs) like Anthropic's Mythos and, this week, OpenAI's GPT-5.5 has set the security world atwitter with dark speculation that we are entering an era of industrialized, autonomous, mass exploitation across any platform or infrastructure, a nuclear threat that no organization, anywhere, can hide from. But not so fast, argues RunSybil CEO Ari Herbert-Voss: While defenders need to change their risk calculus to prepare for ever-accelerating threats from AI, the limits of human effort still matter when it comes to how successful those threats become, and it's a teachable moment for the security industry.

"What we're seeing with LLMs is what we saw with fuzzers in the 2000s; fuzzing was supposed to change everything," says Herbert-Voss, who was the first security hire at OpenAI, where he led the red team engagements for the GPT-3 and Codex model releases. "A non-human could find crashes at scale, quickly, automatically. People thought it would make vuln researchers irrelevant, and trigger a flood of zero-days like the industry had never seen. Some of that happened in small ways, but fuzzing created a new problem, which is a deluge of possible bugs."

In other words, someone still had to sort through the flaws, identify the exploitable crashes, and figure out what caused the bug to be introduced in the first place. "In a way, fuzzing made vuln researchers more valuable," he tells Dark Reading.

In the same way, LLMs have the ability to automatically generate massive datasets, confirm something is wrong, and provide ways to offensively exploit that wrongness, he explained during his keynote on Friday at Black Hat Asia in Singapore.
But knowing something is wrong and knowing what to do about it are different problems. And this, he says in an interview, highlights areas where human expertise remains not just necessary but crucial, for both attackers and defenders.

"I've said it once and I will say it again: The capability ceiling is rising fast," he explains. "The capability floor is not keeping pace. Teams can generate more possible bugs than ever before. Validating which ones have real security impact still requires a human. That gap is the problem."

Long Way to Go Before Cyberattack ICBMs Launch

Autonomous performance across offensive tasks is improving by leaps and bounds; that much is true, Herbert-Voss acknowledged during his talk.

"One of the most important things that's happening right now is what we call the scaling hypothesis," he said during the keynote. "More [training] data plus more compute power plus more parameters means better performance across a variety of tasks, and this has held surprisingly well over the last seven-plus years. What has happened recently is that capabilities are scaling super-linearly, rather than linearly: When you train a model that is twice as big, for twice as long, on twice as much data, you can get a model that's four times as capable. This is the difference between the last generation of models and this latest generation."

Indeed, he points out that between 2023 and 2026, the average time from discovery of a bug to its exploitation dropped from five months to 10 hours. "'Shifting left' is more important than ever, as it will soon become the case that organizations simply won't be able to ship bugs without those bugs being found and used in short order," he says.
"We're seeing this play out in professional capture-the-flag (CTF) competitions already, where challenges that previously took teams hours are now being solved in minutes of going live by CTF players and a couple agentic coding tools."

However, LLM-based offensive improvements vary across different classes of vulnerabilities. Mythos has achieved "massive gains" when it comes to finding and exploiting low-severity "shallow bugs," he noted in the keynote; modest gains for mid-tier bugs; and relatively sparse gains for the most severe. Humans still need to do a large amount of filtering and validation to reap the benefits of accelerated bug discovery.

"A good example of progress is multistep attack execution: recent evaluations of Anthropic's Mythos by the UK AI Security Institute show models can carry out long offensive workflows autonomously in controlled environments, completing a substantial portion of attack chains," he tells Dark Reading. "This is something earlier models couldn't do. However, the boundary is still clear: These systems are not reliably consistent on real-world targets."

In other words, when it comes to meaningfully assessing the impact of a vulnerability, models offer no guarantee that their findings are really worth the time, he explained from the stage. "Individual attackers seem to get lucky when they rely on models to find exploits, but many iterations are required if you want to uncover specific impacts on specific targets and topics," he said. "Recent experiments with Mythos still boiled down to there being 198 human review findings that sit behind a much larger pool of automated data points."

In practice, though, this still represents a big challenge for organizations. "Defenders are unfortunately going to get hit by millions of monkeys with typewriters, and some of those monkeys will write very good exploits and some won't," he said.
"Even so, defenders are going to have to react every time [when bugs are found], whereas attackers will only have to get lucky every few months."

Avoiding Mutual Assured (Cyber) Destruction

Autonomous offensive systems can now chain exploits, perform reconnaissance, and adapt mid-engagement. "Engineering departments need budget, education, and access to make them AI-native," Herbert-Voss says. "Figure out what are the things it makes sense for your company/org to build yourselves, and figure out what are the things it makes sense to outsource/buy. However, there is a snake oil problem: Every company claims to be using 'AI' in some fashion with catchy buzzwords and promises. Security leaders must hold them accountable for their claims."

In all, there are four key technical advances to lean into as defenders, Herbert-Voss outlines:

Improved reasoning. This is the most important underlying piece, he says: "So much of security involves deep reasoning. How does this work? How could it break? If I do X and Y happens, what does that imply?"

Improved tool calling. "You can theorize about ways a system could break all day, but to actually find vulnerabilities, agents need to be able to use tools that let them interact with the real world," Herbert-Voss says. "Agents are now way better at understanding how and when to use what tools to prove vulnerabilities exist."

Quality "harness" engineering. "Agents have a limited context window," he explains. "They need to be given access to the right context for the right scope with the right tools. Over time we've continuously refined this to ensure we're setting agents up for success and not expecting them to do the impossible."

Building the right systems around the harness. "A single agent with a great harness can only do so much," he explains.
"Success in this industry requires multiple agents working together, and you need to build the right systems to enable effective agent-to-agent communication."

Overall, the pace of vulnerability discovery by good and bad actors is inevitably going to get faster, and the accessibility of so-called "frontier models" is going to continue to increase. Herbert-Voss believes this is actually a positive development. "There are extreme economic pressures in the AI industry to broaden access to these capabilities, and that is true for both good and bad use cases," he concluded on Friday. "There's a lot of concern over how fast things are moving, and it's something we definitely need to be paying attention to, but I think that there's also just a lot more opportunity to focus on building multilayer defenses and patching, and using this energy and this momentum to do a lot of the things that we probably should have just been doing in the first place."

About the Author

Tara Seals, Managing Editor, News, Dark Reading

Tara Seals is an award-winning journalist with 25+ years of experience as a reporter, analyst, and editor in the cybersecurity, communications, and technology spaces. As managing editor, she runs the newsroom at Dark Reading, leading a team of staff writers and freelance contributors. She also heads up strategy for a variety of in-depth, multichannel news coverage initiatives. Prior to joining Dark Reading in 2022, Tara was editor-in-chief at cybersecurity stalwart Threatpost, and prior to that, the North American news lead for Infosecurity Magazine. She also spent 13 years working for other titles at Virgo Publishing (now part of Informa TechTarget), as executive editor and editor-in-chief at publications focused on communications service providers, channel partners, and enterprise mobile and video technology.
In 2026, she was awarded a regional Azbee award for her in-depth coverage of the ongoing North Korean fake worker cyber campaign. A Texas native, she holds a B.A. from Columbia University, lives in Western Massachusetts with her family, and is on a never-ending quest for good Mexican food in the Northeast.

Ari Herbert-Voss of RunSybil argues that the emergence of advanced large language models, such as Mythos and GPT-5.5, does not necessarily herald an era of cybersecurity annihilation, but rather presents a new challenge and an opportunity. He contends that while these models can accelerate attacks and generate vast numbers of potential vulnerabilities, human expertise remains crucial for validating which findings have real security impact. Herbert-Voss draws an analogy to fuzzing in the 2000s, when automated tools surfaced a deluge of possible bugs, but human researchers were still needed to identify the exploitable crashes and trace their root causes. He highlights a widening gap: the "capability ceiling" is rising fast, but the "capability floor" is not keeping pace, so teams can generate more possible bugs than ever while still relying on humans to validate them. Specifically, he notes that Mythos has achieved massive gains at finding and exploiting low-severity "shallow bugs," modest gains for mid-tier bugs, and relatively sparse gains for the most severe. Herbert-Voss's argument rests on the scaling hypothesis, under which more training data, compute, and parameters yield better performance, with capabilities recently scaling super-linearly, and on recent advances in autonomous multistep attack execution by models like Mythos, which are still not reliably consistent on real-world targets. He emphasizes that "shifting left" is more important than ever, because shipped bugs will soon be found and exploited in short order. He outlines four key technical advances for defenders to lean into: improved reasoning, improved tool calling, quality harness engineering, and building the right systems around the harness to enable agent-to-agent communication. Ultimately, Herbert-Voss believes that while the pace of vulnerability discovery will continue to accelerate, the moment is an opportunity to focus on multilayer defenses and patching.