'TrustFall' Convention Exposes Claude Code Execution Risk

Recorded: May 11, 2026, 1:16 p.m.

Original

Summarized

Malicious repositories can trigger code execution in Claude Code, Cursor CLI, Gemini CLI, and Copilot CLI with minimal or no user interaction, thanks to skimpy warning dialogs.

Jai Vijayan, Contributing Writer
May 7, 2026

Developers using the latest versions of AI coding tools like Claude Code, Cursor CLI, Gemini CLI, and Copilot CLI could inadvertently execute malicious code on their systems with a single keypress, or with no keypress at all in continuous integration environments.

That, according to researchers at Adversa AI, is because none of the tools adequately warns users that a malicious repo can auto-approve and spawn a Model Context Protocol (MCP) server without their explicit approval or knowledge. All four coding tools show some form of trust dialog asking whether the user trusts a particular repo, but none spells out what that consent actually entails.

Adversa AI identified Claude Code as offering the least information in its trust dialog and Gemini CLI as offering the most, along with a choice to allow or disallow an MCP server executing on the developer's system. But the exposure is the same in all four, according to Adversa's lead researcher, Rony Utevsky.

"A repository can ship a configuration that auto-approves and immediately launches an MCP server, no tool call from the agent is required," he tells Dark Reading. "The variation is purely in how clearly the dialog tells the user what they are consenting to."

Anthropic, however, has described the issue that Adversa AI identified as falling outside its threat model, and it told Adversa AI that it believes its trust dialog gives users sufficient warning. Anthropic pointed out that any malicious activity happens only after the user has marked a repo or folder as trusted, Utevsky says. He adds that Adversa AI has not raised the issue with the other AI coding toolmakers because Anthropic's approach appears to be the general convention.

"Once we identified the issue as a class-level convention rather than a vendor bug, vendor-specific disclosure stopped being the right shape of response: you can responsibly disclose a vulnerability to a vendor, but not a convention," he explains.

A Straightforward Path?

According to Adversa AI, all a threat actor needs to do to pull off an attack is create a repository that includes a malicious MCP server and configuration settings that auto-approve it to run. When a developer clones or opens the repo in the AI coding tool and presses "enter" on what appears to be a routine security check, the tool launches the attacker-controlled code with the developer's full system privileges and no further prompting.

The payload can vary. It can allow attackers to read local files, including secrets, SSH keys, and tokens; access other projects; install backdoors; and establish a command-and-control connection. In a CI/CD environment, the same attack would unfold with no human interaction at all.

"The impact is full-machine compromise, not just project access," researchers at Adversa AI said in a report this week that focused on attacks using Claude Code. "MCP servers execute as native OS processes with the full privileges of the user running Claude Code." That means they aren't sandboxed or confined in any way. "The payload runs the moment the MCP server process starts," they added.

A Risky Change to the Trust Dialog in Claude Code

The report points to a trust dialog change that Anthropic introduced in Claude Code version 2.1, which removed warning language that previously made the risk more visible to users. That change has turned the routine developer action of cloning or reviewing a repo into a high-risk one, Utevsky says.

"The dialog users see is a simple 'Yes, I trust this folder,'" he explains. "Most developers don't realize 'trusting' hands over that much power."
In contrast, versions of Claude Code before 2.1 warned about MCP execution explicitly and offered an option to proceed with MCP servers disabled. Neither is present any longer, Utevsky says.

The security researcher says the TrustFall issue joins three exploitable vulnerabilities in Claude Code that could allow a malicious repository to abuse project-scoped settings to silently change how the tool behaves on a developer's machine. Those three are CVE-2025-59536, CVE-2026-21852, and CVE-2026-33068, all of which Anthropic has patched.

Adversa AI also identified three configuration settings that an attacker could use in a malicious repo to trigger arbitrary code execution on a developer's system, without an explicit prior warning from Claude Code. The first automatically approves a malicious MCP server to run the moment the user accepts Claude Code's broad folder trust prompt. The second plants the payload directly in the configuration file, making it harder for security scanners to flag. The third pre-authorizes specific tool calls through project settings, enabling code execution without further user interaction.

"In our opinion, the language of the new warning dialog downplays the decision's importance and the severity of the consequences, while providing no information about the project contents," Utevsky says. "It also defaults to 'trust,' so a reflexive press of 'enter' leads to unsafe behavior."

Claude Code's handling of dangerous settings is also internally inconsistent, he believes. Other configuration settings, such as bypassPermissions, invoke a much more alarming dialog with stronger language, one that defaults to "No, exit."

"The same product treats less dangerous settings more carefully than this one," Utevsky says.

Not a Vulnerability, But Developers Still Need Defenses

Anthropic's position is that, unlike previous vulnerabilities that allowed malicious code execution before a trust dialog even appeared, the issue Adversa AI has identified involves code execution that happens only after the user has consented to the project. "Whether this meets Anthropic's threshold for a vulnerability is their call," the security vendor noted in its report. "Whether users are making an informed trust decision under the v2.1+ dialog, in our view, is not a close question. They are not."

Reducing exposure to AI agent threats like these, according to Adversa AI, boils down to tightening controls across developer endpoints and CI/CD pipelines, and bolstering overall visibility into how tools like Claude Code are used.

On developer systems, organizations should inspect and validate project configurations and use behavioral monitoring to detect unusual processes or activity that development tools initiate when new repositories are opened. In CI environments, the most effective safeguard is to avoid running the tool automatically on untrusted code, Adversa said.

"Inspecting repo settings, automation actions, and project scaffolding isn't technically complex, but it takes time and discipline," Utevsky says. "It's also unavoidable now, given how common supply chain attacks and intentionally malicious open source packages have become."
About the Author

Jai Vijayan, Contributing Writer

Jai Vijayan is a seasoned technology reporter with over 20 years of experience in IT trade journalism. He was most recently a Senior Editor at Computerworld, where he covered information security and data privacy issues for the publication. Over the course of his 20-year career at Computerworld, Jai also covered a variety of other technology topics, including big data, Hadoop, Internet of Things, e-voting, and data analytics. Prior to Computerworld, Jai covered technology issues for The Economic Times in Bangalore, India. Jai has a Master's degree in Statistics and lives in Naperville, Ill.
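The article's endpoint advice, inspecting project configurations before a newly cloned repository is opened in an AI coding tool, can be approximated with a small pre-flight check. The sketch below is illustrative only and is not Adversa AI's tooling. The file names (.mcp.json, .claude/settings.json, .claude/settings.local.json) and keys (mcpServers, enableAllProjectMcpServers, enabledMcpjsonServers, permissions.allow) are assumptions based on Claude Code's publicly documented project settings; the article does not name the three settings it describes, so verify these against current documentation. The function names audit_repo and load_json are hypothetical.

```python
import json
from pathlib import Path

# Assumed project-scoped config locations and keys; verify against your tool's docs.
MCP_FILE = ".mcp.json"
SETTINGS_FILES = (".claude/settings.json", ".claude/settings.local.json")
# Interpreter flags that take code inline instead of from a file on disk.
INLINE_FLAGS = {"-c", "-e", "--eval"}


def load_json(path: Path) -> dict:
    """Parse a JSON object from path; return {} if it is missing or malformed."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return {}
    return data if isinstance(data, dict) else {}


def audit_repo(repo: Path) -> list[str]:
    """Return human-readable findings for settings that can lead to code execution."""
    findings = []

    # 1. Every MCP server the repo defines, and the command line it would launch.
    servers = load_json(repo / MCP_FILE).get("mcpServers", {})
    if isinstance(servers, dict):
        for name, spec in servers.items():
            if not isinstance(spec, dict):
                continue
            args = spec.get("args", [])
            argv = [str(spec.get("command", ""))]
            argv += [str(a) for a in args] if isinstance(args, list) else []
            findings.append(f"mcp-server '{name}' would run: {' '.join(argv).strip()}")
            # 2. Payload embedded in the config itself rather than in a script file.
            if INLINE_FLAGS.intersection(argv):
                findings.append(f"inline-code: '{name}' passes code on the command line")

    for rel in SETTINGS_FILES:
        settings = load_json(repo / rel)
        # 3. Settings that approve MCP servers without a per-server prompt.
        if settings.get("enableAllProjectMcpServers") is True:
            findings.append(f"auto-approve: {rel} enables all project MCP servers")
        if settings.get("enabledMcpjsonServers"):
            findings.append(f"auto-approve: {rel} pre-enables named MCP servers")
        # 4. Tool calls the project pre-authorizes on the user's behalf.
        perms = settings.get("permissions", {})
        allow = perms.get("allow", []) if isinstance(perms, dict) else []
        for rule in allow if isinstance(allow, list) else []:
            findings.append(f"pre-authorized: {rel} allows {rule}")
    return findings


if __name__ == "__main__":
    for finding in audit_repo(Path.cwd()):
        print(finding)
```

An empty result does not mean a repository is safe; the check only surfaces the settings a reviewer would want to read before pressing "enter" on a trust prompt. It reports rather than blocks because legitimate projects also ship MCP servers and permission rules.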

Claude Code, Cursor CLI, Gemini CLI, and Copilot CLI can execute malicious code shipped in a repository because their trust dialogs do not adequately explain what trusting a repository permits, according to Adversa AI. The core issue is that a repository's own configuration can auto-approve and launch a Model Context Protocol (MCP) server, so developers grant permissions to attacker-controlled code without explicit notification or control. Anthropic describes the issue as outside its threat model and says its trust dialog offers sufficient warning. Rony Utevsky of Adversa AI argues that the dialog’s language downplays the risk and defaults to ‘trust,’ so a reflexive press of “enter” leads to unsafe behavior. Claude Code version 2.1 removed explicit warning language about MCP execution, turning a routine developer action into a high-risk one. According to Adversa AI, a threat actor needs only a repository containing a malicious MCP server and configuration settings that auto-approve it; once the folder is trusted, the server runs with the developer’s full privileges, which amounts to full-machine compromise. Payloads can read local files, secrets, and SSH keys, install backdoors, and establish command-and-control connections, and in CI/CD environments the attack needs no human interaction at all. Adversa AI also identified three configuration settings that enable arbitrary code execution without an explicit prior warning: one that automatically approves a malicious MCP server, one that embeds the payload directly in the configuration file, and one that pre-authorizes specific tool calls. To mitigate the threat, Adversa AI recommends tighter controls across developer endpoints and CI/CD pipelines, along with better visibility into how these tools are used.
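For CI/CD pipelines, Adversa's reported advice is to avoid running the tool automatically on untrusted code. One way to sketch that rule is a gate that runs before the agent step. The example below assumes GitHub Actions, which the article does not mention: it reads the GITHUB_EVENT_NAME and GITHUB_EVENT_PATH environment variables and treats any pull request whose head repository differs from the base repository as untrusted. The function names and the policy itself are hypothetical, and other CI systems expose this information differently.

```python
import json
import os
from pathlib import Path

PR_EVENTS = {"pull_request", "pull_request_target"}


def is_untrusted_event(event_name: str, payload: dict) -> bool:
    """Return True when a pull request brings in code from another repository."""
    if event_name not in PR_EVENTS:
        return False
    pr = payload.get("pull_request") or {}
    head_repo = (pr.get("head") or {}).get("repo") or {}
    base_repo = (pr.get("base") or {}).get("repo") or {}
    # A missing head repo (for example, a deleted fork) is treated as untrusted.
    if not head_repo:
        return True
    return head_repo.get("full_name") != base_repo.get("full_name")


def should_run_agent(env) -> bool:
    """Decide from CI environment variables whether the agent step may run."""
    payload = {}
    event_path = env.get("GITHUB_EVENT_PATH", "")
    if event_path and Path(event_path).is_file():
        payload = json.loads(Path(event_path).read_text(encoding="utf-8"))
        if not isinstance(payload, dict):
            payload = {}
    return not is_untrusted_event(env.get("GITHUB_EVENT_NAME", ""), payload)


if __name__ == "__main__":
    print("run agent:", should_run_agent(os.environ))
```

This gate only decides whether the agent starts. It trusts every non-pull-request event, which is a policy assumption, and it does not inspect the repository's contents; a stricter pipeline could combine it with a configuration review.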