Tech Risk #165: Claude Mythos' unprecedented cybersecurity ability

Plus, security gaps in autonomous AI agents, erosion of foundational student skills, Microsoft releases agent governance toolkit, and more!

Apr 12, 2026

Tech Risk Reading Picks

Project Glasswing and Anthropic Claude Mythos: Anthropic has launched Project Glasswing to leverage its newest frontier model, Claude Mythos, for defensive cybersecurity. This initiative involves a select group of major technology and financial firms tasked with securing critical software. The Mythos model has already identified thousands of high-severity vulnerabilities in major operating systems and browsers. It demonstrates unprecedented autonomy, including the ability to chain exploits and bypass its own sandbox environments. Anthropic is restricting general access to the model because its advanced reasoning and coding skills could be easily weaponized by hostile actors. The company is committing over $100 million in resources to ensure defensive capabilities outpace offensive AI adoption. [more]
Security gaps in autonomous AI agents
1. AI agent traps: Protecting the perimeter
  Google DeepMind research indicates that autonomous AI agents are highly vulnerable to “AI Agent Traps” embedded in web content. These traps weaponize an agent’s own capabilities to force data exfiltration, information dissemination, or unauthorized product promotion. Researchers identified six specific attack vectors that manipulate an agent’s reasoning, memory, and behavioral controls. While technical hardening is necessary, recent multi-institutional studies suggest that social engineering remains the primary vulnerability. Agents often succumb to fabricated emergencies or artificial urgency rather than technical exploits alone. [more]
2. Vulnerable autonomous AI agents: A multi-institutional study reveals that AI agents possess high technical capabilities but lack the situational awareness and social reasoning necessary for safe deployment. Researchers successfully compromised agents not through code exploits, but by using social engineering, emotional manipulation, and fabricated urgency to bypass security protocols. These vulnerabilities allowed agents to leak sensitive data, delete critical configuration files, and execute denial-of-service attacks against their own infrastructure. The fundamental issue is a lack of social coherence, where agents fail to verify authority or understand the long-term consequences of their actions. This creates a dangerous imbalance between the power of the technology and the maturity of its safeguards. [more]
High-stakes exploitation of Flowise AI vulnerability: Threat actors are actively weaponizing a critical security flaw within the Flowise open-source AI platform to achieve full system compromise. The vulnerability, tracked as CVE-2025-59528, carries a maximum severity rating of 10.0 due to its ability to allow remote code execution via unvalidated JavaScript input. Attackers only require an API token to exploit the CustomMCP node, granting them full Node.js runtime privileges to execute commands, access the file system, and exfiltrate sensitive data. Despite a patch being available since version 3.0.6, over 12,000 exposed instances remain online. Current exploitation activity is linked to a single Starlink IP address, highlighting a focused effort to target corporate AI infrastructure that remains unpatched. [more]
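For teams triaging their own deployments, the version threshold above can be turned into a quick check. This is a minimal sketch that assumes plain numeric version strings such as "3.0.5"; the function names are hypothetical and are not part of Flowise.

```python
def parse_version(version: str) -> tuple[int, ...]:
    # Drop an optional leading "v" and split into numeric components.
    return tuple(int(part) for part in version.lstrip("v").split("."))


def is_patched(version: str, fixed: str = "3.0.6") -> bool:
    # Tuples compare component by component, so (3, 0, 5) < (3, 0, 6).
    return parse_version(version) >= parse_version(fixed)


if __name__ == "__main__":
    for candidate in ("2.2.7", "3.0.5", "3.0.6", "v3.1.0"):
        status = "patched" if is_patched(candidate) else "needs upgrade"
        print(candidate, status)
```

A version check is only a first pass: an instance that ran a vulnerable release while exposed may need a compromise review even after upgrading.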
Risks of silent data exfiltration in Grafana: Researchers recently identified a vulnerability called GrafanaGhost that targets the platform’s integration of AI. This flaw theoretically allows attackers to bypass security protocols using indirect prompt injection to trick the AI into ignoring safety rules. By exploiting a legacy coding trick and a weakness in the image renderer, malicious actors could redirect sensitive organizational data to external servers. While researchers claim the process is autonomous and invisible to users, Grafana Labs maintains that the exploit requires significant user interaction and has since issued a patch. This discovery highlights the evolving nature of threats where attackers manipulate how AI processes data to bypass traditional security perimeters. [more]
1. Noma’s investigation revealed a flaw in the JavaScript code. By using a legacy developer trick called a protocol-relative URL (a link that begins with // instead of a scheme such as https://), an attacker can fool the software into treating an external link as a safe internal path.
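To make the protocol-relative URL trick concrete, here is a minimal sketch of the general bug class. It is not Grafana's code: both function names and the example URLs are hypothetical, and the "strict" check is only one reasonable way to reject such links.

```python
from urllib.parse import urlparse


def naive_is_internal(url: str) -> bool:
    # Flawed check: anything starting with "/" is assumed to be a same-site path.
    return url.startswith("/")


def strict_is_internal(url: str) -> bool:
    # Reject protocol-relative links and backslashes, which some browsers
    # treat like forward slashes.
    if url.startswith("//") or "\\" in url:
        return False
    parsed = urlparse(url)
    # A genuine internal path has no scheme and no host component.
    return parsed.scheme == "" and parsed.netloc == "" and parsed.path.startswith("/")


if __name__ == "__main__":
    for link in ("/d/abc/overview", "//attacker.example/collect"):
        print(link, naive_is_internal(link), strict_is_internal(link))
```

The naive check accepts "//attacker.example/collect" because it starts with a slash, even though a browser resolves it to an external host using the current page's scheme.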
Microsoft releases agent governance toolkit: Microsoft has launched the Agent Governance Toolkit, a seven-package system designed to monitor and control agent behavior in real time. This framework-agnostic solution integrates with popular platforms like LangChain and CrewAI to enforce policy, verify identity, and manage execution rings similar to OS privilege levels. By shifting the project toward community-led foundation governance, Microsoft aims to establish a standardized security architecture for autonomous systems across the industry. [more]
Erosion of foundational student skills: A recent National Education Union poll of over 9,000 British teachers reveals a significant decline in core student abilities attributed to artificial intelligence. Educators report that overreliance on AI tools is stifling literacy, problem-solving, and critical thinking skills. While the UK government promotes AI tutoring for disadvantaged students, only 4% of teachers strongly support the initiative, citing concerns over the loss of human mentorship and academic integrity. [more]
North Korean exploit drains $280M from Drift Protocol: Drift Protocol recently suffered a $280 million theft targeting its lending, borrowing, and trading vaults. Rather than exploiting smart contract vulnerabilities, the attackers used sophisticated social engineering to compromise the platform’s security council administrative powers. They orchestrated a multi-week operation that involved staging pre-signed transactions to override withdrawal limits and execute a rapid takeover of system controls. Blockchain security experts have attributed the breach to North Korean state-sponsored hackers, noting that the laundering techniques and network indicators mirror previous high-profile attacks on the crypto industry. [more]
Axios library compromise - widespread supply chain threat: Unit 42 researchers identified a significant supply chain attack targeting the popular Axios JavaScript library after a maintainer’s account was hijacked to release malicious updates. These compromised versions (v1.14.1 and v0.30.4) do not modify the original source code but instead inject a hidden dependency that serves as a cross-platform remote access Trojan (RAT). The malware is capable of performing stealthy reconnaissance and establishing persistent access across Windows, macOS, and Linux systems before attempting to self-destruct to evade forensic analysis. Because Axios is a fundamental tool used globally for making API requests, this breach poses a systemic risk to thousands of organizations and their downstream digital infrastructure. [more]
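A quick way to check a project for the two versions named above is to scan its npm lockfile. This is a minimal sketch that assumes a package-lock.json in the lockfileVersion 2 or 3 layout, where installed packages are listed under a top-level "packages" key. The function names are hypothetical, and the sketch does not try to detect the injected dependency itself, which is not named here.

```python
import json
from pathlib import Path

# Versions reported as compromised in the item above.
COMPROMISED_AXIOS = {"1.14.1", "0.30.4"}


def find_compromised_axios(lock: dict) -> list[str]:
    hits = []
    for path, meta in lock.get("packages", {}).items():
        # Keys look like "node_modules/axios" or
        # "node_modules/some-lib/node_modules/axios" for nested copies.
        if path.endswith("node_modules/axios") and meta.get("version") in COMPROMISED_AXIOS:
            hits.append(f"{path}@{meta['version']}")
    return sorted(hits)


def scan_lockfile(lockfile: str) -> list[str]:
    # Read and parse the lockfile from disk, then reuse the pure function above.
    return find_compromised_axios(json.loads(Path(lockfile).read_text(encoding="utf-8")))
```

Calling scan_lockfile("package-lock.json") from a project root returns an empty list when neither version is installed. Nested copies matter because a transitive dependency can pull in Axios even when the project does not list it directly.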
OAuth device code phishing and the rise of commoditized identity attacks:
A sophisticated phishing technique leveraging Microsoft’s OAuth 2.0 device code protocol has transitioned from a specialized Russian state-sponsored tactic to a widely accessible Phishing-as-a-Service (PhaaS) model. The “EvilTokens” platform launched in early 2026 and has already compromised over 340 organizations. This attack weaponizes a legitimate authentication flow designed for devices like smart TVs. Victims interact entirely with genuine Microsoft infrastructure. This makes the attack invisible to traditional URL filters and security awareness training. Multifactor authentication offers no protection because users complete the challenge on the attacker’s behalf. Attackers harvest refresh tokens that persist even after password resets. They use these to steal data via the Microsoft Graph API and register unauthorized devices for long-term access. Organizations should prioritize disabling this protocol through Conditional Access policies.
1. Key technology risk pointers
  - Architectural MFA Bypass: Users provide legitimate authentication for the attacker. Existing security investments fail because the protocol itself is exploitable.
  - Persistent Token Access: Stolen refresh tokens survive password changes. Remediation is complex and requires manual session revocation and device audits.
  - Rapid Commoditization: Phishing-as-a-Service makes advanced state-level tactics available to common criminals. The threat is now volumetric and hits all industry sectors.
  - Detection Complexity: Legitimate domains mask the attack. Monitoring must shift to specific behavioral logs within Entra ID to identify unauthorized flows.
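The detection pointer above can be sketched as a simple log filter. This is a minimal sketch under stated assumptions: sign-in records have been exported as JSON-like dictionaries, the device code flow is marked by a field named authenticationProtocol with the value "deviceCode", and the user is in userPrincipalName. Verify those field names against your own Entra ID export before relying on them. The function name, allowlist, and sample records are hypothetical.

```python
from typing import Iterable


def device_code_sign_ins(records: Iterable[dict], allowed_users: frozenset = frozenset()) -> list[dict]:
    flagged = []
    for record in records:
        # Assumed field name and value; confirm against your own log schema.
        if record.get("authenticationProtocol") != "deviceCode":
            continue
        # Skip accounts with a known, approved need for the flow (e.g. kiosk devices).
        if record.get("userPrincipalName") in allowed_users:
            continue
        flagged.append(record)
    return flagged


if __name__ == "__main__":
    sample = [
        {"userPrincipalName": "alice@example.com", "authenticationProtocol": "deviceCode"},
        {"userPrincipalName": "bob@example.com", "authenticationProtocol": "oAuth2"},
        {"userPrincipalName": "tv-room@example.com", "authenticationProtocol": "deviceCode"},
    ]
    for hit in device_code_sign_ins(sample, frozenset({"tv-room@example.com"})):
        print(hit["userPrincipalName"])
```

Starting from an allowlist keeps the output small: in most organizations few accounts legitimately need the device code flow, so any other hit is worth a look.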
Solving the identity paradox: Modern enterprise security is undermined by a fundamental contradiction: more identity telemetry fails to prevent breaches because attackers now operate behind legitimate, trusted credentials. The rapid expansion of the identity surface to include non-human entities, cloud APIs, and AI agents has outpaced traditional perimeter defenses. Attackers, including state-sponsored insiders and supply chain infiltrators, successfully bypass authentication checkpoints by assuming valid personas. Consequently, static access controls are no longer sufficient. Organizations should consider shifting from entry-point authentication to continuous post-login behavioral monitoring that distinguishes legitimate employee activity from malicious intent. [more]
1. Key technology risk pointers
  - Non-human identity (NHI) sprawl: Automated service accounts and AI agents often outnumber human users and lack the same governance rigors. These accounts frequently possess broad, persistent privileges, making them high-value targets for machine-speed lateral movement.
  - The authorization gap: Traditional security models prioritize the point of entry but offer little visibility into actions taken after a user is “cleared.” This blind spot allows authenticated attackers to exfiltrate data or modify code while appearing as authorized personnel.
  - Identity subversion via “trusted” insiders: Sophisticated actors are successfully infiltrating organizations through fraudulent hiring and supply chain compromises. Since these identities are technically “valid” in HR and IT systems, they bypass standard security alerts that look for unauthorized access rather than unauthorized intent.

Tech Risk Guru
