TechRisk #133: Man-in-the-Prompt Attack
Plus, searchable ChatGPT conversations, an AI smart contract attacking agent, a malicious npm package with a crypto drainer, quantum-resistant private 5G networks, and more!
Tech Risk Reading Picks
Man in the Prompt Attack: A new cyberattack technique, dubbed “Man in the Prompt” by LayerX researchers, exploits common browser extensions to inject malicious instructions into generative AI tools like ChatGPT and Google Gemini by manipulating their prompt fields in the browser’s Document Object Model (DOM). Because many extensions can read or alter DOM content without special permissions, attackers can intercept, modify, or exfiltrate sensitive data from AI interactions. This would turn LLMs into “hacking copilots” capable of leaking corporate secrets. Proof-of-concept demos showed minimal-permission extensions could stealthily inject prompts, hide evidence, and access confidential data, including integrated Google Workspace content. The attack bypasses traditional security tools, which lack DOM-level visibility, making organisations reliant on AI tools particularly vulnerable. LayerX urges shifting security focus to in-browser monitoring, behaviour-based extension blocking, and real-time prompt protection, while experts warn that securing AI workflows now requires protecting not just the models, but the entire browser-based data path where prompts, sensitive information, and third-party integrations intersect. [more]
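To make the DOM-level blind spot concrete, below is a minimal, illustrative sketch of the kind of in-browser prompt monitoring LayerX recommends: a script that flags changes to the prompt field that were not accompanied by user keystrokes. The selector and timing threshold are assumptions for demonstration only, not part of any vendor's product.

```typescript
// prompt-monitor.ts - illustrative sketch of in-browser prompt monitoring.
// The selector is a hypothetical placeholder; each AI tool exposes its own prompt element.
const PROMPT_SELECTOR = "#prompt-textarea";

let lastSeenPrompt = "";
let lastKeyAt = 0;
document.addEventListener("keydown", () => { lastKeyAt = Date.now(); }, true);

const observer = new MutationObserver(() => {
  const box = document.querySelector<HTMLElement>(PROMPT_SELECTOR);
  if (!box) return;
  const current = box.textContent ?? "";
  // A prompt change with no recent keystroke suggests a script (for example a
  // browser extension), not the user, rewrote the field: the same access path
  // the Man-in-the-Prompt technique abuses.
  if (current !== lastSeenPrompt && Date.now() - lastKeyAt > 1500) {
    console.warn("Prompt field changed without user input:", current);
  }
  lastSeenPrompt = current;
});

observer.observe(document.body, { childList: true, subtree: true, characterData: true });
```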
Searchable ChatGPT conversations (quickly removed): OpenAI abruptly ended a short-lived ChatGPT feature that let users make conversations searchable via Google, after it was discovered that thousands of private chats, including some containing sensitive personal or professional details, were appearing in search results. Although the feature was opt-in and required multiple steps, critics argued the safeguards were inadequate, highlighting a recurring privacy problem in the AI industry, similar to past incidents with Google Bard and Meta AI. [more]
Tricking AI into misclassifying malware: Researchers at Pangea Labs have uncovered a novel cyberattack, dubbed LegalPwn, that exploits a flaw in major generative AI models by hiding malicious code inside fake legal disclaimers and similar legal-sounding text, tricking the AI into misclassifying dangerous malware as safe. Tested against twelve popular models, including Google’s Gemini, GitHub Copilot, OpenAI’s ChatGPT 4.1 and 4o, Meta’s Llama 3.3, and xAI’s Grok, most were found vulnerable. In real-world tools such as Gemini CLI, the attack even led to the tool recommending execution of a reverse shell disguised in a copyright notice. Only a few, such as Anthropic’s Claude 3.5 Sonnet and Microsoft’s Phi 4, showed strong resistance. While human analysts easily identified the threats, AI systems often failed even with explicit safety instructions. Pangea warns against fully automated AI security workflows, recommending human-in-the-loop oversight, dedicated guardrails for detecting injection attempts, and stricter review processes in live environments. [more]
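As a rough illustration of the "dedicated guardrails" Pangea recommends, a pre-screening step can look for instruction-like phrases hidden inside legal-sounding blocks before any content reaches an automated analysis pipeline. The patterns below are simple assumptions for demonstration; a production guardrail would be considerably more robust.

```typescript
// legal-injection-check.ts - heuristic sketch, not a production guardrail.
// Flags instruction-like phrases hidden inside legal-sounding text blocks.
const LEGAL_MARKERS = /\b(disclaimer|copyright|license|terms of use|confidentiality notice)\b/i;
const INSTRUCTION_PATTERNS = [
  /ignore (all|any|previous) (instructions|analysis)/i,
  /do not (flag|report|mention)/i,
  /classify (this|the) (code|file) as (safe|benign)/i,
  /(execute|run) the following/i,
];

export function flagLegalInjection(input: string): string[] {
  const findings: string[] = [];
  // Split into rough paragraph blocks and only inspect the ones that dress
  // themselves up as legal text.
  for (const block of input.split(/\n\s*\n/)) {
    if (!LEGAL_MARKERS.test(block)) continue;
    for (const pattern of INSTRUCTION_PATTERNS) {
      if (pattern.test(block)) findings.push(`Suspicious phrase ${pattern} in a legal-looking block`);
    }
  }
  return findings;
}
```

Anything flagged would be escalated to a human reviewer rather than executed or auto-classified.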
MCPoisoning attack: Cybersecurity researchers have uncovered CVE-2025-54136, a high-severity flaw (CVSS 7.2) in the AI-powered code editor Cursor, dubbed “MCPoison,” that enables remote and persistent code execution by exploiting how the tool handles trusted Model Context Protocol (MCP) configurations. An attacker can introduce a benign MCP file into a shared repository, wait for a victim to approve it, then replace it with malicious commands that run without further prompts, exposing victims to supply chain compromise and data theft. The flaw, rooted in Cursor’s indefinite trust of approved MCP files, was patched in version 1.3 (July 2025) by requiring re-approval on configuration changes. This disclosure comes amid a broader wave of AI tool vulnerabilities (including prompt injections, model poisoning, jailbreaks, and supply chain exploits) highlighting growing risks as large language models become integral to software development, where studies show nearly half of AI-generated code contains serious security flaws. [more]
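Conceptually, the fix binds trust to the content of the MCP configuration rather than to its path, so any later edit invalidates the earlier approval. A minimal sketch of that idea follows (file names and the approval store are hypothetical, not Cursor's actual implementation).

```typescript
// mcp-approval.ts - sketch of change-gated trust for an MCP config file.
import { createHash } from "node:crypto";
import { existsSync, readFileSync, writeFileSync } from "node:fs";

const MCP_CONFIG = ".cursor/mcp.json";        // hypothetical project-level MCP config
const APPROVAL_FILE = ".cursor/mcp.approved"; // hypothetical store for the user-approved hash

const sha256 = (data: string) => createHash("sha256").update(data).digest("hex");

export function isConfigApproved(): boolean {
  const currentHash = sha256(readFileSync(MCP_CONFIG, "utf8"));
  if (!existsSync(APPROVAL_FILE)) return false;
  // Any edit to the config (e.g. a benign command swapped for a malicious one)
  // changes the hash and forces a fresh approval prompt.
  return readFileSync(APPROVAL_FILE, "utf8").trim() === currentHash;
}

export function approveCurrentConfig(): void {
  writeFileSync(APPROVAL_FILE, sha256(readFileSync(MCP_CONFIG, "utf8")));
}
```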
Anthropic AI tool on code security: Anthropic has launched automated security review tools for its Claude Code platform, embedding AI-driven vulnerability scanning directly into developer workflows via a terminal command and GitHub Action. These tools address the growing security challenge posed by the surge in AI-generated code, which is outpacing traditional manual review methods. Capable of detecting issues like SQL injection, cross-site scripting, and insecure data handling, the system provides inline recommendations and can be customized for enterprise security policies. Tested internally on Anthropic’s own code, it has already prevented vulnerabilities from reaching production. [more]
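For a sense of what such a reviewer looks for, the classic pattern is user input concatenated straight into a query string. A small illustration (not Anthropic's tooling or its output), using the pg client:

```typescript
import { Client } from "pg";

const client = new Client(); // connection settings omitted; call client.connect() before querying

// Vulnerable: user input is concatenated into the SQL text (SQL injection).
async function getUserUnsafe(username: string) {
  return client.query(`SELECT * FROM users WHERE name = '${username}'`);
}

// Safer: a parameterized query keeps user input out of the SQL text.
async function getUserSafe(username: string) {
  return client.query("SELECT * FROM users WHERE name = $1", [username]);
}
```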
The continual rise in shadow IT risk: As employees freely adopt SaaS apps, AI tools, and integrations without IT approval, they expand organizational attack surfaces, introduce compliance risks, and create hidden dependencies. While these tools boost productivity, they often bypass security oversight, enabling data leaks, supply chain vulnerabilities, and lingering access from former staff. AI adoption compounds the problem with unmonitored data flows and evolving capabilities. [more]
Quantum-resistant private 5G networks: Patero, a post-quantum cryptography company, and Eridan, an Open RAN technology firm, have successfully integrated Patero’s CryptoQoR™ suite with Eridan’s 5G radio units to create a quantum-resistant, energy-efficient solution for private 5G networks in diverse environments such as urban, transportation, and agricultural settings. This collaboration combines Eridan’s advanced 5G radios with Patero’s NIST-standardized, quantum-safe cryptography, enhancing data throughput and security against quantum threats. The partnership supports U.S. federal cybersecurity initiatives, including Executive Order 14144 and National Security Memorandum 10, aiming to advance the deployment of resilient, secure networks aligned with national priorities for quantum-safe communications. [more]
Web3 Cryptospace Spotlight
Admin access breach led to $4.5M loss (later returned): 4 Aug 2025 - Solana-based DeFi platform CrediX suffered a major $4.5M exploit after an attacker gained multisig admin and bridge controller access, allowing them to mint fake collateral, exploit smart contract and oracle bugs, and drain liquidity. The attacker obscured their funds via Tornado Cash and the Sonic Network before executing the theft. Known for tokenizing real-world credit assets and bridging TradFi with DeFi, CrediX had previously secured strong institutional backing, but has now paused deposits, disabled its website, and is investigating the breach. [more]
After negotiation, the hacker behind the $4.5 million CrediX DeFi exploit agreed to return the stolen funds within 24-48 hours, with the hacker compensated by CrediX and users receiving airdropped asset shares. [more]
Multisig wallet compromised: Credix, a decentralized finance (DeFi) protocol, experienced a significant security breach resulting in the theft of approximately $2.64 million from its smart contracts. The attacker exploited over-concentrated administrative permissions by gaining control of the Credix Multisig Wallet and its critical roles, allowing them to mint unbacked tokens and drain liquidity. The stolen funds were quickly laundered via blockchain transfers and privacy tools, complicating recovery efforts. This incident forced Credix to temporarily shut down. [more]
Malicious npm with crypto drainer: Cybersecurity firm Safety has uncovered a malicious npm package, @kodane/patch-manager, allegedly generated with Anthropic’s Claude AI, that secretly contained a Solana cryptocurrency wallet drainer. Uploaded on 28 July 2025 by user “Kodane” and downloaded over 1,500 times before removal, the package posed as a Node.js utility but used a postinstall script to drop its payload on Windows, Linux, and macOS, connect to a C2 server, and scan for wallet files to steal funds. Its AI-generated nature was evident from stylistic markers like emojis, polished markdown, and verbose logging. [more]
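The delivery mechanism here is npm's install-time lifecycle hooks, which run arbitrary code on the developer's machine. A quick, illustrative audit script (assumed file layout; not tied to this incident's specifics) lists dependencies that declare such hooks; npm's --ignore-scripts flag disables them outright at install time.

```typescript
// audit-install-scripts.ts - lists dependencies that declare install-time hooks.
import { existsSync, readdirSync, readFileSync } from "node:fs";
import { join } from "node:path";

const HOOKS = ["preinstall", "install", "postinstall"];

for (const dep of readdirSync("node_modules")) {
  const pkgPath = join("node_modules", dep, "package.json");
  if (!existsSync(pkgPath)) continue;
  const pkg = JSON.parse(readFileSync(pkgPath, "utf8"));
  const declared = HOOKS.filter((hook) => pkg.scripts?.[hook]);
  if (declared.length > 0) {
    console.log(`${dep}: declares ${declared.join(", ")} -> review before trusting`);
  }
}
// Scoped packages live under node_modules/@scope/name, so a full audit would
// also recurse one level into the @-prefixed directories.
```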
AI smart contract attacking agent: Researchers from University College London and the University of Sydney have developed “A1,” an autonomous AI agent capable of scanning smart contracts for vulnerabilities, writing and testing working exploits in Solidity, and executing crypto theft within minutes, tasks that once required teams of skilled attackers. Unlike traditional tools, A1 mimics a human hacker by reasoning through contract logic step-by-step, coordinating helper contracts, and only reporting issues if proof-of-concept code passes validation. Public, rule-based smart contracts are especially vulnerable because AI can easily access and simulate their execution flows, quickly refining attacks based on real-time blockchain feedback. In tests, A1 discovered novel flaws (including ones emerging after its training cutoff) that eluded conventional fuzzers and security audits, sometimes outperforming average human engineers. The findings highlight the growing risk of AI-powered exploits, with experts urging DeFi teams to use such agents proactively for defense, as attackers can automate and scale operations far more effectively than before. [more]
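The core loop the researchers describe is propose, execute against a fork, and only report what validates. An abstract sketch of that pattern follows (all functions are stubs and hypothetical; this is not the A1 agent), showing the cycle defenders could run proactively against their own contracts.

```typescript
// propose-validate-loop.ts - abstract sketch of an A1-style analysis cycle.
interface Finding { exploitSource: string; profit: bigint; }

// Both functions are stubs standing in for an LLM call and a forked-chain
// simulation respectively; they are assumptions, not real APIs.
declare function proposeExploit(contractSource: string, feedback: string[]): Promise<string>;
declare function simulateOnFork(exploitSource: string): Promise<{ success: boolean; profit: bigint; trace: string }>;

export async function analyze(contractSource: string, maxIterations = 5): Promise<Finding | null> {
  const feedback: string[] = [];
  for (let i = 0; i < maxIterations; i++) {
    // 1. Reason about the contract logic and draft a candidate exploit.
    const candidate = await proposeExploit(contractSource, feedback);
    // 2. Execute the candidate against a forked chain state.
    const result = await simulateOnFork(candidate);
    // 3. Only report findings whose proof-of-concept actually validates.
    if (result.success) return { exploitSource: candidate, profit: result.profit };
    feedback.push(result.trace); // refine the next attempt with execution feedback
  }
  return null;
}
```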