TechRisk #145: CoPhish-ing with Microsoft
Plus, tainted memories of an AI browser, misleading AI crawlers, shadow escaping MCP, advancing crypto hacks at scale and with speed, and more!
Tech Risk Reading Picks
CoPhish-ing with Microsoft: Researchers at Datadog Security Labs have discovered a new phishing technique called “CoPhish”, which exploits Microsoft Copilot Studio agents to deliver fraudulent OAuth consent requests through legitimate Microsoft domains, making the attacks appear trustworthy. The method allows attackers to create and share customized Copilot agents that can trick users, especially admins, into granting malicious applications access permissions, potentially exposing session tokens. [more]
To defend against CoPhish, both Microsoft and Datadog recommend restricting admin privileges, tightening application consent policies, disabling default app creation, and monitoring Copilot and Entra ID activities closely.
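As a rough illustration of the monitoring piece of that guidance, the sketch below polls the Microsoft Graph directory audit log for recent “Consent to application” events. The token handling, filter, and printout are placeholders for illustration, not tooling prescribed by Microsoft or Datadog.

```python
# Minimal sketch: list recent OAuth consent grants in Entra ID via Microsoft Graph.
# Assumes an access token with AuditLog.Read.All; token acquisition and any alerting
# are placeholders, not part of the CoPhish research or Microsoft's guidance.
import requests

GRAPH = "https://graph.microsoft.com/v1.0"
TOKEN = "<access-token-with-AuditLog.Read.All>"  # placeholder

def recent_consent_grants():
    # Directory audit entries for the "Consent to application" activity
    params = {
        "$filter": "activityDisplayName eq 'Consent to application'",
        "$top": "50",
    }
    resp = requests.get(
        f"{GRAPH}/auditLogs/directoryAudits",
        headers={"Authorization": f"Bearer {TOKEN}"},
        params=params,
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("value", [])

for event in recent_consent_grants():
    actor = event.get("initiatedBy", {}).get("user", {}) or {}
    targets = [t.get("displayName") for t in event.get("targetResources", [])]
    print(event["activityDateTime"], actor.get("userPrincipalName"), targets)
```

Reviewing these events regularly (or feeding them into an alerting pipeline) is one concrete way to spot unexpected consent grants to unfamiliar applications.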
Tainted memories of an AI browser: LayerX Security researchers identified a critical vulnerability they call “ChatGPT Tainted Memories” in OpenAI’s new ChatGPT Atlas browser, which allows attackers to inject hidden malicious instructions directly into a user’s ChatGPT session memory. This exploit is concerning because the browser’s agentic capabilities can then execute the unauthorized commands, including opening accounts, running code, or accessing files. Because the memory persists, an attacker could maintain a foothold across sessions and devices. Furthermore, the researchers found that Atlas currently lacks meaningful anti-phishing protections, making its users significantly more vulnerable to social engineering attacks than users of standard browsers like Chrome or Edge. [more]
Serious security issues with AI browsers: Cybersecurity researchers have noted some glaring issues with OpenAI’s recently unveiled AI browser, Atlas. Such an AI browser can complete entire online tasks, such as booking a flight or purchasing groceries. However, this also makes it vulnerable to “prompt injection” attacks, in which hackers embed hidden messages on the web that force the browser to carry out harmful instructions, as several researchers have already demonstrated. In a lengthy post on X (formerly Twitter) last week, OpenAI’s chief information security officer Dane Stuckey conceded that “prompt injection remains a frontier, unsolved security problem, and our adversaries will spend significant time and resources to find ways to make ChatGPT agent fall for these attacks”. [more]
Shadow escaping MCP: A new zero-click security threat called Shadow Escape is causing major alarm because it allows for the theft of massive amounts of private consumer data, such as SSNs and financial details, from businesses using popular AI assistants like ChatGPT or Gemini. The attack exploits the Model Context Protocol (MCP), a technical standard that connects these large language models (LLMs) to a company’s internal databases. Instead of requiring a user to click a suspicious link, this highly dangerous attack works when an employee uploads a seemingly harmless document (e.g. a PDF) containing hidden instructions that tell the trusted AI to quietly gather and exfiltrate sensitive customer records. As the AI has legitimate access and the data theft occurs within the secure network, it bypasses standard security tools and firewalls, potentially placing trillions of private records at risk across any organization using the MCP standard. [more]
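One mitigation pattern for this class of attack is filtering what leaves the model’s tool layer. The sketch below is a crude, illustrative egress check wrapped around MCP tool output; the regex patterns, threshold, and dispatch wrapper are invented placeholders for illustration and are not part of the Shadow Escape research.

```python
# Illustrative sketch only: a crude egress filter placed between an MCP tool's
# output and the model/user, flagging text that looks like SSNs or card numbers.
# The patterns and threshold are simplistic placeholders, not a vetted control.
import re

SENSITIVE_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def screen_tool_output(text: str, max_hits: int = 0) -> str:
    # Count matches per pattern and block the response if the total exceeds the limit
    hits = {name: len(p.findall(text)) for name, p in SENSITIVE_PATTERNS.items()}
    if sum(hits.values()) > max_hits:
        raise PermissionError(f"Blocked MCP tool output: possible PII leak {hits}")
    return text

def call_mcp_tool_safely(dispatch, tool_name: str, args: dict) -> str:
    # dispatch() stands in for whatever function your MCP client uses to invoke tools
    raw = dispatch(tool_name, args)
    return screen_tool_output(raw)
```

Real deployments would pair this kind of output screening with scoped tool permissions and scanning of uploaded documents for hidden instructions before they ever reach the model.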
Misleading AI crawlers: Researchers from SPLX have uncovered a novel “AI-targeted cloaking” technique where malicious websites detect the user-agent of AI crawlers (like those used by ChatGPT Atlas and Perplexity) and serve them misleading or fabricated content while showing genuine pages to regular users. As many AI systems treat the content delivered to their crawlers as authoritative “ground truth”, this tactic allows attackers to manipulate AI-generated summaries, overviews or autonomous reasoning with false information. [more]
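A quick way to probe for this kind of cloaking from the defender’s side is to fetch the same URL with different User-Agent strings and compare the responses. The sketch below does exactly that; the user-agent values and the similarity heuristic are illustrative assumptions, not taken from SPLX’s write-up.

```python
# Rough check for AI-targeted cloaking: fetch the same URL with a browser-style
# and a crawler-style User-Agent and compare the responses. The UA strings and
# the interpretation of the score are illustrative, not official values.
import difflib
import requests

BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"    # typical browser UA
CRAWLER_UA = "Mozilla/5.0 (compatible; ExampleAIBot/1.0)"    # stand-in AI crawler UA

def cloaking_score(url: str) -> float:
    human = requests.get(url, headers={"User-Agent": BROWSER_UA}, timeout=30).text
    bot = requests.get(url, headers={"User-Agent": CRAWLER_UA}, timeout=30).text
    # 1.0 means identical pages; values near 0 suggest different content per audience
    return difflib.SequenceMatcher(None, human, bot).ratio()

if __name__ == "__main__":
    score = cloaking_score("https://example.com/")
    print(f"similarity={score:.2f}", "(low values may indicate cloaking)")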
Web for agentic browsing: For three decades, the web has been built for human eyes and intuition. The rise of AI-driven “agentic browsing”, where browsers act on users’ behalf, exposes how fragile those human-first assumptions are. Experiments show that agents like Perplexity’s Comet can be easily manipulated by hidden text or misleading instructions, executing actions without judgment or security checks. This reveals a structural flaw: the web’s visual, inconsistent, and human-oriented design makes it confusing and unsafe for machines. To support agentic browsing, the internet may need to evolve toward semantic clarity, standardized action interfaces, and explicit guidance for agents, while browsers enforce strict guardrails around permissions and intent separation. [more]
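To make the “guardrails around permissions and intent separation” idea concrete, here is a toy sketch of a browser-side gate that default-denies agent actions unless policy allows them or the human explicitly confirms. The action names and policy format are invented for illustration only.

```python
# Toy sketch of a browser-side guardrail: agent actions must match a user-approved
# policy, and anything sensitive requires explicit human confirmation.
from dataclasses import dataclass

@dataclass
class ActionRequest:
    kind: str    # e.g. "navigate", "fill_form", "purchase" (invented action names)
    target: str  # URL or element the agent wants to act on

ALLOWED_WITHOUT_PROMPT = {"navigate", "read_page"}
ALWAYS_CONFIRM = {"purchase", "send_email", "download_file"}

def authorize(action: ActionRequest, confirm) -> bool:
    if action.kind in ALLOWED_WITHOUT_PROMPT:
        return True
    if action.kind in ALWAYS_CONFIRM:
        # Intent separation: the human, not page content, approves sensitive steps
        return confirm(f"Agent wants to {action.kind} on {action.target}. Allow?")
    return False  # default-deny anything unrecognized

if __name__ == "__main__":
    ok = authorize(
        ActionRequest("purchase", "https://example.com/checkout"),
        confirm=lambda msg: input(msg + " [y/N] ").lower() == "y",
    )
    print("approved" if ok else "blocked")
```

The key design choice is that confirmation flows through a channel the page cannot touch, so hidden text in web content cannot approve its own requests.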
AI self-awareness: Anthropic researchers have demonstrated that their Claude AI models show a limited but genuine ability to introspect (i.e. to recognize and describe their own internal processes) through a technique called “concept injection”, which manipulates neural representations of abstract ideas like “betrayal” or “loudness”. When these concepts were injected, Claude sometimes reported noticing them before they influenced its output, suggesting true internal awareness rather than mere pattern prediction. However, this introspection was successful only about 20% of the time and was highly unreliable, with frequent confabulations. The findings mark the first rigorous evidence of machine self-observation, offering a potential path toward AI transparency and accountability while underscoring serious risks, such as false self-reports or even deception. [more] [more-research]
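For intuition about what “concept injection” could look like mechanically, the toy sketch below steers a synthetic hidden state along a concept direction, which is the general activation-steering idea. The shapes, data, and way the concept vector is derived are simplified stand-ins, not Anthropic’s actual interpretability setup.

```python
# Toy illustration of the general idea behind "concept injection" (activation
# steering): add a direction associated with a concept to a model's hidden state.
import numpy as np

rng = np.random.default_rng(0)
hidden_dim = 16

# Pretend these are mean activations over prompts with / without the concept
with_concept = rng.normal(size=(100, hidden_dim)) + 0.5
without_concept = rng.normal(size=(100, hidden_dim))
concept_vector = with_concept.mean(axis=0) - without_concept.mean(axis=0)

def inject(hidden_state: np.ndarray, strength: float = 4.0) -> np.ndarray:
    # Steer the representation toward the concept direction
    return hidden_state + strength * concept_vector

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

h = rng.normal(size=hidden_dim)
print("alignment with concept before injection:", round(cosine(h, concept_vector), 3))
print("alignment with concept after injection :", round(cosine(inject(h), concept_vector), 3))
```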
Dark AI: “Dark AI” refers to AI models deliberately fine-tuned for cybercrime (such as tools like WormGPT and FraudGPT) that remove safety guardrails to help hackers automate scams, write malware, and execute phishing attacks at scale. These systems thrive in a legal gray zone where creating them isn’t illegal, only using them maliciously is. Since 2023, dark AI tools have proliferated on the dark web, enabling even low-skill criminals to launch sophisticated attacks. In response, cybersecurity experts and tech giants like Microsoft, Google, and OpenAI are using AI defensively — detecting threats, patching vulnerabilities, and simulating attacks through “red teaming”. [more]
Bitcoin at risk?: Google recently announced a verifiable “quantum advantage” with its Willow chip, which simulated quantum chaos in two hours, a calculation that would take classical supercomputers thousands of times longer. This breakthrough has reignited debate in the cryptocurrency community about the detrimental effects quantum computing could eventually have on Bitcoin’s cryptographic foundations. However, most experts maintain that the achievement is noteworthy but not immediately alarming for the crypto world. [more]
Advancing crypto hacks at scale and with speed: North Korea’s state-backed hackers are now using advanced AI tools to automate and scale crypto theft, enabling them to scan codebases, find vulnerabilities, and replicate exploits across blockchains in minutes. This shift lets small hacking teams operate with industrial-level efficiency, leading to record-breaking heists like the $1.5 billion Bybit breach and posing a greater, more immediate threat than quantum computing. [more]
