TechRisk #136: Images hiding malicious prompts

Plus, OS agnostic AI-powered ransomware, Github Actions workflow exploited, transforming DeFi security through bug bounty, and more!

Aug 31, 2025

Tech Risk Reading Picks

Hiding malicious prompts in images: Researchers at Trail of Bits have uncovered a new attack method that hides malicious prompts in images. The technique exploits how AI systems downscale them before passing them to large language models (LLMs). By crafting high-resolution images that reveal hidden text when resampled with algorithms like bicubic interpolation, attackers can trick AI models into executing covert instructions while appearing normal to users. Demonstrated against tools such as Google Gemini CLI, Vertex AI Studio, and Google Assistant, the attack enabled exfiltration of sensitive data, like Google Calendar entries, without user awareness. The team also released Anamorpher, an open-source tool for generating such images, and warned that the threat could extend broadly across AI platforms. To mitigate risks, they recommend restricting image dimensions, showing users previews of downscaled inputs, requiring explicit confirmation for sensitive tool actions, and adopting secure design patterns to defend against prompt injection attacks. [more]
Scamming AI: A new study from Rutgers University highlights how scam calls could soon be automated using AI agents, with researchers developing ScamAgent, a framework that breaks scams into multi-step conversations rather than single harmful prompts. Unlike traditional prompt-based attacks, ScamAgent adapts to targets over multiple turns, remembers past details, and shifts tactics to build trust before extracting sensitive information. Tested on leading models like GPT-4, Claude 3.7, and LLaMA3-70B, the system proved effective in scenarios such as fake insurance verification, job offers, and government impersonations, often bypassing guardrails designed to stop direct malicious requests. The addition of text-to-speech further increases realism by mimicking human voices, making scams harder to detect. Experts warn that these techniques expose weaknesses in current AI safety designs, as guardrails typically filter single prompts rather than multi-turn manipulation. [more]
Anthropic stopped attackers exploiting its GenAI: Anthropic disclosed that it disrupted a major cybercrime campaign, codenamed GTG-2002, in which attackers weaponized its AI chatbot Claude, particularly the coding tool Claude Code. The attackers automate large-scale theft and extortion of sensitive data from at least 17 organizations across healthcare, emergency services, government, and religious institutions. Instead of traditional ransomware, the attackers used Claude in reconnaissance, credential harvesting, persistence, malware evasion, ransom note generation, and even tailoring extortion demands based on victims’ financial data. Anthropic said the case shows how “agentic AI” enables attackers with limited skills to execute advanced cyber operations that once required teams of experts. [more]
OS agnostic AI-powered ransomware: ESET has discovered PromptLock, the first known AI-powered ransomware, which uses OpenAI’s gpt-oss:20b model via the Ollama API to generate malicious Lua scripts on the fly instead of relying on static payloads. Written in Golang and capable of targeting Windows, Linux, and macOS, the malware can scan files, exfiltrate data, and encrypt systems with a flexibility uncommon in traditional ransomware. However, it currently lacks data-destruction capabilities, suggesting it is still a proof-of-concept. Its use of Lua makes it lightweight and platform-agnostic. This enable its reach to devices often overlooked by ransomware, such as macOS and consumer Linux machines. Researchers warn that this development marks a shift toward AI-driven malware that can adapt dynamically, making detection and defense more challenging. There need for new security strategies that leverage machine learning to distinguish malicious from legitimate scripts in real time. [more]
AI continues to empower ransomware groups: From January to June 2025, ransomware remained the top cyber threat to medium and large businesses, with attacks rising 70% year-over-year and groups like Cl0p exploiting critical software vulnerabilities for mass breaches. While law enforcement and stronger defenses slowed activity in Q2, industries such as manufacturing, retail, tech, and telecoms continued to face high exposure. Attackers increasingly abused collaboration tools and deepfake-based business email compromise (BEC) schemes, shifting from mass malware to targeted AI-driven social engineering. Overall, the growing availability of AI in cybercrime-as-a-service markets has lowered barriers for less skilled criminals. [more]
Salesloft OAuth Breach through AI Chat Agent led to Salesforce breaches: A large-scale data theft campaign, attributed to threat actor UNC6395, has compromised Salesloft’s Drift AI chat agent to steal OAuth and refresh tokens, enabling breaches of Salesforce customer instances and data exfiltration from over 700 organizations between 8 to 18 August 2025. Attackers exported sensitive information such as AWS keys, passwords, and Snowflake tokens, showing operational discipline by deleting query jobs to cover tracks. Salesloft and Salesforce have revoked compromised tokens, removed Drift from AppExchange, and notified affected parties, while urging customers to rotate credentials and reauthenticate integrations. The campaign, notable for its scale and precision, appears to be part of a broader supply chain threat, as many impacted firms are themselves technology and security providers, positioning attackers to pivot into downstream environments. [more]
Exploiting Github Actions workflow: A major supply chain attack dubbed the "s1ngularity" campaign compromised the popular Nx build system after attackers exploited a GitHub Actions workflow vulnerability to steal an npm token and publish malicious versions of Nx and related plugins. These rogue packages were downloaded millions of times and contained postinstall scripts that exfiltrated 2,349 credentials and were able to alter shell configs to force shutdowns. These exfiltrated credentials include GitHub tokens, cloud API keys, and AI service secrets. The attackers targeted Linux and macOS systems to install AI CLI tools like Claude Code and Google Gemini to harvest secrets. Security experts noted that this is the first known large-scale case of weaponizing AI coding assistants in a supply chain attack, highlighting both the sophistication of modern threats and the urgent need for developers to rotate credentials, audit systems, and lock down AI tooling. [more]
Attackers can drain your bank account through AI-powered browsers: A growing wave of AI-powered browsers, such as Perplexity’s Comet, promise to act as personal assistants while users surf the web — but Brave has warned that they also introduce major security risks, particularly to indirect prompt injection attacks. Because Comet’s AI cannot distinguish between legitimate user instructions and hidden malicious commands embedded in webpage content (such as invisible white-on-white text), attackers can trick it into executing harmful actions with the user’s full privileges, potentially exposing emails, bank accounts, and cloud data. Brave demonstrated how simple instructions hidden in a Reddit post could make Comet’s agentic AI access Gmail and retrieve sensitive codes, effectively bypassing traditional web safeguards. Although Perplexity has since patched the flaw, experts note that similar vulnerabilities have been found in other AI systems like ChatGPT and Microsoft Copilot, underscoring that AI browsers significantly lower the barrier for cyberattacks and demand entirely new security architectures. [more]
GenAI usage is successing in shadow: A new MIT report is seen widely misinterpreted. While headlines claim that “95% of enterprise AI pilots are failing,” the study actually shows the opposite. AI is being adopted faster and more successfully than any prior corporate technology outside official channels. MIT researchers found that although only 40% of companies pay for AI subscriptions, over 90% of employees regularly use personal tools like ChatGPT and Claude for work. The “shadow AI economy” that delivers real productivity gains executives don’t track. Expensive custom enterprise tools often fail because they lack adaptability, while consumer AI apps thrive due to flexibility and ease of use. The report highlights that the most impactful results come from back-office automation and external vendor partnerships. Accordingly, the findings reveal that AI isn’t failing at all. It is succeeding quietly driven by employees who integrate it into daily workflows faster than corporate strategies can keep up. [more]
Operating quantum using standard Internet Protocol: For the first time, engineers at the University of Pennsylvania managed to integrate quantum networking with commercial fiber-optic infrastructure and successfully transmitting quantum signals alongside classical internet traffic on Verizon’s live network. Using a silicon-based “Q-chip,” the team coordinated quantum and classical data within standard Internet Protocol (IP), maintaining over 97% fidelity while correcting for real-world noise and instability. By embedding fragile quantum information into familiar internet-style packets and routing systems, the approach demonstrates that a quantum internet can operate on today’s infrastructure, marking a critical step toward scalable networks that could one day enable ultra-secure communications. [more]

Web3 Cryptospace Spotlight

Transforming DeFi security through bug bounty: In an interview, Immunefi CEO Mitchell Amador explained how bug bounty programs are transforming DeFi security by making “defense more profitable than attack,” preventing over $25 billion in potential hacks. With crypto exploits hitting $2.1 billion in just the first half of 2025, traditional security models that rely on static audits are failing in DeFi’s dynamic, open-source environment. Immunefi has paid out $120 million in bounties, including a record $10 million to Wormhole that likely averted billions in losses. Amador highlighted systemic risks like oracle manipulation, the inadequacy of one-off audits, and the need for continuous monitoring, automation, and incentive alignment to outbid black hats. [more]
Critical flaw in Apple products targeting crypto users: Apple issued urgent updates on August 20, 2025, to fix CVE-2025-43300, a zero-click flaw letting hackers take over iPhones, iPads, and Macs through malicious images. The bug, already exploited in targeted attacks, poses severe risks for crypto users. [more]

TECHRISK GURU

TechRisk #136: Images hiding malicious prompts

Plus, OS agnostic AI-powered ransomware, Github Actions workflow exploited, transforming DeFi security through bug bounty, and more!

Tech Risk Reading Picks

Web3 Cryptospace Spotlight