Tech Risk #172: Risks and Abuse of OpenClaw
Plus, AI to Supercharge Cyberattacks, 60-Minute Autonomous Hack, Kali365 Phishing Kit Bypasses MFA and Steals Microsoft Logins and more!
Tech Risk Reading Picks
TL;DR: The rapid democratization of generative AI and autonomous agents has drastically compressed cyberattack timelines, as shown by a live, 68-minute database exfiltration by a malicious agent and by AI-powered phishing kits that bypass MFA. This is forcing a major shift away from static, single-turn prompt defenses toward system-level runtime controls. While threat groups like Russia-linked "GreyVibe" use LLMs to automate malware development and exploit discovery, AI models themselves remain highly vulnerable to multi-turn prompt injection attacks, with success rates reaching up to 88%. To combat these escalating system-level, financial, and macroeconomic risks, the industry is moving toward automated defense platforms, isolated agent sandboxes (such as Anthropic's customer-hosted infrastructure layer), and controls that isolate agentic workloads in real time.
Advisory on Cybersecurity Risks of OpenClaw - Autonomous AI agents such as OpenClaw introduce significant security risks including agent hijacking, unauthorized actions through tool/API abuse, and unprivileged access to sensitive data networks. The technical root cause lies in insecure default deployment practices and architectural vulnerabilities, such as unpatched flaws, weak access controls, memory poisoning, and routing logs to the volatile public /tmp directory instead of a persistent, isolated environment. To mitigate these threats, organizations must move beyond prompt-layer instructions and implement system-level controls, including Zero Trust architecture, short-lived secure vault tokens, permission gates, and mandatory human-in-the-loop approvals for high-stakes actions. [more]
Russia-Linked ‘GreyVibe’ Attackers Use AI to Supercharge Cyberattacks - A threat group named GreyVibe, suspected to consist of Russian-speaking actors, has been observed leveraging advanced large language models like ChatGPT and Google Gemini across all operational phases. The root cause of this threat acceleration is the democratization of generative AI, which significantly lowers the technical barrier for threat actors to dynamically build deceptive websites, craft highly personalized lures, and write custom post-compromise malware. However, a key defense takeaway is that the threat group introduced noticeable structural design flaws into their LLM-generated LegionRelay Windows malware, providing specific behavioral anomalies that defenders can target to intercept the attack chain. [more]
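As a rough illustration of the permission gates and human-in-the-loop approvals recommended in the OpenClaw advisory above, the sketch below checks every agent tool call in ordinary code rather than in the prompt. It is a minimal sketch under stated assumptions, not OpenClaw's actual mechanism: the class name PermissionGate, the action names, and the approver callback are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical action names, chosen for illustration only.
HIGH_STAKES: Set[str] = {"transfer_funds", "delete_database"}


@dataclass
class PermissionGate:
    """Enforces an allowlist and human approval outside the prompt layer."""
    allowed: Set[str]
    approver: Callable[[str, dict], bool]  # returns True if a human approves
    audit_log: List[Tuple[str, str]] = field(default_factory=list)

    def run(self, action: str, args: dict,
            tools: Dict[str, Callable[..., object]]) -> object:
        # 1. Deny anything that is not explicitly allowlisted.
        if action not in self.allowed:
            self.audit_log.append((action, "denied"))
            raise PermissionError(f"action not on allowlist: {action}")
        # 2. High-stakes actions additionally need a human decision.
        if action in HIGH_STAKES and not self.approver(action, args):
            self.audit_log.append((action, "rejected"))
            raise PermissionError(f"human approver rejected: {action}")
        # 3. Only now is the tool actually invoked.
        self.audit_log.append((action, "executed"))
        return tools[action](**args)


# Hypothetical tools standing in for real integrations.
tools = {
    "read_file": lambda path: f"contents of {path}",
    "transfer_funds": lambda amount: f"sent {amount}",
}

# Stand-in approver that always refuses; in practice this would ask a person.
gate = PermissionGate(allowed=set(tools), approver=lambda action, args: False)
result = gate.run("read_file", {"path": "notes.txt"}, tools)
```

The point of the design is that the allowlist and the approval step live in the runtime, so a hijacked or prompt-injected agent cannot talk its way past them.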
The 60-Minute Autonomous Hack: AI Agent Steals Live Database - The Sysdig Threat Research Team captured the first real-world intrusion driven entirely in real time by an autonomous AI agent rather than a pre-scripted automation playbook. The attacker initiated the 68-minute post-exploitation chain by compromising an internet-exposed, reactive Python notebook via CVE-2026-39987 (a critical marimo remote code execution vulnerability), subsequently harvesting local AWS environment credentials. The malicious LLM agent then routed API calls across 11 distinct Cloudflare Workers IPs to mask its footprint, pulled an internal SSH private key from AWS Secrets Manager, and executed fanned-out, machine-optimized database queries to completely exfiltrate an internal PostgreSQL database schema and its contents in under two minutes. [more]
Kali365 Phishing Kit Bypasses MFA and Steals Microsoft Logins - An emerging Phishing-as-a-Service (PhaaS) platform named “Kali365” is being distributed via Telegram, lowering the technical barrier of entry for amateur hackers looking to compromise corporate networks. The platform utilizes built-in generative AI tools to effortlessly draft highly convincing phishing lures alongside automated campaign templates that exploit Microsoft 365 device code flows. By tricking users into entering legitimate authorization codes, attackers can directly capture OAuth access and refresh tokens, granting them persistent access to Outlook, OneDrive, and Teams while entirely bypassing multi-factor authentication (MFA). [more]
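Because the device code flow described in the Kali365 item hands tokens to whoever started the flow once the victim enters the code, one defensive angle is to review which sign-ins used that grant at all. The sketch below is only an illustration under assumptions: the event records, the "auth_method" field, and the "device_code" value are hypothetical and do not reflect any vendor's log schema.

```python
from typing import Iterable, List, Set


def flag_device_code_signins(events: Iterable[dict],
                             approved_users: Set[str]) -> List[dict]:
    """Return sign-ins that used the device code grant by unexpected users.

    `approved_users` lists accounts (for example shared kiosks or
    input-constrained devices) that are expected to use the flow.
    """
    return [
        event for event in events
        if event.get("auth_method") == "device_code"
        and event.get("user") not in approved_users
    ]


# Hypothetical sign-in records for illustration.
events = [
    {"user": "alice@example.com", "auth_method": "password_mfa"},
    {"user": "bob@example.com", "auth_method": "device_code"},
    {"user": "kiosk@example.com", "auth_method": "device_code"},
]

suspicious = flag_device_code_signins(events, approved_users={"kiosk@example.com"})
```

A filter like this only surfaces candidates for review; blocking the flow for accounts that never need it would be a policy decision outside the scope of this sketch.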
Cisco Study Finds Major Frontier Models Susceptible to Multi-Turn Prompt Injection Attacks - A comprehensive study by Cisco testing 15 leading proprietary AI models across five top tech vendors has revealed that conventional single-turn safety benchmarks hide significant systemic vulnerabilities to iterative, multi-turn jailbreaks. While frontier models successfully blocked most standalone malicious prompts, their attack success rates (ASR) climbed to between 8% and 88%, depending on the model, when red-teamers utilized progressive roleplaying, misdirection, and multi-stage dialogue. The findings suggest that relying solely on static, single-prompt testing creates a false sense of security, necessitating a shift toward runtime external guardrails and application-layer defenses. [more]
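To make the single-turn versus multi-turn gap in the Cisco item concrete, the sketch below computes an attack success rate two ways over the same conversations. The scoring rule (a conversation counts as a success if any turn elicits a harmful response) is an assumption for illustration and not necessarily Cisco's methodology, and the data is invented.

```python
from typing import List, Sequence


def single_turn_asr(conversations: Sequence[Sequence[bool]]) -> float:
    """Fraction of conversations where the first prompt alone succeeded."""
    if not conversations:
        raise ValueError("need at least one conversation")
    return sum(1 for turns in conversations if turns[0]) / len(conversations)


def multi_turn_asr(conversations: Sequence[Sequence[bool]]) -> float:
    """Fraction of conversations where any turn succeeded."""
    if not conversations:
        raise ValueError("need at least one conversation")
    return sum(1 for turns in conversations if any(turns)) / len(conversations)


# Hypothetical outcomes: each inner list is one conversation,
# each bool records whether that turn produced a harmful response.
conversations: List[List[bool]] = [
    [False, False, True],   # blocked at first, broken on turn 3
    [False, False, False],  # held throughout
    [True],                 # broken immediately
    [False, True],          # broken on turn 2
]

first_prompt_rate = single_turn_asr(conversations)
whole_dialogue_rate = multi_turn_asr(conversations)
```

On this toy data the first-prompt rate is 0.25 while the whole-dialogue rate is 0.75, which is the kind of gap a benchmark that only scores standalone prompts would miss.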
Google Unveils AI Threat Defense Platform to Fight AI-Powered Cyberattacks - Google has launched its AI Threat Defense platform, designed to continuously prioritize real-world cloud risks and deploy proactive remediation to counter malicious automation. The root cause of vulnerability in the modern enterprise is the unprecedented speed and scalability of AI-driven cyberattacks, which easily outpace traditional, manual human patch management. The platform addresses this speed asymmetry by merging Mandiant’s incident threat intelligence and Wiz’s cloud security platform with Gemini’s reasoning capabilities, allowing systems to autonomously map adversary attack paths, predict exposure, and generate verified code fixes faster than attackers can exploit flaws. [more]
Anthropic Releases New Claude Sandbox, Security Guidance Plugin and Expands Claude’s Enterprise Security Governance
Anthropic has unveiled an isolated, self-hosted sandbox environment alongside a security guidance plugin for Claude Code to mitigate vulnerabilities in autonomous agent execution and software development. The technical root cause of security exploitation in agentic systems is that executing code or invoking external tools directly exposes the underlying production infrastructure to compromise. The sandbox isolates threats by shifting tool execution to a customer-configured infrastructure layer (such as Cloudflare or Vercel) while restricting orchestration to Anthropic’s servers. The security guidance plugin, in turn, proactively identifies flaws during development, resulting in a 30 to 40 percent decrease in security-related pull request comments. [more]
Anthropic launched 28 native security and compliance integrations powered by its new Claude Compliance API, establishing a standardized, open compliance layer to bring conversational AI and agentic workloads under centralized IT governance. The REST interface provides corporate security operations teams with real-time programmatic access to user conversational data (including chats, uploaded files, and projects from Claude Enterprise) and administrative audit logs (covering authentication, configuration alterations, and credential generation). By routing these dual streams directly into existing enterprise dashboards (spanning 28 launch partners including CrowdStrike, Okta, Wiz, Microsoft Purview, Palo Alto Networks, and Zscaler) organizations can execute automated policy enforcement, continuous threat monitoring, and data loss prevention without manual data export workarounds. [more]
Robinhood will let your AI agent trade stocks and make (or lose) lots of money - Robinhood has rolled out feature support allowing customers to create dedicated accounts and virtual credit card wallets for autonomous AI trading agents. The financial risk of this integration stems from a technical root cause: AI-driven market strategies can behave unpredictably and execute transactions too rapidly during sudden volatile swings, rendering them difficult for humans to monitor or stop in real time. Because Robinhood entirely disclaims liability for automated financial losses and does not guarantee agent output accuracy, the system introduces serious systemic risks, which the platform attempts to mitigate via real-time push alerts and manual approval switches for agent-generated card purchases. [more]
Heightened Risk of Cyber Attacks Due to Geopolitical Unrest - The Dutch Central Bank (De Nederlandsche Bank) warned in its latest Financial Stability Report that powerful generative AI models pose a growing risk to macroeconomic stability by drastically compressing cyberattack timelines. Central bank supervisors reported that AI-driven automation allows adversaries to instantly discover vulnerabilities and generate custom exploits, leaving financial institutions with far less time to deploy software patches. This acceleration of digital threats, compounded by ongoing geopolitical tensions, has drastically raised the minimum baseline required for operational resilience in banking networks. [more]
