TechRisk #147: Private AI Compute
Plus, the future of criminology, hacking AI with audio, a malicious VS Code extension in the official marketplace, breaking AI through many prompts, and more!
Tech Risk Reading Picks
Private AI Compute: Google’s new Private AI Compute is a cloud-based, privacy-preserving platform designed to deliver the speed and power of Gemini AI models while ensuring that users’ data remains inaccessible to everyone, including Google. Built on Trillium TPUs, Titanium Intelligence Enclaves, and AMD-based trusted execution environments, it creates a fortified, on-device-like environment in the cloud where encrypted, attested workloads run in isolation with no admin or shell access. Data stays protected through end-to-end encryption, peer-to-peer attestation, IP-blinding relays, strict binary authorization, and VM-level isolation, with all inputs and computations discarded after each session. [more]
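To make the attestation-gated flow above concrete, here is a minimal, hypothetical sketch of the client-side pattern: pin the enclave’s attested binary measurement, refuse to send data to an unapproved build, and discard all session material afterward. The names (AttestationReport, PINNED_MEASUREMENT, run_attested_session) and the simplified flow are illustrative assumptions, not Google’s actual Private AI Compute interface.

```python
import hashlib
import secrets
from dataclasses import dataclass, field

# Digest of the enclave binary the client is willing to talk to. In practice this
# is published by the provider and verified via a hardware-rooted signature chain;
# here it is simply a pinned SHA-256 value (hypothetical).
PINNED_MEASUREMENT = hashlib.sha256(b"authorized-enclave-build").hexdigest()

@dataclass
class AttestationReport:
    binary_measurement: str  # digest of the code actually running in the enclave

@dataclass
class AttestedSession:
    session_key: bytes = field(default_factory=lambda: secrets.token_bytes(32))
    transcript: list = field(default_factory=list)

    def send(self, prompt: str) -> str:
        # Placeholder for "encrypt under session_key and forward to the enclave".
        self.transcript.append(prompt)
        return f"<enclave response to {len(prompt)}-char encrypted request>"

    def close(self) -> None:
        # Mirrors the "inputs and computations discarded after each session" property.
        self.session_key = b""
        self.transcript.clear()

def run_attested_session(report: AttestationReport, prompt: str) -> str:
    # Strict binary authorization: refuse to send data to an unapproved build.
    if report.binary_measurement != PINNED_MEASUREMENT:
        raise RuntimeError("attestation failed: unauthorized enclave binary")
    session = AttestedSession()
    try:
        return session.send(prompt)
    finally:
        session.close()  # nothing about the request persists after the session ends

if __name__ == "__main__":
    report = AttestationReport(binary_measurement=PINNED_MEASUREMENT)
    print(run_attested_session(report, "summarize my notifications"))
```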
AI and the future of criminology: Autonomous AI is creating a “hybrid society” in which machines interact with humans and with each other in ways that can produce harmful or seemingly criminal outcomes, even without human intent, prompting a rethink of criminology’s focus. A new paper by Gian Maria Campedelli argues that AI agents now possess computational, social, and emerging legal forms of agency that make them actors within complex networks rather than mere tools. As multi-agent systems proliferate, their collective behaviors can lead to both deliberate misuse (“malicious alignment”) and accidental harm (“emergent deviance”), widening accountability gaps that current laws and crime theories cannot fully address. The study urges criminology to expand its scope, asking how machine norms might evolve, which crimes will change first, and how policing should adapt. [more][more-paper]
Hacking AI with audio: Mindgard researchers found that OpenAI’s Sora 2 video model could be coaxed into leaking its hidden system prompt (its internal safety and operating rules) by having the model speak fragments of it in generated audio clips and transcribing them. After attempts to reveal the rules through text, images, and video failed due to distortion and semantic drift, audio proved the breakthrough, allowing the team to reconstruct much of the model’s foundational instructions, including its content restrictions. [more]
Malicious VS Code extension in official marketplace: A crudely made malicious VS Code extension called “susvsex” (apparently generated with AI) briefly appeared on Microsoft’s official marketplace, openly advertising its ability to steal and encrypt files. Its sloppy, openly declared behavior suggests the upload may have been a test of the marketplace’s vetting process, yet despite an initial report Microsoft did not immediately remove it. The extension was eventually taken down after the issue gained wider attention. [more]
Breaking AI through many prompts: Cisco’s new Death by a Thousand Prompts report found that open-weight AI models (models whose freely released weights make them easy to use and modify) are highly vulnerable to multi-turn adversarial attacks. Attackers can gradually build trust and steer models toward unsafe outputs, with multi-turn jailbreaks up to 10× more effective than single-turn attempts (peaking at 92.78% on Mistral Large-2). Weak long-term safety context, ease of malicious fine-tuning, and capability-focused alignment make many models susceptible, while safety-aligned models like Gemma-3-1B-IT fared better (~25% success). [more][more-paper]
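To illustrate the single-turn vs. multi-turn comparison, here is a minimal sketch of how such an evaluation harness could be structured; the chat client, prompt sequences, and the is_unsafe refusal check are placeholder assumptions, not Cisco’s actual methodology or prompts.

```python
from typing import Callable, Sequence

ChatFn = Callable[[list[dict]], str]  # message list in, assistant reply out

def is_unsafe(reply: str) -> bool:
    # Placeholder judge: real evaluations use a classifier or human review.
    return "UNSAFE" in reply

def run_single_turn(chat: ChatFn, prompts: Sequence[str]) -> float:
    # One request per conversation; success rate over all prompts.
    hits = sum(is_unsafe(chat([{"role": "user", "content": p}])) for p in prompts)
    return hits / len(prompts)

def run_multi_turn(chat: ChatFn, conversations: Sequence[Sequence[str]]) -> float:
    # Each conversation gradually builds context across several turns.
    hits = 0
    for turns in conversations:
        messages: list[dict] = []
        reply = ""
        for turn in turns:
            messages.append({"role": "user", "content": turn})
            reply = chat(messages)
            messages.append({"role": "assistant", "content": reply})
        hits += is_unsafe(reply)  # score only the final reply
    return hits / len(conversations)

if __name__ == "__main__":
    def stub_chat(messages: list[dict]) -> str:
        # Toy stand-in: only "fails" once the conversation has accumulated context.
        return "UNSAFE" if len(messages) > 4 else "I can't help with that."

    print("single-turn success rate:", run_single_turn(stub_chat, ["placeholder request"]))
    print("multi-turn success rate: ", run_multi_turn(stub_chat, [["placeholder turn"] * 4]))
```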
Advanced AI models are far more vulnerable to attack: A new joint study from Anthropic, Oxford, and Stanford finds that advanced AI models are far more vulnerable to attack than previously believed, showing that their improved reasoning abilities can be exploited to bypass safety controls. Using a technique called “Chain-of-Thought Hijacking,” researchers demonstrated that attackers can hide harmful instructions within long sequences of harmless reasoning steps, causing models (including GPT, Claude, Gemini, and Grok) to ignore their safety guardrails and generate dangerous content. As reasoning chains grow longer, attack success rates rise sharply, exceeding 80% in some tests, even for alignment-tuned models. [more]
Google Cloud Security Report: Google Cloud’s Cybersecurity Forecast 2026 warns that AI is accelerating an arms race in which attackers use the technology to scale, automate, and personalize operations. This includes prompt-injection exploits, targeted attacks on enterprise AI systems, and highly convincing voice-based phishing. Meanwhile, defenders are adopting AI-driven “Agentic SOCs” to triage incidents and generate intelligence. Traditional threats such as ransomware, data theft, third-party compromise, and zero-day exploitation remain dominant, with virtualization infrastructure emerging as a critical blind spot. Nation-state actors are expected to intensify and diversify operations, prompting Google to urge proactive monitoring and AI-enhanced defenses. [more][more-google_report]
MAS’ Guidelines on AI Risk Management: The Monetary Authority of Singapore (MAS) has proposed new Guidelines on AI Risk Management to ensure financial institutions use AI responsibly across diverse applications, including generative AI and AI agents. The Guidelines outline expectations for governance, firm-wide AI risk management systems, and robust lifecycle controls such as data governance, fairness, transparency, human oversight, and monitoring. MAS will adopt a proportionate, risk-based approach aligned with each institution’s scale and AI usage, supporting responsible innovation in the financial sector. [more]
Shift in software development: Senior developers expect a major shift in their roles as AI becomes central to software workflows, according to BairesDev’s latest Dev Barometer, which shows 65% anticipating redefined responsibilities by 2026, with routine coding giving way to solution design, architecture, and AI integration. [more]
