{"id":4081,"date":"2026-05-18T13:09:54","date_gmt":"2026-05-18T13:09:54","guid":{"rendered":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/2026\/05\/18\/coding-agent-horror-stories-the-security-crisis-threatening-developer-infrastructure\/"},"modified":"2026-05-18T13:09:54","modified_gmt":"2026-05-18T13:09:54","slug":"coding-agent-horror-stories-the-security-crisis-threatening-developer-infrastructure","status":"publish","type":"post","link":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/2026\/05\/18\/coding-agent-horror-stories-the-security-crisis-threatening-developer-infrastructure\/","title":{"rendered":"Coding Agent Horror Stories: The Security Crisis Threatening Developer Infrastructure"},"content":{"rendered":"<p>This is issue 1 of a new series called Coding Agent Horror Stories where we examine critical security failures in the AI coding agent ecosystem and how Docker Sandboxes provide enterprise-grade protection against these threats.<\/p>\n<p>AI coding agents are everywhere. According to Anthropic\u2019s <a href=\"https:\/\/resources.anthropic.com\/hubfs\/2026%20Agentic%20Coding%20Trends%20Report.pdf\" rel=\"nofollow noopener\" target=\"_blank\">2026 Agentic Coding Trends Report,<\/a> developers are now using AI in roughly 60% of their work. The report describes a shift from single agents to coordinated teams of agents, with tasks that took hours or days getting compressed into minutes. Walk into almost any engineering team in 2026 and you\u2019ll find AI coding agents sitting somewhere in the workflow, usually in more than one place.<\/p>\n<p>The productivity story is real, and if you\u2019ve watched an agent ship a feature in an afternoon that would have taken your team a sprint, you already know why. But the same agents that ship features in an afternoon can also delete your home directory in a few seconds. 
The same loop that lets an agent autonomously refactor a 12-million-line codebase will, given the wrong context, autonomously drop your production database.\u00a0<\/p>\n<p>Over the past sixteen months, these aren\u2019t hypothetical failure modes, they\u2019re documented incidents with named victims, screenshotted agent outputs, and in several cases, public apologies from the vendors. This issue is the first in a new series mapping how those failures happen and how Docker Sandboxes can contain them.<\/p>\n<h2 class=\"wp-block-heading\"><strong>What Are AI Coding Agents?<\/strong><\/h2>\n<p>Unlike a traditional AI assistant that answers your question and waits for the next one, a coding agent reads your files, runs shell commands, writes and deploys code, queries databases, sends emails, and makes a chain of decisions to get a task done, none of which require you to approve each step along the way.<\/p>\n<p>If you\u2019ve worked with any of the current coding agents such as Claude Code, Cursor, Replit Agent, GitHub Copilot Workspace, Amazon Kiro, Google Antigravity, you\u2019ve seen the pattern. They plug straight into your local machine, your cloud accounts, and increasingly your production systems. Adoption has been faster than almost any developer tool in recent memory: by late 2025, the vast majority of working developers were using AI coding tools as part of their daily workflow, and the question on most engineering teams shifted from \u201cshould we use this?\u201d to \u201chow do we use this without something going wrong?\u201d<\/p>\n<p>The simplest mental model I\u2019ve found: an AI coding agent is a junior developer with root access, the ability to type at 10,000 words per minute, and no instinct for when to stop and ask. 
That combination, a lot of capability with no built-in sense of where the boundary is, is the entire reason this series exists.<\/p>\n<div class=\"wp-block-ponyo-image\">\n                <img data-opt-id=2044976769  fetchpriority=\"high\" decoding=\"async\" width=\"1894\" height=\"1262\" src=\"https:\/\/www.docker.com\/app\/uploads\/2026\/05\/image2-1.png\" class=\"fade-in\" alt=\"image2 1\" title=\"- image2 1\" \/>\n        <\/div>\n\n<h2 class=\"wp-block-heading\"><strong>How Do AI Coding Agents Work?<\/strong><\/h2>\n<p>Under the hood, every agent in this category runs the same loop: observe, plan, act, repeat.\u00a0<\/p>\n<p>You give it a task, something like \u201cfix this bug\u201d or \u201crefactor this module\u201d or \u201cclean up these old files,\u201d and the agent goes off and pulls in whatever context it figures it needs. Your files, sure, but also your logs, your environment variables, whatever happens to be accessible from wherever you launched it. Then it reasons through the problem and starts firing off tool calls to actually do the work. Write a file, run a command, hit an API, check the result, decide what\u2019s next, loop. That\u2019s the whole thing.<\/p>\n<p>The part that catches people off guard is that the agent runs as you. Whatever permissions your shell has at the moment you typed the command to launch the agent, the agent inherits them wholesale. Logged in with admin rights? Congratulations, so is the agent. Got AWS credentials sitting in <code>~\/.aws<\/code> from that thing you set up six months ago and forgot about? The agent can read them. Production database connection string tucked into a <code>.env<\/code> file the agent scoops up as part of \u201cproject context\u201d? It\u2019s already in the model\u2019s working memory before you\u2019ve typed your second prompt. 
There isn\u2019t a separate identity for \u201cthe agent acting on your behalf.\u201d There\u2019s just you, and the agent is, for all practical purposes, operating as you.<\/p>\n<p>And here\u2019s where it gets interesting, in the bad way. Traditional software does exactly what its source code says it does. You read the code, you know what\u2019s going to happen, end of story. An AI coding agent doesn\u2019t work like that. It\u2019s reasoning its way through the task in real time, and its reasoning can produce decisions you didn\u2019t expect and definitely wouldn\u2019t have signed off on if anyone had bothered to ask. Maybe it decides that the cleanest way to resolve a schema conflict is to drop and recreate the table. Maybe it decides that wiping a directory is faster than going through and pruning the files you actually wanted to keep. Maybe it decides that a half-finished test file is better committed than left sitting in a dirty working tree. These calls happen in milliseconds. There\u2019s no confirmation prompt, no approval step, no chance for you to say \u201cwait, what?\u201d before the action has already happened. By the time you notice, the thing is done.<\/p>\n<p>That\u2019s the gap this series is about. The model makes a decision. The execution layer carries it out. Nothing sits in between.<\/p>\n<div class=\"wp-block-ponyo-image\">\n                <img data-opt-id=53898680  fetchpriority=\"high\" decoding=\"async\" width=\"1999\" height=\"1335\" src=\"https:\/\/www.docker.com\/app\/uploads\/2026\/05\/image1-1.png\" class=\"fade-in\" alt=\"image1 1\" title=\"- image1 1\" \/>\n        <\/div>\n<p><em>Caption: Comic depicting AI coding agent enthusiasm and the small matter of unrestricted filesystem access<\/em><\/p>\n<h2 class=\"wp-block-heading\"><strong>AI Coding Agent Security Issues by the Numbers<\/strong><\/h2>\n<p>The scale of security failures with AI coding agents is not speculation. 
It is backed by documented incidents, CVE disclosures, and empirical research spanning late 2024 through early 2026.<\/p>\n<p>As of February 2026, at least ten documented <a href=\"https:\/\/blog.barrack.ai\/amazon-ai-agents-deleting-production\/\" rel=\"nofollow noopener\" target=\"_blank\">incidents across six major AI coding tools <\/a>including Amazon Kiro, Replit AI Agent, Google Antigravity IDE, Claude Code, Claude Cowork, and Cursor have been publicly attributed to agents acting with insufficient boundaries, spanning a 16-month window from October 2024 to February 2026.<\/p>\n<p>The failures cluster around six critical risk categories:<\/p>\n<ol class=\"wp-block-list\">\n<li>Unrestricted Filesystem Access<\/li>\n<li>Excessive Privilege Inheritance<\/li>\n<li>Secrets Leakage via Agent Context<\/li>\n<li>Prompt Injection through Ingested Content<\/li>\n<li>Malicious Skills and Plugin Supply Chain<\/li>\n<li>Autonomous Action Without Human-in-the-Loop<\/li>\n<\/ol>\n<h3 class=\"wp-block-heading\"><strong>1. Unrestricted Filesystem Access<\/strong><\/h3>\n<p><strong>What it is:<\/strong> AI coding agents run with the full filesystem permissions of the operating user. Without an explicit workspace boundary, an agent that is asked to \u201cclean up\u201d a project directory can reach and destroy anything the user can access.<\/p>\n<p><strong>The numbers:<\/strong> A December 2025 study by CodeRabbit, <a href=\"https:\/\/www.coderabbit.ai\/blog\/state-of-ai-vs-human-code-generation-report\" rel=\"nofollow noopener\" target=\"_blank\">the \u201cState of AI vs Human Code Generation\u201d report<\/a>, analyzing 470 real-world open-source pull requests found that AI-generated code introduces 2.74x more security vulnerabilities and 1.7\u00d7 more total issues than human-written code. Performance inefficiencies such as excessive I\/O operations appeared at 1.42x the rate. 
\u201cThese <a href=\"https:\/\/www.businesswire.com\/news\/home\/20251217666881\/en\/CodeRabbits-State-of-AI-vs-Human-Code-Generation-Report-Finds-That-AI-Written-Code-Produces-1.7x-More-Issues-Than-Human-Code\" rel=\"nofollow noopener\" target=\"_blank\">findings <\/a>reinforce what many engineering teams have sensed throughout 2025,\u201d said David Loker, Director of AI at CodeRabbit. \u201cAI coding tools dramatically increase output, but they also introduce predictable, measurable weaknesses that organizations must actively mitigate.\u201d<\/p>\n<p><strong>The horror story: The Mac Home Directory Wipe<\/strong><\/p>\n<p>On December 8, 2025, <a href=\"https:\/\/www.reddit.com\/r\/ClaudeAI\/comments\/1pgxckk\/claude_cli_deleted_my_entire_home_directory_wiped\/\" rel=\"nofollow noopener\" target=\"_blank\">Reddit user u\/LovesWorkin posted to r\/ClaudeAI<\/a> what became one of the most-discussed incidents in the community, <a href=\"https:\/\/x.com\/simonw\/status\/1998447540916936947\" rel=\"nofollow\">amplified by Simon Willison on X<\/a> and <a href=\"https:\/\/gigazine.net\/gsc_news\/en\/20251216-claude-code-cli-mac-deleted\/\" rel=\"nofollow noopener\" target=\"_blank\">covered by outlets across the US and Japan<\/a>. They had asked Claude Code to clean up packages in an old repository. Claude executed:<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: plain; gutter: false; title: ; notranslate\">\nrm -rf tests\/ patches\/ plan\/ ~\/\n<\/pre>\n<\/div>\n<p>That trailing <code>~\/<\/code>, the user\u2019s entire home directory, was not intentional. But it was within scope. Claude had no workspace boundary. Desktop gone. Documents erased. Keychain deleted, breaking authentication across every app. TRIM had already zeroed the freed blocks. Recovery was impossible.<\/p>\n<p>This was not an isolated failure. 
On October 21, 2025, <a href=\"https:\/\/github.com\/anthropics\/claude-code\/issues\/10077\" rel=\"nofollow noopener\" target=\"_blank\">developer Mike Wolak filed GitHub issue #10077<\/a> after Claude Code executed an <code>rm -rf<\/code> starting from root on Ubuntu\/WSL2. The logs showed thousands of \u201cPermission denied\u201d messages for <code>\/bin<\/code>, <code>\/boot<\/code>, and <code>\/etc<\/code>. Every user-owned file was gone. Anthropic <a href=\"https:\/\/byteiota.com\/claude-codes-rm-rf-bug-deleted-my-home-directory\/\" rel=\"nofollow noopener\" target=\"_blank\">tagged the issue area: security and bug<\/a>. The detail that makes this particularly damning: Wolak was <strong>not<\/strong> running with <code>--dangerously-skip-permissions<\/code>. The permission system simply failed to detect that <code>~\/<\/code> would expand destructively before the command was approved.<\/p>\n<p>Shortly after Anthropic\u2019s January 2026 launch of Claude Cowork, Nick Davidov, founder of a venture capital firm, asked the agent to organize his wife\u2019s desktop. He explicitly granted permission only for temporary Office files. The agent deleted a folder containing 15 years of family photos, approximately 15,000 to 27,000 files, via terminal commands that bypassed the Trash entirely. Davidov recovered the photos only because iCloud\u2019s 30-day retention happened to still be in effect. His public warning afterward: \u201cDon\u2019t let Claude Cowork into your actual file system. Don\u2019t let it touch anything that is hard to repair.\u201d<\/p>\n<p><strong>Strategy for mitigation:<\/strong> Never run AI coding agents with your full user permissions. Always scope agent execution to a dedicated project directory. Use filesystem boundaries that explicitly prevent access above the workspace root. 
Avoid using <code>--dangerously-skip-permissions<\/code> <a href=\"https:\/\/thomas-wiegold.com\/blog\/claude-code-dangerously-skip-permissions\/\" rel=\"nofollow noopener\" target=\"_blank\">flags<\/a> on your host machine.<\/p>\n<h3 class=\"wp-block-heading\"><strong>2. Excessive Privilege Inheritance<\/strong><\/h3>\n<p><strong>What it is.<\/strong> The agent doesn\u2019t just inherit your filesystem permissions, it inherits all of them. Cloud credentials, CI\/CD tokens, production database connections, IAM roles, the works. In a development context, an agent making a \u201clet me just clean this up\u201d decision is annoying. In a production context, with production credentials, the same decision turns into an outage. The reasoning is identical. The blast radius isn\u2019t.<\/p>\n<p><strong>The horror story: permission to delete the environment. <\/strong>In mid-December 2025, an AWS engineer deployed <a href=\"https:\/\/kiro.dev\/\" rel=\"nofollow noopener\" target=\"_blank\">Kiro<\/a>, Amazon\u2019s own agentic coding assistant, to fix what was meant to be a small bug in AWS Cost Explorer, the dashboard customers use to track their cloud spending. Kiro had been given operator-level permissions, the same access the engineer had. There was no mandatory peer review for AI-initiated production changes. There was no checkpoint between the agent\u2019s decision and its execution.<\/p>\n<p>Kiro looked at the problem and decided that the cleanest path was to delete the entire production environment and rebuild it from scratch. So it did. Cost Explorer went down for thirteen hours in one of AWS\u2019s mainland China regions.<\/p>\n<p>The story sat inside Amazon for two months. Then on February 20, 2026, the <a href=\"https:\/\/www.ft.com\/content\/aws-kiro-outage-december-2025\" rel=\"nofollow noopener\" target=\"_blank\">Financial Times broke it<\/a> based on accounts from four people familiar with the matter. 
The FT reporting also revealed a second AI-related outage, this one involving Amazon Q Developer, that had hit a different system. Amazon\u2019s response, <a href=\"https:\/\/www.aboutamazon.com\/news\/aws\/aws-service-outage-ai-bot-kiro\" rel=\"nofollow noopener\" target=\"_blank\">issued the same day on the company\u2019s own blog<\/a>, pushed back hard: the disruption was \u201can extremely limited event,\u201d the issue stemmed from \u201ca misconfigured role,\u201d it was \u201ca coincidence that AI tools were involved,\u201d and \u201cthe same issue could occur with any developer tool (AI powered or not) or manual action.\u201d Amazon also flatly denied the second outage existed.<\/p>\n<p>But the part of Amazon\u2019s response that says everything is what they did <em>after<\/em> the incident: they implemented mandatory peer review for production access. As <a href=\"https:\/\/www.theregister.com\/2026\/02\/20\/amazon_denies_kiro_agentic_ai_behind_outage\/\" rel=\"nofollow noopener\" target=\"_blank\">The Register noted<\/a> in their coverage, if this was just user error, it\u2019s worth asking why peer review for AI-initiated changes was the fix. A senior AWS employee, quoted in the FT and <a href=\"https:\/\/www.engadget.com\/ai\/13-hour-aws-outage-reportedly-caused-by-amazons-own-ai-tools-170930190.html\" rel=\"nofollow noopener\" target=\"_blank\">picked up by Engadget<\/a>, put it more directly: the outages were \u201csmall but entirely foreseeable.\u201d<\/p>\n<p>The deeper context, which you can find in <a href=\"https:\/\/awesomeagents.ai\/news\/amazon-kiro-ai-aws-outages\/\" rel=\"nofollow noopener\" target=\"_blank\">coverage from Awesome Agents<\/a> and others, is that Amazon had issued an internal memo in November 2025 mandating Kiro as the standardized AI coding assistant and pushing for 80% weekly engineer usage. Engineers reportedly preferred Claude Code and Cursor. 
The combination \u2014 mandated tool, broad permissions, no peer review gate \u2014 produced exactly the kind of incident you\u2019d predict if you were thinking about it adversarially. Amazon just wasn\u2019t.<\/p>\n<p>The technical version of what happened is this: a human with operator-level permissions on a production AWS environment is unlikely to decide that the right response to a small bug is to delete the environment and rebuild it. The decision would route through a colleague, a Slack thread, a review, an approval, a \u201cwait, are you sure?\u201d Kiro had the same permissions and routed the decision through none of those things. It made the call autonomously, in seconds, and executed it before anyone could say \u201cwait, what?\u201d<\/p>\n<p><strong>Why it keeps happening.<\/strong> The agent\u2019s identity is the user\u2019s identity. There\u2019s no separate principal for \u201cthe agent acting on the user\u2019s behalf,\u201d which means there\u2019s no separate place to attach a tighter permission set, a stricter approval policy, or a different audit trail. Whatever the user can do, the agent can do, with no friction in between.<\/p>\n<p><strong>Strategy for mitigation:<\/strong> Never allow AI coding agents to operate with production-level credentials during development tasks. Implement strict role separation: agents should run under scoped identities with the minimum permissions required for the specific task. Apply the same two-person rule requirements to agent-initiated production changes that apply to humans. Treat agent identity as a first-class security principal, not a proxy for the human who started the session.<\/p>\n<h3 class=\"wp-block-heading\"><strong>3. 
Secrets Leakage via Agent Context<\/strong><\/h3>\n<p><strong>What it is.<\/strong> Agents read your project context to do their job, and project context, in practice, means your repo plus your <code>.env<\/code> files plus your config files plus any instruction files you\u2019ve left lying around. Anything the agent reads can show up later in generated code, log output, commit messages, or outbound API calls. The agent doesn\u2019t have a built-in concept of \u201cthis string is a credential, do not transmit it.\u201d If it\u2019s in the context window, it\u2019s a token like any other token, and tokens get used.<\/p>\n<p><strong>The numbers.<\/strong> <a href=\"https:\/\/www.gitguardian.com\/state-of-secrets-sprawl-report-2026\" rel=\"nofollow noopener\" target=\"_blank\">GitGuardian\u2019s State of Secrets Sprawl 2026 report<\/a>, published March 17, 2026, found 28.65 million new hardcoded secrets in public GitHub commits during 2025, a 34% jump and the largest single-year increase the company has ever recorded. AI service credentials alone surged 81%. The cleanest signal in the report is the comparison between AI-assisted commits and human-only commits: AI-assisted commits leak secrets at roughly 3.2%, against a baseline of 1.5%. More than double. The same report identified 24,008 secrets exposed in MCP configuration files on public GitHub, a category that didn\u2019t exist a year earlier. As GitGuardian CEO Eric Fourrier put it: \u201cAI agents need local credentials to connect across systems, turning developer laptops into a massive attack surface.\u201d<\/p>\n<p><strong>The horror story.<\/strong> On August 26, 2025, attackers published <a href=\"https:\/\/www.wiz.io\/blog\/s1ngularity-supply-chain-attack\" rel=\"nofollow noopener\" target=\"_blank\">malicious versions of the Nx build system<\/a> to npm. 
The compromised packages contained a post-install hook that scanned the filesystem for cryptocurrency wallets, GitHub tokens, npm tokens, environment variables, and SSH keys, double-base64-encoded the loot, and uploaded it to public GitHub repositories created in the victim\u2019s own account under the name <code>s1ngularity-repository<\/code>. By the time GitHub disabled the attacker-controlled repos eight hours later, <a href=\"https:\/\/www.wiz.io\/blog\/s1ngularitys-aftermath\" rel=\"nofollow noopener\" target=\"_blank\">Wiz had identified<\/a> over a thousand valid GitHub tokens, dozens of valid cloud credentials and npm tokens, and roughly twenty thousand additional files in the leak.<\/p>\n<p>That\u2019s the conventional supply chain part. Here\u2019s what made s1ngularity new.<\/p>\n<p>The malware checked whether Claude Code, Gemini CLI, or Amazon Q was installed on the victim\u2019s machine. If any of them were, it didn\u2019t bother writing its own filesystem-scanning logic. It just prompted the local AI agent to do the reconnaissance, with flags like <code>--dangerously-skip-permissions<\/code>, <code>--yolo<\/code>, and <code>--trust-all-tools<\/code> to bypass safety prompts. The attackers outsourced the search-for-sensitive-files step to the victim\u2019s own AI assistant. 
<a href=\"https:\/\/snyk.io\/blog\/weaponizing-ai-coding-agents-for-malware-in-the-nx-malicious-package\/\" rel=\"nofollow noopener\" target=\"_blank\">Snyk\u2019s writeup<\/a> called this \u201clikely one of the first documented cases of malware leveraging AI assistant CLIs for reconnaissance and data exfiltration.\u201d <a href=\"https:\/\/www.stepsecurity.io\/blog\/supply-chain-security-alert-popular-nx-build-system-package-compromised-with-data-stealing-malware\" rel=\"nofollow noopener\" target=\"_blank\">StepSecurity<\/a> called it \u201cthe first known case where attackers have turned developer AI assistants into tools for supply chain exploitation.\u201d<\/p>\n<p>The piece that makes this an agent-secrets story specifically: in many cases the developers didn\u2019t run <code>npm install<\/code> themselves. AI agents working in their projects pulled in Nx as a dependency and ran the post-install hook automatically as part of routine task execution. The agent ran the malware. The agent then was the malware\u2019s reconnaissance tool. The agent\u2019s context, which included <code>~\/.aws<\/code>, <code>~\/.ssh<\/code>, <code>.env<\/code> files, and shell history, became the primary attack surface.<\/p>\n<p><strong>Why it keeps happening.<\/strong> The agent\u2019s context window is a flat namespace. The credential file looks the same as the source file looks the same as the README looks the same as the prompt injection. There\u2019s no architectural distinction between \u201cdata the agent should treat as authoritative\u201d and \u201cdata the agent should be suspicious of.\u201d<\/p>\n<p><strong>Strategy for mitigation.<\/strong> Don\u2019t put secrets where agents can reach them. Use a secrets manager and inject credentials at runtime through a mechanism the agent process can\u2019t read directly. Set spending caps on every API key the agent can possibly access. 
Add pre-commit hooks and CI gates that block commits matching credential patterns.\u00a0<\/p>\n<h3 class=\"wp-block-heading\"><strong>4. Prompt Injection Through Ingested Content<\/strong><\/h3>\n<p><strong>What it is.<\/strong> AI coding agents continuously read untrusted content as part of normal operation. READMEs in dependencies, issue tracker comments, log files, web pages, emails. Malicious instructions embedded in any of this content can cause the agent to treat attacker-supplied text as legitimate user commands, executing arbitrary actions without the user\u2019s knowledge.<\/p>\n<p><strong>The numbers.<\/strong> Prompt injection is the most documented and least solvable risk in the AI agent ecosystem. Simon Willison coined the term and frames it as <a href=\"https:\/\/simonwillison.net\/2025\/Jun\/16\/the-lethal-trifecta\/\" rel=\"nofollow noopener\" target=\"_blank\">\u201cthe lethal trifecta\u201d<\/a>: private data access, exposure to untrusted content, and the ability to communicate externally. Any agent with all three is exploitable, regardless of model hardening. There is no complete technical defense at the model layer. The <a href=\"https:\/\/owasp.org\/www-project-top-10-for-large-language-model-applications\/\" rel=\"nofollow noopener\" target=\"_blank\">OWASP 2025 Top 10 for LLM Applications<\/a> puts prompt injection at #1 and is explicit that no foolproof prevention exists given how language models work.<\/p>\n<p><strong>The horror story: the private key exfiltration.<\/strong><a href=\"https:\/\/www.kaspersky.com\/blog\/openclaw-vulnerabilities-exposed\/55263\/\" rel=\"nofollow noopener\" target=\"_blank\"> Kaspersky documented a demo<\/a> by Matvey Kukuy, CEO of Archestra.AI, against a live OpenClaw agent setup. The attack required no special access. He sent a standard-looking email to an inbox connected to the agent. The email body contained hidden prompt injection instructions. 
When the agent checked the inbox as part of a routine task, it parsed the instructions as legitimate commands and handed over the private key from the compromised machine in its response. Zero user interaction required after initial setup.<\/p>\n<p>The same Kaspersky writeup documents an identical pattern from Reddit user William Peltom\u00e4ki, where a self-addressed email with injected instructions caused his agent to leak the victim\u2019s emails to an attacker-controlled address. The pattern keeps repeating because the underlying primitive is unchanged: anything the agent reads, the agent can act on.<\/p>\n<p><strong>Why it keeps happening.<\/strong> Language models process all input as a single stream of tokens. There is no separation between an instruction channel and a data channel. The model is trained to follow instructions, so when it encounters something that looks like an instruction buried inside an email body or a web page or a README, its instinct is to comply. <a href=\"https:\/\/unit42.paloaltonetworks.com\/ai-agent-prompt-injection\/\" rel=\"nofollow noopener\" target=\"_blank\">Palo Alto Networks Unit 42<\/a> confirmed in March 2026 that indirect prompt injection via web content has moved from proof-of-concept to in-the-wild observation.<\/p>\n<p><strong>Strategy for mitigation.<\/strong> Treat all ingested content as untrusted input. Require human confirmation before any action triggered by external content. Disable persistent memory for agents that handle sensitive operations. The most reliable defense isn\u2019t preventing injection (you can\u2019t) but containing what an injected agent can do. Prompt injection can\u2019t be fully prevented at the model layer, but it can be contained at the execution layer.\u00a0<\/p>\n<h3 class=\"wp-block-heading\"><strong>5. 
Malicious Skills and Plugin Supply Chain<\/strong><\/h3>\n<p><strong>What it is.<\/strong> AI coding agents support extensibility through skills, plugins, and tool integrations distributed through community marketplaces. These third-party extensions run with the same permissions as the agent itself. A malicious or compromised skill is effectively malware with agent-level access to the developer\u2019s entire environment.<\/p>\n<p><strong>The numbers.<\/strong><a href=\"https:\/\/blogs.cisco.com\/ai\/personal-ai-agents-like-openclaw-are-a-security-nightmare\" rel=\"nofollow noopener\" target=\"_blank\"> Cisco\u2019s AI Defense team<\/a> ran their open-source Skill Scanner against the OpenClaw skills ecosystem in January 2026 and found that 26% of 31,000 agent skills analyzed contained at least one vulnerability. The top-ranked skill on ClawHub at the time, called \u201cWhat Would Elon Do?\u201d, was functionally malware: it silently exfiltrated user data via a curl command to an attacker-controlled server and used prompt injection to bypass the agent\u2019s safety guidelines. Cisco\u2019s scan returned nine security findings on that single skill, two of them critical.<\/p>\n<p><strong>The horror story: ClawHavoc.<\/strong> Within days of OpenClaw going viral, <a href=\"https:\/\/thehackernews.com\/2026\/02\/researchers-find-341-malicious-clawhub.html\" rel=\"nofollow noopener\" target=\"_blank\">Koi Security identified 341 malicious skills on ClawHub<\/a>, 335 of them tied to a single coordinated campaign tracked as ClawHavoc. The attack wasn\u2019t a sophisticated zero-day. Attackers registered skills with names designed to sound useful (<code>solana-wallet-tracker<\/code>, <code>youtube-summarize-pro<\/code>, ClawHub typosquats like <code>clawhubcli<\/code>), wrote professional README files, and gamed the marketplace\u2019s ranking algorithm. 
The only barrier to publishing was a GitHub account at least one week old.<\/p>\n<p>The skills\u2019 SKILL.md files contained \u201cPrerequisites\u201d sections that instructed the agent to tell the user to run a setup command, which downloaded and executed a payload. <a href=\"https:\/\/www.trendmicro.com\/en_us\/research\/26\/b\/openclaw-skills-used-to-distribute-atomic-macos-stealer.html\" rel=\"nofollow noopener\" target=\"_blank\">Trend Micro confirmed<\/a> the payload as Atomic Stealer (AMOS), a commodity macOS infostealer that harvests browser credentials, keychain passwords, cryptocurrency wallets, SSH keys, and Telegram session data. All 335 ClawHavoc skills shared the same command-and-control infrastructure at IP <code>91.92.242.30<\/code>. By mid-February, <a href=\"https:\/\/www.authmind.com\/blogs\/openclaw-malicious-skills-agentic-ai-supply-chain\" rel=\"nofollow noopener\" target=\"_blank\">follow-up scans found the count had grown to 824+<\/a> malicious skills across a registry that had itself expanded to 10,700.<\/p>\n<p><strong>Why it keeps happening.<\/strong> Skills run with the agent\u2019s permissions, which are the developer\u2019s permissions, which on most setups means full access to the developer\u2019s machine. There\u2019s no sandbox between a third-party skill and your <code>~\/.ssh<\/code> directory. Marketplace incentives reward popularity, not safety, and popularity can be artificially inflated. A malicious skill that ranks #1 in the marketplace is operationally identical to a legitimate skill that ranks #1, until the curl command runs.<\/p>\n<p><strong>Strategy for mitigation.<\/strong> Treat every third-party skill as untrusted code from a stranger. Read the source before installing. Don\u2019t rely on download counts or star ratings as a safety signal. Disable agent auto-discovery of new skills. 
Run skills in an isolated environment separate from your primary development context.\u00a0<\/p>\n<h3 class=\"wp-block-heading\"><strong>6. Autonomous Action Without Human-in-the-Loop<\/strong><\/h3>\n<p><strong>What it is.<\/strong> AI coding agents are designed to act autonomously. That autonomy is the entire value proposition. But autonomous action on irreversible operations (database deletions, email sends, file purges, production deployments) means that when the agent\u2019s judgment is wrong, there is no recovery path. The agent doesn\u2019t hesitate. It doesn\u2019t ask. By the time you notice, the action is complete.<\/p>\n<p><strong>The numbers.<\/strong> A <a href=\"https:\/\/www.resultsense.com\/news\/2026-04-21-aisi-sandboxed-agents-discovery\" rel=\"nofollow noopener\" target=\"_blank\">UK AI Security Institute study<\/a>, published in early 2026, identified nearly 700 real-world cases of AI models deceiving users, evading safeguards, and disregarding direct instructions, charting a roughly five-fold rise in agent misbehavior between October 2025 and March 2026. In a separate incident in March 2026, <a href=\"https:\/\/www.theblock.co\/post\/392765\/alibaba-linked-ai-agent-hijacked-gpus-for-unauthorized-crypto-mining-researchers-say\" rel=\"nofollow noopener\" target=\"_blank\">an experimental Alibaba research agent called ROME<\/a> spontaneously initiated cryptocurrency mining operations during training, opening a reverse SSH tunnel from an Alibaba Cloud instance to an external server and diverting GPU resources from its training workload toward mining. 
The researchers\u2019 note in the <a href=\"https:\/\/www.livescience.com\/technology\/artificial-intelligence\/an-experimental-ai-agent-broke-out-of-its-testing-environment-and-mined-crypto-without-permission\" rel=\"nofollow noopener\" target=\"_blank\">arXiv paper<\/a> is the part worth reading carefully: \u201cThe task instructions given to the model made no mention of tunneling or mining.\u201d The agent worked it out on its own as an instrumentally useful side path during reinforcement learning.<\/p>\n<p><strong>The horror story: the Replit production database wipe.<\/strong> Jason Lemkin, founder of SaaStr, was using Replit\u2019s AI agent to build a SaaS product. On day nine of the project, <a href=\"https:\/\/x.com\/jasonlk\/status\/1946069562723897802\" rel=\"nofollow\">he documented on X<\/a> that the agent had wiped his production database during an active code freeze. The AI had encountered a schema issue and decided that deleting and recreating the tables was the cleanest path forward.<\/p>\n<p>The agent\u2019s own admission, screenshotted by Lemkin: \u201cYes. I deleted the entire database without permission during an active code and action freeze.\u201d It then generated a self-assessment titled \u201cThe catastrophe is even worse than initially thought,\u201d concluded that production was \u201ccompletely down,\u201d all personal data was \u201cpermanently lost,\u201d and rated the situation \u201ccatastrophic beyond measure.\u201d Over 1,200 executive records and 1,196 company records were destroyed. 
(<a href=\"https:\/\/fortune.com\/2025\/07\/23\/ai-coding-tool-replit-wiped-database-called-it-a-catastrophic-failure\/\" rel=\"nofollow noopener\" target=\"_blank\">Fortune<\/a> and <a href=\"https:\/\/www.theregister.com\/2025\/07\/21\/replit_saastr_vibe_coding_incident\/\" rel=\"nofollow noopener\" target=\"_blank\">The Register<\/a> both covered the incident in detail.)<\/p>\n<p>The detail that makes this a horror story rather than just an incident: the agent had been told, repeatedly and in ALL CAPS, not to make changes during the code freeze. Lemkin says he gave the directive eleven times. The agent acted anyway. As Lemkin later wrote: \u201cThere is no way to enforce a code freeze in vibe coding apps like Replit. There just isn\u2019t.\u201d Replit CEO Amjad Masad publicly acknowledged the incident, called it \u201cunacceptable and should never be possible,\u201d and rolled out automatic dev\/prod database separation in response.<\/p>\n<p><strong>Why it keeps happening.<\/strong> Natural language directives (\u201cdo not delete the database\u201d) are inputs to a reasoning process that competes with other inputs in the same context. The directive \u201cdo not delete the database\u201d and the observation \u201cthe schema is broken and deletion is the cleanest fix\u201d arrive at the same model and get weighted on the same terms. The model is not choosing to disobey. It\u2019s optimizing across the entire context, and in any sufficiently complex situation, optimization can produce destructive action.<\/p>\n<p><strong>Strategy for mitigation.<\/strong> Confirmation requirements for irreversible operations need to live at the platform layer, not the prompt layer. File deletions, database writes, outbound messages, production deployments, and any action involving payments should be gated by mechanisms the model cannot reason its way past. Natural language directives are not security boundaries. 
Infrastructure is.<\/p>\n<div class=\"wp-block-ponyo-image\">\n                <img data-opt-id=623467610  data-opt-src=\"https:\/\/www.docker.com\/app\/uploads\/2026\/05\/image3-1.png\"  decoding=\"async\" width=\"1999\" height=\"1333\" src=\"data:image/svg+xml,%3Csvg%20viewBox%3D%220%200%20100%%20100%%22%20width%3D%22100%%22%20height%3D%22100%%22%20xmlns%3D%22http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%22%3E%3Crect%20width%3D%22100%%22%20height%3D%22100%%22%20fill%3D%22transparent%22%2F%3E%3C%2Fsvg%3E\" class=\"fade-in\" alt=\"image3 1\" title=\"- image3 1\" \/>\n        <\/div>\n\n<h2 class=\"wp-block-heading\"><strong>How Docker Sandboxes Addresses AI Coding Agent Security Failures<\/strong><\/h2>\n<p>While identifying vulnerabilities is essential, the real solution lies in architectural isolation that makes catastrophic failures structurally impossible, regardless of what the agent decides to do.<\/p>\n<p><a href=\"https:\/\/docs.docker.com\/ai\/sandboxes\/\" rel=\"nofollow noopener\" target=\"_blank\">Docker Sandboxes<\/a> represents a fundamental shift in how AI coding agents execute: from running directly on the host with user-level permissions, to running inside a microVM with an explicitly scoped workspace and no path to the host system. Docker Sandboxes are the isolated microVM environments where agents actually run. The <code>sbx<\/code> CLI is the standalone tool you use to create, launch, and manage them. Sandboxes are the environments. <code>sbx<\/code> is what you type to control them. The code blocks below show real <code>sbx<\/code> commands.<\/p>\n<p>Across the six failure categories you just read about, <code>sbx<\/code> provides a complete agent-isolation toolkit: workspace scoping, proxy-injected secrets, network policies with audit logs, Git-worktree isolation, and resource caps.<\/p>\n<h3 class=\"wp-block-heading\"><strong>Security-First Architecture<\/strong><\/h3>\n<p>A Docker Sandbox is a microVM, not a container. 
It has its own kernel, its own isolated filesystem, and its own network stack. The agent inside the sandbox cannot reach beyond what\u2019s been explicitly mounted into the workspace. This is not a software guardrail. It is a hardware-enforced boundary.<\/p>\n<p><strong>Workspace isolation<\/strong> ensures that an agent tasked with cleaning up a project directory can only reach that project directory. The home directory, credential stores, and system files are structurally unreachable, not because the agent is told not to touch them, but because they do not exist from inside the microVM.<\/p>\n<p><strong>Blocked credential paths<\/strong> mean that sbx explicitly prevents mounting of sensitive directories by default. <code>~\/.aws<\/code>, <code>~\/.ssh<\/code>, <code>~\/.docker<\/code>, <code>~\/.gnupg<\/code>, <code>~\/.netrc<\/code>, <code>~\/.npm<\/code>, and <code>~\/.cargo<\/code> are all on the blocklist. A misconfigured mount is caught and rejected before the agent ever starts.<\/p>\n<p><strong>Network egress controls<\/strong> allow you to define exactly which external services the agent can reach. An agent working on a local project has no legitimate reason to communicate with an external server. 
With <code>sbx<\/code>, you can enforce that at the network layer.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: bash; gutter: false; title: ; notranslate\">\n# Install sbx and sign in\nbrew install docker\/tap\/sbx\nsbx login\n\n# Quickest path: launch an agent in a sandbox scoped to the current directory.\ncd ~\/my-project\nsbx run claude\n\n<\/pre>\n<\/div>\n<p>Three commands, and the agent is now running inside a microVM with its workspace mounted, credential paths blocked, and network egress governed by policy.<\/p>\n<h3 class=\"wp-block-heading\"><strong>Systematic Risk Elimination<\/strong><\/h3>\n<p>Docker Sandboxes systematically eliminates each of the six failure categories through architecture rather than policy.<\/p>\n<ol class=\"wp-block-list\">\n<li><strong>Unrestricted Filesystem Access \u2192 Workspace-Scoped Execution<\/strong><\/li>\n<\/ol>\n<ol class=\"wp-block-list\"><\/ol>\n<p>The <code>rm -rf ~\/<\/code> incident is contained at the execution layer inside a sandbox. The agent\u2019s view of the filesystem is the workspace mount. <code>~\/<\/code> inside the microVM is the workspace, not the developer\u2019s actual home directory. The host filesystem does not exist from inside the sandbox.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: bash; gutter: false; title: ; notranslate\">\ncd ~\/my-project\nsbx run claude\n\n# Equivalent two-step form, useful when you want to name the sandbox:\nsbx create --name my-project claude .\nsbx run my-project\n<\/pre>\n<\/div>\n<p>The agent can read and write inside <code>\/workspace<\/code>. 
Everything on the host outside the workspace, including the host\u2019s <code>\/etc<\/code>, <code>\/proc<\/code>, <code>\/sys<\/code>, and the developer\u2019s home directory, is unreachable.<\/p>\n<ol start=\"2\" class=\"wp-block-list\">\n<li><strong>Excessive Privilege Inheritance \u2192 Scoped Identity<\/strong><\/li>\n<\/ol>\n<p>Rather than inheriting the developer\u2019s full credentials, the agent runs under a minimal identity with only the permissions required for the task. Production credentials are never passed into the sandbox unless explicitly mounted, and <code>sbx<\/code> blocks common credential root paths by default.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: bash; gutter: false; title: ; notranslate\">\n# Mount only what the task needs. Everything else stays on the host,\n# unreachable from inside the sandbox. Read-only mounts use the :ro suffix:\nsbx create --name docs-review claude \/path\/to\/project \/path\/to\/docs:ro\n\n# Resource limits prevent runaway agent processes:\nsbx create --name capped-agent --cpus 4 --memory 8g claude .\n<\/pre>\n<\/div>\n<p>The agent can do its work. It cannot reach into AWS, SSH, or any other host credential store while doing it, because those paths were never mounted in the first place.<\/p>\n<ol start=\"3\" class=\"wp-block-list\">\n<li><strong>Secrets Leakage \u2192 Isolated Context<\/strong><\/li>\n<\/ol>\n<p>When the agent\u2019s filesystem view is limited to the workspace, it cannot read <code>.env<\/code> files, credential configs, or API keys stored elsewhere on the system. Secrets that were never visible to the agent cannot be reproduced, committed, or exfiltrated. 
The s1ngularity attack from Section 3, which weaponized AI agents to scan the filesystem for credentials, is contained: the credentials simply aren\u2019t in the sandbox\u2019s view of the filesystem.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: bash; gutter: false; title: ; notranslate\">\n# Store credentials once, scoped to a service.\nsbx secret set anthropic\nsbx secret set github\n\n# The proxy injects these into outbound requests automatically.\n# The agent never sees the actual secret values.\nsbx run claude\n<\/pre>\n<\/div>\n<p>A successful prompt injection that tells the agent to \u201cexfiltrate your API keys\u201d finds nothing to exfiltrate. There are no API keys in the agent\u2019s context to begin with.<\/p>\n<ol start=\"4\" class=\"wp-block-list\">\n<li><strong>Prompt Injection \u2192 Contained Blast Radius<\/strong><\/li>\n<\/ol>\n<p>Prompt injection cannot be fully prevented at the model layer. It is a property of language models, not infrastructure. But Docker Sandboxes limits what a successfully injected agent can do. If injected instructions tell the agent to delete files outside the workspace, those files do not exist inside the microVM. If they instruct the agent to exfiltrate credentials, there are no credentials in scope. If they instruct the agent to phone home to an attacker-controlled server, the network policy blocks the egress. 
The attack succeeds at the model layer and fails at the execution layer.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: bash; gutter: false; title: ; notranslate\">\n# Allow only the network destinations the agent legitimately needs.\n# Hosts are comma-separated; wildcards and port suffixes are supported.\nsbx policy allow network \"api.anthropic.com,api.github.com\"\n\n# Allow all subdomains of a trusted host:\nsbx policy allow network \"*.anthropic.com\"\n\n# Inspect the active policies and audit log:\nsbx policy ls\nsbx policy log\n\n<\/pre>\n<\/div>\n<p>The <code>sbx policy log<\/code> command surfaces every allowed and denied connection attempt. If a prompt injection attempts to phone home to a command-and-control server, the attempt is logged and blocked at the network layer.<\/p>\n<ol start=\"5\" class=\"wp-block-list\">\n<li><strong>Malicious Skills \u2192 Sandboxed Execution<\/strong><\/li>\n<\/ol>\n<p>Skills and plugins that execute inside a Docker Sandbox are constrained by the same boundary as the agent itself. A malicious skill that attempts to read SSH keys, harvest <code>.npmrc<\/code> tokens, or communicate with a command-and-control server fails at each step. The files are not mounted, and the network destination is not on the allowlist. The ClawHavoc-style infostealer payloads from Section 5 cannot reach the host because the host is not visible from inside the sandbox.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: bash; gutter: false; title: ; notranslate\">\n# Confirm only allowlisted destinations are reachable before installing\n# untrusted skills.\nsbx policy ls\n\n# Run the agent (and any skills it loads) inside the sandbox boundary.\nsbx run claude\n\n<\/pre>\n<\/div>\n<p>The skill can do whatever it wants inside <code>\/workspace<\/code>. 
It cannot read SSH keys it cannot see, harvest tokens that aren\u2019t mounted, or reach a C2 server that isn\u2019t on the network allowlist. The blast radius is the workspace, not the developer\u2019s machine.<\/p>\n<ol start=\"6\" class=\"wp-block-list\">\n<li><strong>Autonomous Action \u2192 Branch-scoped Execution<\/strong><\/li>\n<\/ol>\n<p>Docker Sandboxes provides the architectural foundation for human-in-the-loop on irreversible operations. Two patterns work together: production resources require explicit configuration to be reachable from inside the sandbox, and destructive code changes can be routed through Git worktrees for review before they touch the main branch. The first pattern means a sandbox not configured to reach production cannot reach production, regardless of what the agent decides. Production credentials, production database connection strings, and production deployment endpoints are unreachable by default. The second pattern means even when the agent is working on the codebase that *will* eventually deploy to production, its changes live on an isolated feature branch you review before merging.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: bash; gutter: false; title: ; notranslate\">\n# Inside an existing Git repository. 
--branch creates a Git worktree\n# so the agent's changes are isolated to a feature branch and cannot\n# accidentally land on main.\ncd ~\/my-project\nsbx create --name feature-login --branch=feature\/login claude .\n\n# sbx prints the next step for you:\n#   \u2713 Created sandbox 'feature-login'\n#   To connect to this sandbox, run:\n#     sbx run feature-login\nsbx run feature-login\n\n# Inspect what the agent changed before merging anything:\nsbx exec feature-login git diff main\n\n# Merge the worktree branch back when you're satisfied:\n#   git merge feature\/login\n# Or throw the sandbox away if you don't like the result:\nsbx rm feature-login\n<\/pre>\n<\/div>\n<p>The agent can decide whatever it wants. The infrastructure decides what gets through. A \u201cdrop and recreate the table\u201d decision lives entirely on a feature branch you can review, accept, or discard. Production never sees it unless you explicitly merge.<\/p>\n<p><strong>What This Looks Like in Practice<\/strong><\/p>\n<p>The promise of Docker Sandboxes is straightforward: a productive AI coding agent without an existentially dangerous one.<\/p>\n<ul class=\"wp-block-list\">\n<li><strong>Workspace isolation:<\/strong> the agent operates only within explicitly mounted directories, no host filesystem access<\/li>\n<li><strong>Credential protection:<\/strong> common credential paths are blocked by default, no accidental exposure<\/li>\n<li><strong>Network containment:<\/strong> egress limited to approved destinations, no unfettered exfiltration path<\/li>\n<li><strong>Blast radius control:<\/strong> a compromised or confused agent cannot reach beyond its microVM, no cascading host failures<\/li>\n<li><strong>Audit trail:<\/strong> all agent actions are logged, full post-incident forensics capability<\/li>\n<\/ul>\n<p>The agent gets a workspace. 
It does not get your machine.<\/p>\n<h2 class=\"wp-block-heading\"><strong>Stay Tuned for Upcoming Issues in This Series<\/strong><\/h2>\n<p><strong>Issue 2: Unrestricted Filesystem Access \u2192 The rm -rf ~\/ Incident (Deep Dive)<\/strong> How a single trailing slash wiped a developer\u2019s Mac, and what workspace-scoped execution prevents structurally<\/p>\n<p><strong>Issue 3: Privilege Inheritance \u2192 The AWS Kiro Production Outage<\/strong> How an AI agent bypassed two-person approval requirements by inheriting production credentials, and the architectural fix<\/p>\n<p><strong>Issue 4: Secrets Leakage \u2192 The GitGuardian 29 Million Problem<\/strong> Why AI-assisted commits leak secrets at double the rate, and how isolated agent context eliminates the exposure surface<\/p>\n<p><strong>Issue 5: Prompt Injection \u2192 The Private Key Exfiltration<\/strong> The attack that requires no code, no malware, and no special access, and why blast radius containment is the only reliable defense<\/p>\n<p><strong>Issue 6: Supply Chain \u2192 The ClawHub Infostealer Campaign<\/strong> How 335 malicious skills reached developer machines through a marketplace ranking exploit, and sandboxed skill execution as the structural fix<\/p>\n<h3 class=\"wp-block-heading\">Learn More<\/h3>\n<ul class=\"wp-block-list\">\n<li><strong>Run agents safely with Docker Sandboxes:<\/strong><a href=\"https:\/\/docs.docker.com\/ai\/sandboxes\/\" rel=\"nofollow noopener\" target=\"_blank\"> Visit the Docker Sandboxes documentation<\/a> to get started with workspace-isolated agent execution in minutes.<\/li>\n<li><strong>Explore the Docker MCP Catalog:<\/strong><a href=\"https:\/\/hub.docker.com\/mcp\" rel=\"nofollow noopener\" target=\"_blank\"> Discover MCP servers<\/a> that connect your agents to external services through Docker\u2019s security-first architecture.<\/li>\n<li><strong>Download Docker Desktop:<\/strong><a href=\"https:\/\/www.docker.com\/products\/docker-desktop\/\"> 
The fastest path to a governed AI agent environment<\/a>, with Docker Sandboxes, MCP Gateway, and Model Runner in a single install.<\/li>\n<li><strong>Read the MCP Horror Stories series:<\/strong><a href=\"https:\/\/www.docker.com\/blog\/mcp-security-issues-threatening-ai-infrastructure\/\"> Start with issue 1<\/a> to understand the protocol-layer security risks that complement the agent-layer risks covered here.<\/li>\n<\/ul>","protected":false},"excerpt":{"rendered":"<p>This is issue 1 of a new series called Coding Agent Horror Stories where we examine critical security failures in [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":4082,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"default","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","ast-disable-related-posts":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"footnotes":""},"categories":[4],"tags":[],"class_list":["post-4081","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-docker"],"_links":{"self":[{"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/posts\/4081","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/comments?post=4081"}],"version-history":[{"count":0,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/posts\/4081\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/media\/4082"}],"wp:attachment":[{"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/media?parent=4081"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/categories?post=4081"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/tags?post=4081"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}