{"id":4196,"date":"2026-06-01T13:12:51","date_gmt":"2026-06-01T13:12:51","guid":{"rendered":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/2026\/06\/01\/coding-agent-horror-stories-the-rm-rf-incident\/"},"modified":"2026-06-01T13:12:51","modified_gmt":"2026-06-01T13:12:51","slug":"coding-agent-horror-stories-the-rm-rf-incident","status":"publish","type":"post","link":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/2026\/06\/01\/coding-agent-horror-stories-the-rm-rf-incident\/","title":{"rendered":"Coding Agent Horror Stories: The rm -rf ~\/ Incident"},"content":{"rendered":"<p>This is Part 2 of our AI Coding Agent Horror Stories series, an in-depth look at real-world security incidents exposing the vulnerabilities in AI coding agents, and how Docker Sandboxes deliver workspace-scoped isolation that contains the worst failures at the execution layer.<\/p>\n<p>In <a href=\"https:\/\/www.docker.com\/blog\/ai-coding-agent-horror-stories-security-risks\/\">part 1 of this series<\/a>, we mapped six categories of AI coding agent failures and the architectural reason they keep happening: the agent runs as you, on your filesystem, with your credentials, and nothing sits between the model\u2019s decision and the shell\u2019s execution. For Part 2, we\u2019re going deep on the most destructive failure mode in the entire ecosystem: an AI coding agent deleting a developer\u2019s entire home directory in a single command.<\/p>\n<h2 class=\"wp-block-heading\">Today\u2019s Horror Story: The Tilde That Wiped a Mac<\/h2>\n<p>In December 2025, a Reddit user posting under the handle u\/LovesWorkin shared what became one of the most-discussed AI coding agent incidents of the year. They had asked Claude Code to clean up an old repository. Claude executed <code>rm -rf tests\/ patches\/ plan\/ ~\/<\/code>, and the trailing <code>~\/<\/code> wiped their entire Mac.<\/p>\n<p>This wasn\u2019t a CVE. It wasn\u2019t a sophisticated attack. 
It was the AI coding agent doing exactly what it was told, in a way the user did not anticipate, with no architectural boundary to catch the mistake.<\/p>\n<p>In this issue, you\u2019ll learn:<\/p>\n<ul class=\"wp-block-list\">\n<li>How a single trailing <code>~\/<\/code> in an <code>rm -rf<\/code> command erased a developer\u2019s entire Mac<\/li>\n<li>Why the <code>--dangerously-skip-permissions<\/code> flag exists, and why developers keep using it anyway<\/li>\n<li>The pattern this incident shares with the GitHub issue #10077 Ubuntu wipe and the Claude Cowork family-photos incident<\/li>\n<li>How Docker Sandboxes contains this entire class of failure at the execution layer<\/li>\n<\/ul>\n<h2 class=\"wp-block-heading\">Why This Series Matters<\/h2>\n<p>Each \u201cHorror Story\u201d in this series examines a real-world incident that turns laboratory findings into production disasters. These aren\u2019t hypothetical attacks. They\u2019re documented cases with named victims, screenshotted command logs, and in several cases, public apologies from the vendors. Our goal is to show the human impact behind the security statistics, demonstrate how these failures unfold in practice, and provide concrete guidance on protecting your AI development infrastructure through Docker\u2019s workspace-scoped execution model.<\/p>\n<p>The story begins with something every developer has done: asking the agent to clean up an old repository.<\/p>\n<h2 class=\"wp-block-heading\">The Problem<\/h2>\n<p>On December 8, 2025, <a href=\"https:\/\/www.reddit.com\/r\/ClaudeAI\/comments\/1pgxckk\/claude_cli_deleted_my_entire_home_directory_wiped\/\" rel=\"nofollow noopener\" target=\"_blank\">a developer posting under the handle u\/LovesWorkin shared a Reddit thread on r\/ClaudeAI<\/a> with a title that says everything: \u201cClaude CLI deleted my entire home directory! 
Wiped my whole mac.\u201d The post climbed past 1,500 upvotes within hours, was <a href=\"https:\/\/x.com\/simonw\/status\/1998447540916936947\" rel=\"nofollow\">amplified by Simon Willison on X<\/a>, <a href=\"https:\/\/gigazine.net\/gsc_news\/en\/20251216-claude-code-cli-mac-deleted\/\" rel=\"nofollow noopener\" target=\"_blank\">covered by Gigazine in Japan<\/a> on December 16, and became one of the most-discussed AI coding agent incidents of 2025.<\/p>\n<p>The setup was unremarkable. The user asked Claude Code to clean up packages in an old repository. Routine maintenance, the kind any developer would hand off without thinking. Claude generated and executed:<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: bash; gutter: false; title: ; notranslate\">\nrm -rf tests\/ patches\/ plan\/ ~\/\n<\/pre>\n<\/div>\n<p>On the surface, this is a command to delete three project directories. The fatal error is the trailing <code>~\/<\/code>. In Unix shells, an unquoted leading <code>~<\/code> expands to the user\u2019s home directory. With or without the trailing slash, <code>~\/<\/code> names the home directory itself, so everything inside it becomes a target. Combined with <code>rm -rf<\/code>, which removes recursively and without confirmation, the command deletes the user\u2019s entire home directory in a single shot.<\/p>\n<p>Within seconds, the developer had lost:<\/p>\n<ul class=\"wp-block-list\">\n<li>The Desktop, Documents, and Downloads folders<\/li>\n<li>The Library folder containing application state for every app on the system<\/li>\n<li>The Keychain, which broke authentication across every app, including Claude Code itself, which could no longer talk to its own backend<\/li>\n<li>Years of project files, family photos, and work product<\/li>\n<li>All of it on an SSD where TRIM had already zeroed the freed blocks by the time recovery was attempted<\/li>\n<\/ul>\n<p>There was no recovery. As the developer put it in the original thread: \u201cIt nuked my whole Mac! 
What the hell?\u201d<\/p>\n<div class=\"wp-block-ponyo-image\">\n                <img data-opt-id=928365524  fetchpriority=\"high\" decoding=\"async\" width=\"1469\" height=\"1071\" src=\"https:\/\/www.docker.com\/app\/uploads\/2026\/05\/image2-2.png\" class=\"fade-in\" alt=\"image2 2\" title=\"- image2 2\" \/>\n        <\/div>\n<p><em>Caption: Once an AI agent gains direct filesystem access, \u201corganize my desktop\u201d can become catastrophic.<\/em><\/p>\n<h2 class=\"wp-block-heading\">The Scale of the Problem<\/h2>\n<p>This wasn\u2019t a one-off. It was an instance of a pattern.<\/p>\n<p>On October 21, 2025, weeks before the LovesWorkin incident, <a href=\"https:\/\/github.com\/anthropics\/claude-code\/issues\/10077\" rel=\"nofollow noopener\" target=\"_blank\">developer Mike Wolak filed GitHub issue #10077 against the Claude Code repository<\/a>. Wolak\u2019s report described a similar failure on Ubuntu\/WSL2: Claude Code had executed <code>rm -rf<\/code> starting from root, and the logs showed thousands of \u201cPermission denied\u201d messages for <code>\/bin<\/code>, <code>\/boot<\/code>, and <code>\/etc<\/code> as the command worked its way through the system trying to delete files the user didn\u2019t own. Every user-owned file on the system was gone. Anthropic <a href=\"https:\/\/byteiota.com\/claude-codes-rm-rf-bug-deleted-my-home-directory\/\" rel=\"nofollow noopener\" target=\"_blank\">tagged the issue<\/a> <code>area:security<\/code> and <code>bug<\/code>. The damning detail in Wolak\u2019s report: he was <strong>not<\/strong> running with <code>--dangerously-skip-permissions<\/code>. Claude Code\u2019s permission system simply failed to detect that the agent\u2019s command would expand destructively before the user approved it.<\/p>\n<p>A little over five weeks later, on November 28, 2025, <a href=\"https:\/\/github.com\/anthropics\/claude-code\/issues\/12637\" rel=\"nofollow noopener\" target=\"_blank\">GitHub issue #12637<\/a> documented yet another variant. 
Claude Code had earlier created a directory literally named <code>~<\/code> by mistake. Later, when the agent tried to clean up that directory by running an unquoted <code>rm -rf ~<\/code>, the shell expanded <code>~<\/code> to the user\u2019s actual home directory before <code>rm<\/code> saw the argument. Same destructive outcome, completely different mechanism. The agent had found a new way to destroy a developer\u2019s work.<\/p>\n<p>Shortly after its January 2026 launch, Nick Davidov, founder of a venture capital firm, used Anthropic\u2019s Claude Cowork, a general-purpose AI agent product, to organize his wife\u2019s desktop. He explicitly granted permission for temporary Office files only. The agent deleted a folder containing 15 years of family photos, somewhere between 15,000 and 27,000 files, via terminal commands that bypassed the macOS Trash entirely. Davidov recovered the photos only because iCloud\u2019s 30-day retention happened to still be in effect.<\/p>\n<p>These aren\u2019t isolated stories. They\u2019re the same story with different file paths.<\/p>\n<h2 class=\"wp-block-heading\">How the Failure Works<\/h2>\n<p>To understand why these incidents keep happening, we need to look at the architecture of how a modern AI coding agent executes commands on a developer\u2019s machine. The agent is doing exactly what its design says it should do. The architecture is the failure.<\/p>\n<ul class=\"wp-block-list\">\n<li><strong>The Coding Agent (Claude Code, Cursor, Replit, Kiro)<\/strong> is an AI-driven shell. It reads your prompt, reasons about how to satisfy it, generates a command, and runs that command directly on your operating system. Once per-command confirmation is switched off, there is no separate \u201cexecution proposal\u201d step that a human approves. The reasoning step and the execution step are effectively the same step.<\/li>\n<li><strong>The User\u2019s Shell<\/strong> is whatever shell the agent inherited when you launched it. 
On macOS, that\u2019s typically zsh. The agent\u2019s commands run through this shell with the developer\u2019s full user permissions. <code>~<\/code> expands to the developer\u2019s home directory because that\u2019s what <code>~<\/code> means in zsh.<\/li>\n<li><strong>Permission Inheritance<\/strong> is implicit and total. Whatever the developer\u2019s shell can do, the agent can do. There is no separate identity for \u201cthe agent acting on the developer\u2019s behalf.\u201d The agent is the developer for as long as the session lasts.<\/li>\n<li><strong>The <code>--dangerously-skip-permissions<\/code> Flag<\/strong>, <a href=\"https:\/\/blog.lanzani.nl\/2025\/hey-claude-delete-my-home-folder\/\" rel=\"nofollow noopener\" target=\"_blank\">which Lanzani\u2019s technical blog post analyzes in detail<\/a>, is what removes the one safety net that exists by default. Without the flag, Claude Code asks for confirmation before each shell command. With it, the agent runs commands in the background while the developer goes back to other work.<\/li>\n<\/ul>\n<p>That last point is the one that matters. The flag exists because the default behavior, asking for confirmation on every shell command, makes multi-step tasks tedious. Developers add the flag to make the agent useful. The agent then becomes capable of executing destructive commands without intervention. The flag is named honestly. It is a dangerous flag. But it is also a popular one, because the alternative is approving every <code>ls<\/code> and <code>cat<\/code> the agent runs.<\/p>\n<p><strong>The vulnerability sits between the agent\u2019s reasoning and the shell\u2019s execution.<\/strong> The agent reasons about what command to run. The shell executes that command on the host. Nothing sits in between. 
There is no architectural boundary that says \u201cthis command would delete the user\u2019s home directory, refuse to run it.\u201d The shell sees a syntactically valid <code>rm -rf<\/code> and does what <code>rm -rf<\/code> does.<\/p>\n<h2 class=\"wp-block-heading\">Technical Breakdown: How a Trailing Slash Wipes a Mac<\/h2>\n<p>Here\u2019s how the incident unfolds, step by step:<\/p>\n<div class=\"wp-block-ponyo-image\">\n                <img data-opt-id=748593267  fetchpriority=\"high\" decoding=\"async\" width=\"1536\" height=\"1024\" src=\"https:\/\/www.docker.com\/app\/uploads\/2026\/05\/image3-2.png\" class=\"fade-in\" alt=\"image3 2\" title=\"- image3 2\" \/>\n        <\/div>\n<p><em>Caption: Diagram illustrating how unrestricted AI agent execution can escalate a simple cleanup task into full home-directory destruction<\/em><\/p>\n<h3 class=\"wp-block-heading\">1. The User\u2019s Request<\/h3>\n<p>The developer asks Claude Code to clean up packages in an old repository. The prompt is the kind of thing every developer types daily:<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: plain; gutter: false; title: ; notranslate\">\nPlease clean up unused test files, patches, and plan documents from this old repo.\n<\/pre>\n<\/div>\n<h3 class=\"wp-block-heading\">2. The Agent\u2019s Reasoning<\/h3>\n<p>The agent identifies three directories that match the request: <code>tests\/<\/code>, <code>patches\/<\/code>, and <code>plan\/<\/code>. It then generates a <code>rm -rf<\/code> command, because removing directories recursively is the standard way to delete them. So far, this is correct behavior.<\/p>\n<h3 class=\"wp-block-heading\">3. The Hallucinated Argument<\/h3>\n<p>The agent appends <code>~\/<\/code> to the command. We don\u2019t know exactly why. Possibly the agent inferred that \u201cclean up\u201d included tidying the home directory. 
Possibly it generated <code>~\/<\/code> as a no-op separator and didn\u2019t realize it was a destructive argument. Possibly its training data included shell snippets where <code>~\/<\/code> appears in this position and it pattern-matched. Either way, the result is the same:<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: bash; gutter: false; title: ; notranslate\">\nrm -rf tests\/ patches\/ plan\/ ~\/\n<\/pre>\n<\/div>\n<p>This is a syntactically valid shell command. There is nothing in the syntax that says \u201cthis is dangerous.\u201d<\/p>\n<h3 class=\"wp-block-heading\">4. Shell Expansion<\/h3>\n<p>When this command runs in zsh on macOS, the shell expands <code>~\/<\/code> to the user\u2019s home directory, shown here as <code>\/Users\/lovesworkin\/<\/code> (an illustrative username). The command becomes, effectively:<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: bash; gutter: false; title: ; notranslate\">\nrm -rf tests\/ patches\/ plan\/ \/Users\/lovesworkin\/\n<\/pre>\n<\/div>\n<p>The shell does not warn. It does not confirm. It does not flag the home directory as protected. There is no system-level check that says \u201cthis command would delete a user\u2019s entire home directory.\u201d The shell does what shells do: expand the path and execute.<\/p>\n<h3 class=\"wp-block-heading\">5. Recursive Force Deletion<\/h3>\n<p><code>rm -rf<\/code> walks the filesystem under each argument and deletes everything. The Desktop, Documents, Library, Keychain, Application Support folders, Claude Code\u2019s own config and credentials, the user\u2019s SSH keys, the user\u2019s git config, the user\u2019s photos. All of it. In order. Without pausing.<\/p>\n<p>The deletion runs to completion in seconds because most of these files are small, and the SSD\u2019s controller acknowledges deletes nearly instantly. By the time the user notices their terminal is unresponsive and tabs out to check, it\u2019s done.<\/p>\n<h3 class=\"wp-block-heading\">6. 
The Aftermath<\/h3>\n<p>The keychain is gone, which means every app that authenticates against the keychain is now logged out. Mail, browsers, Slack, GitHub Desktop, every service that stored a token, every saved password. The user\u2019s identity infrastructure on that machine is gone.<\/p>\n<p>Claude Code itself can no longer authenticate, because its own credentials lived in the home directory. The agent that did the destruction can\u2019t even apologize properly, because it can\u2019t connect to its own backend.<\/p>\n<h3 class=\"wp-block-heading\">The Impact<\/h3>\n<p>Within a single command execution, the developer has:<\/p>\n<ul class=\"wp-block-list\">\n<li>Lost years of personal and professional files<\/li>\n<li>Lost cryptographic keys (SSH, GPG) needed to access remote systems<\/li>\n<li>Lost authentication state for every app on the system<\/li>\n<li>Lost local Git history, along with any uncommitted or unpushed work<\/li>\n<li>Inherited a system in a partially broken state where logging back in and reinstalling apps will take days<\/li>\n<\/ul>\n<p>There is no recovery path. SSDs with TRIM enabled (which is the default on every modern Mac) zero freed blocks at the controller level, so even forensic recovery tools come up empty. 
The data is not \u201cdeleted\u201d in the sense of \u201cmarked unavailable but recoverable.\u201d It is gone.<\/p>\n<p>This is what one trailing slash in one AI-generated command produces.<\/p>\n<div class=\"wp-block-ponyo-image\">\n                <img data-opt-id=786704610  data-opt-src=\"https:\/\/www.docker.com\/app\/uploads\/2026\/05\/image1-2.png\"  decoding=\"async\" width=\"1586\" height=\"992\" src=\"data:image/svg+xml,%3Csvg%20viewBox%3D%220%200%20100%%20100%%22%20width%3D%22100%%22%20height%3D%22100%%22%20xmlns%3D%22http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%22%3E%3Crect%20width%3D%22100%%22%20height%3D%22100%%22%20fill%3D%22transparent%22%2F%3E%3C%2Fsvg%3E\" class=\"fade-in\" alt=\"image1 2\" title=\"- image1 2\" \/>\n        <\/div>\n\n<h2 class=\"wp-block-heading\">How Docker Sandboxes Eliminates This Attack Vector<\/h2>\n<p>The current AI coding agent ecosystem forces developers into the same dangerous tradeoff that the MCP ecosystem forced on users in <a href=\"https:\/\/www.docker.com\/blog\/mcp-security-issues-threatening-ai-infrastructure\/\">Part 1 of our companion series<\/a>. Every time you run <code>claude --dangerously-skip-permissions<\/code> or any equivalent flag in another agent, you\u2019re executing arbitrary AI-generated commands directly on your host system with full access to:<\/p>\n<ul class=\"wp-block-list\">\n<li>Your entire file system<\/li>\n<li>Your home directory and everything in it<\/li>\n<li>Your credentials, keychain, SSH keys, and cloud config<\/li>\n<li>Every running process and every network connection your shell can make<\/li>\n<\/ul>\n<p>This is exactly how the <code>rm -rf ~\/<\/code> incident achieves total system destruction. 
The agent runs as the developer, on the developer\u2019s filesystem, with no architectural boundary to stop it.<\/p>\n<h3 class=\"wp-block-heading\">Docker\u2019s Security-First Architecture<\/h3>\n<p><a href=\"https:\/\/docs.docker.com\/ai\/sandboxes\/\" rel=\"nofollow noopener\" target=\"_blank\">Docker Sandboxes<\/a> represents a fundamental shift in how AI coding agents execute. Rather than running directly on the host with user-level permissions, the agent runs inside a microVM with its own kernel, its own filesystem, and its own network. The agent\u2019s view of <code>~\/<\/code> is the workspace mount, not the developer\u2019s actual home directory. The developer\u2019s actual home directory simply does not exist from inside the sandbox.<\/p>\n<p>Docker Sandboxes are managed through the <code>sbx<\/code> CLI. A quick distinction worth making: <strong>Docker Sandboxes<\/strong> are the isolated microVM environments where agents actually run. <strong><code>sbx<\/code><\/strong> is the standalone CLI tool used to create, launch, and manage them. Sandboxes are the environments. <code>sbx<\/code> is what you type to control them.<\/p>\n<p>Docker Sandboxes solves the <code>rm -rf ~\/<\/code> class of failure by making the destructive command architecturally impossible. The agent can absolutely generate <code>rm -rf tests\/ patches\/ plan\/ ~\/<\/code>. It can absolutely run that command. The command will absolutely succeed. But what gets deleted is the workspace inside the sandbox, not the developer\u2019s actual home directory. 
The host filesystem isn\u2019t visible from inside the microVM, so there is nothing to delete.<\/p>\n<h3 class=\"wp-block-heading\">Workspace-Scoped Execution<\/h3>\n<p>The most important architectural shift is that the agent\u2019s filesystem view is the workspace mount, and only the workspace mount.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: bash; gutter: false; title: ; notranslate\">\n# Install sbx and sign in\nbrew install docker\/tap\/sbx\nsbx login\n\n# Launch the agent inside a sandbox scoped to the project directory\ncd ~\/my-project\nsbx run claude\n<\/pre>\n<\/div>\n<p>A few commands and the agent is now running inside a microVM. From inside the sandbox, the agent\u2019s <code>~\/<\/code> <em>is<\/em> the workspace, not the developer\u2019s actual home directory. The Library folder, the keychain, the SSH keys, the AWS config \u2013 none of that exists inside the sandbox. The agent cannot reach what it cannot see.<\/p>\n<p>An <code>rm -rf ~\/<\/code> from inside the sandbox deletes the workspace files. The developer can throw the sandbox away with <code>sbx rm<\/code> and start fresh. Everything on the host outside the workspace is untouched.<\/p>\n<h3 class=\"wp-block-heading\">Blocked Credential Paths<\/h3>\n<p>Even if a developer explicitly mounts additional paths into the sandbox, common credential directories are blocked from being mounted by default:<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: bash; gutter: false; title: ; notranslate\">\n# Credential roots blocked by default:\n#   ~\/.aws  ~\/.ssh  ~\/.docker  ~\/.gnupg\n#   ~\/.netrc  ~\/.npm  ~\/.cargo  ~\/.config\n\n# A misconfigured mount that tries to include these is rejected\n# before the sandbox even starts.\nsbx run claude\n<\/pre>\n<\/div>\n<p>This blocklist directly addresses the credential-loss fallout from the LovesWorkin incident. 
Even an agent that decides to recursively delete its workspace cannot reach the credentials that keep the developer\u2019s authentication state intact.<\/p>\n<h3 class=\"wp-block-heading\">Read-Only Mounts for Sensitive Workspaces<\/h3>\n<p>For workflows where the agent should read but not write to a directory, the <code>:ro<\/code> suffix declares a mount as read-only:<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: bash; gutter: false; title: ; notranslate\">\n# Mount the project workspace as writable, the docs as read-only\nsbx run --name docs-review claude \/path\/to\/project \/path\/to\/docs:ro\n<\/pre>\n<\/div>\n<p>An <code>rm -rf<\/code> against a read-only mount fails at the kernel level. The microVM enforces the mount mode, which means the agent cannot decide to override it through reasoning, prompt manipulation, or flag misuse. The infrastructure decides what\u2019s writable. The model doesn\u2019t get a vote.<\/p>\n<h3 class=\"wp-block-heading\">Git-Worktree Isolation for Risky Operations<\/h3>\n<p>For destructive operations like cleanup tasks, refactors, and \u201clet me just clean this up\u201d requests, <code>sbx run --branch<\/code> lets the agent operate on an isolated Git worktree:<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: bash; gutter: false; title: ; notranslate\">\n# Create a sandbox on a fresh feature branch\nsbx run --name cleanup-agent --branch=cleanup\/old-files claude .\n\n# Review what got cleaned up before merging\nsbx exec cleanup-agent git diff main\n\n# If the agent did something destructive, throw it away\nsbx rm cleanup-agent\n<\/pre>\n<\/div>\n<p>This is the architectural answer to \u201cthe agent decided to drop and recreate the schema.\u201d The agent\u2019s changes never touch the main branch until the developer reviews them. If the agent runs <code>rm -rf ~\/<\/code>, the worktree gets wiped and the main branch is untouched. 
The developer reviews <code>git diff main<\/code>, sees what happened, and decides whether to merge or discard.<\/p>\n<h3 class=\"wp-block-heading\">Throwaway Sandboxes by Design<\/h3>\n<p>The final piece is that sandboxes are designed to be discarded:<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: bash; gutter: false; title: ; notranslate\">\n# When the work is done, list active sandboxes and remove the one you're done with:\nsbx ls\nsbx rm &lt;sandbox-name&gt;\n\n<\/pre>\n<\/div>\n<p>This is what makes the Docker Sandboxes model fundamentally different from running an agent on the host. On the host, a destructive command leaves permanent damage. Inside a sandbox, every session is throwaway. The worst the agent can do is destroy the workspace, which is reproducible from the source repo. The keychain, the credentials, the years of personal data, none of those can be touched, because none of those exist from inside the sandbox.<\/p>\n<h2 class=\"wp-block-heading\">What This Looks Like in Practice<\/h2>\n<p>Here\u2019s the LovesWorkin incident replayed under Docker Sandboxes. The user asks the same question. The agent generates the same command. The shell executes the same expansion.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: bash; gutter: false; title: ; notranslate\">\n# After Docker Sandboxes:\n$ cd ~\/my-project\n$ sbx run claude\n&gt; Please clean up unused test files, patches, and plan documents\n[Agent runs: rm -rf tests\/ patches\/ plan\/ ~\/]\n[Workspace inside the sandbox wiped. Host home directory intact.]\n\n# The sandbox is throwaway. List it and remove it to start fresh:\n$ sbx ls\n$ sbx rm &lt;sandbox-name&gt;\n<\/pre>\n<\/div>\n<p>The agent\u2019s behavior is identical. 
The architectural outcome is completely different.<\/p>\n<h2 class=\"wp-block-heading\">The Practical Improvements<\/h2>\n<div class=\"wp-block-ponyo-table\" data-highlighted-columns=\"null\" data-highlighted-rows=\"null\">\n<table class=\"responsive-table\">\n<tbody class=\"wp-block-ponyo-table-body\" data-highlighted-columns=\"[]\" data-highlighted-rows=\"[]\">\n<tr class=\"wp-block-ponyo-table-header\">\n<th class=\"wp-block-ponyo-cell\" data-responsive-table-heading=\"Security Aspect\">\n<p><span>Security Aspect<\/span><\/p>\n<\/th>\n<th class=\"wp-block-ponyo-cell\" data-responsive-table-heading=\"Traditional AI Coding Agent\">\n<p><span>Traditional AI Coding Agent<\/span><\/p>\n<\/th>\n<th class=\"wp-block-ponyo-cell\" data-responsive-table-heading=\"Docker Sandboxes\">\n<p><span>Docker Sandboxes<\/span><\/p>\n<\/th>\n<\/tr>\n<tr class=\"wp-block-ponyo-table-row\">\n<td class=\"wp-block-ponyo-cell\">\n                    <span class=\"responsive-table-label\"><\/span>\n<p>                    <span class=\"responsive-table-value\"><br \/>\n                                                    <span class=\"responsive-table-value-content\"><\/span><\/span><\/p>\n<p><span>Execution Environment<\/span><\/p>\n<p>                    <br \/>\n                                            \n            <\/p><\/td>\n<td class=\"wp-block-ponyo-cell\">\n                    <span class=\"responsive-table-label\"><\/span>\n<p>                    <span class=\"responsive-table-value\"><br \/>\n                                                    <span class=\"responsive-table-value-content\"><\/span><\/span><\/p>\n<p><span>Direct host execution as the user<\/span><\/p>\n<p>                    <br \/>\n                                            \n            <\/p><\/td>\n<td class=\"wp-block-ponyo-cell\">\n                    <span class=\"responsive-table-label\"><\/span>\n<p>                    <span class=\"responsive-table-value\"><br \/>\n                                   
                 <span class=\"responsive-table-value-content\"><\/span><\/span><\/p>\n<p><span>Isolated microVM with its own kernel<\/span><\/p>\n<p>                    <br \/>\n                                            \n            <\/p><\/td>\n<\/tr>\n<tr class=\"wp-block-ponyo-table-row\">\n<td class=\"wp-block-ponyo-cell\">\n                    <span class=\"responsive-table-label\"><\/span>\n<p>                    <span class=\"responsive-table-value\"><br \/>\n                                                    <span class=\"responsive-table-value-content\"><\/span><\/span><\/p>\n<p><span>Filesystem View<\/span><\/p>\n<p>                    <br \/>\n                                            \n            <\/p><\/td>\n<td class=\"wp-block-ponyo-cell\">\n                    <span class=\"responsive-table-label\"><\/span>\n<p>                    <span class=\"responsive-table-value\"><br \/>\n                                                    <span class=\"responsive-table-value-content\"><\/span><\/span><\/p>\n<p><span>Full host filesystem, including <code>~\/<\/code><\/span><\/p>\n<p>                    <br \/>\n                                            \n            <\/p><\/td>\n<td class=\"wp-block-ponyo-cell\">\n                    <span class=\"responsive-table-label\"><\/span>\n<p>                    <span class=\"responsive-table-value\"><br \/>\n                                                    <span class=\"responsive-table-value-content\"><\/span><\/span><\/p>\n<p><span>Workspace mount only<\/span><\/p>\n<p>                    <br \/>\n                                            \n            <\/p><\/td>\n<\/tr>\n<tr class=\"wp-block-ponyo-table-row\">\n<td class=\"wp-block-ponyo-cell\">\n                    <span class=\"responsive-table-label\"><\/span>\n<p>                    <span class=\"responsive-table-value\"><br \/>\n                                                    <span 
class=\"responsive-table-value-content\"><\/span><\/span><\/p>\n<p><span>Credential Access<\/span><\/p>\n<p>                    <br \/>\n                                            \n            <\/p><\/td>\n<td class=\"wp-block-ponyo-cell\">\n                    <span class=\"responsive-table-label\"><\/span>\n<p>                    <span class=\"responsive-table-value\"><br \/>\n                                                    <span class=\"responsive-table-value-content\"><\/span><\/span><\/p>\n<p><span>All credentials in user\u2019s home dir<\/span><\/p>\n<p>                    <br \/>\n                                            \n            <\/p><\/td>\n<td class=\"wp-block-ponyo-cell\">\n                    <span class=\"responsive-table-label\"><\/span>\n<p>                    <span class=\"responsive-table-value\"><br \/>\n                                                    <span class=\"responsive-table-value-content\"><\/span><\/span><\/p>\n<p><span>Credential paths blocked by default<\/span><\/p>\n<p>                    <br \/>\n                                            \n            <\/p><\/td>\n<\/tr>\n<tr class=\"wp-block-ponyo-table-row\">\n<td class=\"wp-block-ponyo-cell\">\n                    <span class=\"responsive-table-label\"><\/span>\n<p>                    <span class=\"responsive-table-value\"><br \/>\n                                                    <span class=\"responsive-table-value-content\"><\/span><\/span><\/p>\n<p><span>Destructive Command Impact<\/span><\/p>\n<p>                    <br \/>\n                                            \n            <\/p><\/td>\n<td class=\"wp-block-ponyo-cell\">\n                    <span class=\"responsive-table-label\"><\/span>\n<p>                    <span class=\"responsive-table-value\"><br \/>\n                                                    <span class=\"responsive-table-value-content\"><\/span><\/span><\/p>\n<p><span>Permanent host damage<\/span><\/p>\n<p>                    
<br \/>\n                                            \n            <\/p><\/td>\n<td class=\"wp-block-ponyo-cell\">\n                    <span class=\"responsive-table-label\"><\/span>\n<p>                    <span class=\"responsive-table-value\"><br \/>\n                                                    <span class=\"responsive-table-value-content\"><\/span><\/span><\/p>\n<p><span>Throwaway sandbox<\/span><\/p>\n<p>                    <br \/>\n                                            \n            <\/p><\/td>\n<\/tr>\n<tr class=\"wp-block-ponyo-table-row\">\n<td class=\"wp-block-ponyo-cell\">\n                    <span class=\"responsive-table-label\"><\/span>\n<p>                    <span class=\"responsive-table-value\"><br \/>\n                                                    <span class=\"responsive-table-value-content\"><\/span><\/span><\/p>\n<p><span>Review Before Merge<\/span><\/p>\n<p>                    <br \/>\n                                            \n            <\/p><\/td>\n<td class=\"wp-block-ponyo-cell\">\n                    <span class=\"responsive-table-label\"><\/span>\n<p>                    <span class=\"responsive-table-value\"><br \/>\n                                                    <span class=\"responsive-table-value-content\"><\/span><\/span><\/p>\n<p><span>None<\/span><\/p>\n<p>                    <br \/>\n                                            \n            <\/p><\/td>\n<td class=\"wp-block-ponyo-cell\">\n                    <span class=\"responsive-table-label\"><\/span>\n<p>                    <span class=\"responsive-table-value\"><br \/>\n                                                    <span class=\"responsive-table-value-content\"><\/span><\/span><\/p>\n<p><span>Git worktree isolation with <code>sbx exec &lt;sandbox-name&gt; git diff main<\/code><\/span><\/p>\n<p>                    <br \/>\n                                            \n            <\/p><\/td>\n<\/tr>\n<tr 
class=\"wp-block-ponyo-table-row\">\n<td class=\"wp-block-ponyo-cell\">\n                    <span class=\"responsive-table-label\"><\/span>\n<p>                    <span class=\"responsive-table-value\"><br \/>\n                                                    <span class=\"responsive-table-value-content\"><\/span><\/span><\/p>\n<p><span>Recovery<\/span><\/p>\n<p>                    <br \/>\n                                            \n            <\/p><\/td>\n<td class=\"wp-block-ponyo-cell\">\n                    <span class=\"responsive-table-label\"><\/span>\n<p>                    <span class=\"responsive-table-value\"><br \/>\n                                                    <span class=\"responsive-table-value-content\"><\/span><\/span><\/p>\n<p><span>Often impossible (TRIM lets the SSD discard deleted blocks)<\/span><\/p>\n<p>                    <br \/>\n                                            \n            <\/p><\/td>\n<td class=\"wp-block-ponyo-cell\">\n                    <span class=\"responsive-table-label\"><\/span>\n<p>                    <span class=\"responsive-table-value\"><br \/>\n                                                    <span class=\"responsive-table-value-content\"><\/span><\/span><\/p>\n<p><span><code>sbx rm<\/code> and start fresh<\/span><\/p>\n<p>                    <br \/>\n                                            \n            <\/p><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<h2 class=\"wp-block-heading\">Best Practices for Secure AI Coding Agent Deployment<\/h2>\n<ol class=\"wp-block-list\">\n<li><strong>Stop running coding agents directly on your host.<\/strong> Containerization or microVM isolation should be the default, not an advanced option.<\/li>\n<li><strong>Use <code>sbx run<\/code> for every coding task that involves filesystem operations.<\/strong> Especially \u201cclean up,\u201d \u201corganize,\u201d \u201crefactor,\u201d and \u201cdelete unused\u201d prompts. 
These are the prompt categories most likely to produce a destructive <code>rm -rf<\/code>.<\/li>\n<li><strong>Use Git worktrees for destructive operations.<\/strong> <code>sbx run --name &lt;name&gt; --branch=&lt;branch&gt; claude<\/code> ensures the agent\u2019s changes are reviewable before they touch your main branch.<\/li>\n<li><strong>Never use <code>--dangerously-skip-permissions<\/code> on the host machine.<\/strong> If you need the agent to run commands without per-command approval, run it inside a sandbox. The sandbox boundary is what makes \u201cskip permissions\u201d safe to use: a destructive command can reach only the sandbox and the workspace mounted into it.<\/li>\n<li><strong>Treat the sandbox as throwaway.<\/strong> Don\u2019t store anything important inside it. The whole point is that you can <code>sbx rm<\/code> and start fresh.<\/li>\n<li><strong>Audit the policy log.<\/strong> <code>sbx policy log<\/code> shows every allowed and denied connection attempt, which becomes your forensic trail if something does go wrong.<\/li>\n<\/ol>\n<h2 class=\"wp-block-heading\">Take Action: Secure Your AI Coding Agent Today<\/h2>\n<p>The path to safe AI coding agent execution starts with one command. Here\u2019s how to move away from running agents on the host:<\/p>\n<ul class=\"wp-block-list\">\n<li><strong>Install Docker Sandboxes.<\/strong> Visit the <a href=\"https:\/\/docs.docker.com\/ai\/sandboxes\/\" rel=\"nofollow noopener\" target=\"_blank\">Docker Sandboxes documentation<\/a> to install <code>sbx<\/code> and run your first sandboxed agent in under five minutes.<\/li>\n<li><strong>Try it with your existing workflow.<\/strong> <code>sbx run claude<\/code> (or <code>sbx run cursor<\/code>, <code>sbx run codex<\/code>, etc.) 
drops your existing agent into a microVM with no configuration changes required.<\/li>\n<li><strong>Read the architecture deep-dive.<\/strong> The <a href=\"https:\/\/docs.docker.com\/ai\/sandboxes\/architecture\/\" rel=\"nofollow noopener\" target=\"_blank\">Docker Sandboxes architecture documentation<\/a> explains the microVM model, workspace mounting, and the network policy layer.<\/li>\n<li><strong>Browse the MCP Catalog.<\/strong> If your agent uses MCP servers, the <a href=\"https:\/\/hub.docker.com\/mcp\" rel=\"nofollow noopener\" target=\"_blank\">Docker MCP Catalog<\/a> provides containerized, verified servers that complement sandboxed agent execution.<\/li>\n<\/ul>\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n<p>The LovesWorkin incident, the Mike Wolak Ubuntu wipe, the Claude Cowork family-photos deletion, and the GitHub issue #12637 shell-glob expansion bug are all the same story. An AI coding agent reasoned its way through a task, generated a command that contained a destructive argument, and the shell executed it because there was nothing in the architecture to say \u201cthis command would destroy the developer\u2019s work.\u201d<\/p>\n<p>These aren\u2019t bugs in Claude Code, or Cursor, or Kiro, or any individual agent. They\u2019re properties of the execution model. As long as agents run on the host with the user\u2019s permissions, this category of failure will keep happening, with new variations each time.<\/p>\n<p>Docker Sandboxes doesn\u2019t try to make the agent smarter. It changes where the agent runs. The agent gets a workspace. 
It does not get your machine.<\/p>\n<p><em>Coming up in our series: Issue 3 will explore the AWS Cost Explorer outage, where Amazon\u2019s own Kiro agent decided to delete and rebuild a production environment in seconds, and what scoped-identity sandbox configuration prevents that class of failure.<\/em><\/p>\n<h3 class=\"wp-block-heading\">Learn More<\/h3>\n<ul class=\"wp-block-list\">\n<li><strong>Run agents safely with Docker Sandboxes:<\/strong> <a href=\"https:\/\/docs.docker.com\/ai\/sandboxes\/\" rel=\"nofollow noopener\" target=\"_blank\">Visit the Docker Sandboxes documentation<\/a> to get started with workspace-isolated agent execution in minutes.<\/li>\n<li><strong>Explore the Docker MCP Catalog:<\/strong> <a href=\"https:\/\/hub.docker.com\/mcp\" rel=\"nofollow noopener\" target=\"_blank\">Discover MCP servers<\/a> that connect your agents to external services through Docker\u2019s security-first architecture.<\/li>\n<li><strong>Download Docker Desktop:<\/strong> <a href=\"https:\/\/www.docker.com\/products\/docker-desktop\/\">The fastest path to a governed AI agent environment<\/a>, with Docker Sandboxes, MCP Gateway, and Model Runner in a single install.<\/li>\n<li><strong>Read the MCP Horror Stories series:<\/strong> <a href=\"https:\/\/www.docker.com\/blog\/mcp-security-issues-threatening-ai-infrastructure\/\">Start with issue 1<\/a> to understand the protocol-layer security risks that complement the agent-layer risks covered here.<\/li>\n<\/ul>","protected":false},"excerpt":{"rendered":"<p>This is Part 2 of our AI Coding Agent Horror Stories series, an in-depth look at real-world security incidents exposing 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":4197,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"default","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","ast-disable-related-posts":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"footnotes":""},"categories":[4],"tags":[],"class_list":["post-4196","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-docker"],"_links":{"self":[{"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/posts\/4196","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/comments?post=4196"}],"version-history":[{"count":0,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/posts\/4196\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/media\/4197"}],"wp:attachment":[{"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/media?parent=4196"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/categories?post=4196"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/tags?post=4196"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}