Artifact 217af228c38f2e06f6300a60c2bb5b04b4b2571e460095c73e9c2304b5c10738:
D 2025-07-01T02:27:40.494
L User-Based\sPrivilege\sSeparation
N text/x-markdown
U tangent
W 4948
## Motivation

There is an ancient practice in the Unix world where each service gets its own "user." The practice is so old that when it was a new idea, these "system users" got intermixed with the real human kind, and you'd end up with different system user IDs on each box, depending on the order the users were created. Eventually, OSes began reserving some number of the low-numbered IDs (typically 500 or 1000) for themselves, starting the real human users past that limit.

Podman obsoletes all of this. Do not combine the two.

Why? Read on.

## User Namespaces

"Linux container" is a wrapper term for a bunch of disconnected technologies which tools like Podman combine into a useful whole. It is useful to think of this assortment of underlying features as if there had been a concerted effort to add containerization to the Linux kernel, but the fact is that the pieces were added separately over a span of years, in some cases for purposes quite separate from what we now think of as containerization.

I bring this up because it can be important to understand the elements, as in this specific case, where Linux's [namespaces] feature functionally obsoletes the old "system users" practice. In brief, user namespaces (userns for short) provide the same benefit: isolating privilege based on user ID.

Podman takes that further with the concept of [subordinate UIDs][subuid] and [GIDs][subgid].

## Default Behavior

Consider this:

``` shell
$ id=$(podman run --rm -d alpine sleep 60)
$ podman top $id user huser
USER        HUSER
root        501
```

The first command merely starts a dummy container for us to examine, which will disappear a minute after we started it.
The second then tells Podman to report the user IDs involved, which shows this rootless container running under my host-side user ID as 501(^That's the first regular user ID under macOS, showing the split between system users and human users discussed at the top of this article.) even as it appears to be running as `root` inside.

Yet already we have most of the protection afforded by the ancient "system users" concept because of the assortment of technologies brought to bear by Podman under the label "containerization". This `sleep 60` container cannot…

### …access my home directory

To allow that, we would have had to pass something like `--volume $HOME:/home/host:Z --workdir /home/host`, as tools like [Distrobox] go out of their way to do, on purpose.

### …send signals to my processes

Unless you tell it otherwise, Podman puts each container into a separate [pidns], which you can see with:

``` shell
$ podman run --rm -it alpine ps -eaf
PID   USER     TIME  COMMAND
    1 root      0:00 ps -eaf
```

We're running as a fake "root" user in this instance, and we gave `ps` the "show me everything" flags, yet the only process we see is the one for `ps` itself. Also note that it appears to be PID 1, whereas the real PID 1 on my host is `/sbin/launchd`, this being a Mac.

### …communicate with my background processes

This one isn't hard-and-fast.
Only *some* of the paths are blocked off by rootless Podman's default configuration:

*   **old-school System V IPC** is blocked off by having each container run in a separate [ipcns] by default
*   **Unix domain sockets** appear in the filesystem, so the prior point applies: if you don't map it through with `--volume`, the container can't see it
*   **localhost sockets** are blocked off by the default `--network=pasta` for rootless containers; you have to give `--network=host` to override that

There are two other major ways Linux background processes may allow IPC, however:

*   **listening on 0.0.0.0** allows use of the `host.containers.internal` entry that Podman puts in `/etc/hosts` to give the container access to the host's public IP(^…which might not exist, as with the `podman machine` case.)
*   **[abstract sockets][abssock]** bypass the filesystem namespace but not the network namespace, so they may be visible or not, depending on how you set up your container(^This is a particular worry with containers since old versions of `containerd` used an abstract socket, as does DBus to this day. This can allow powerful effects which are off-topic for this article, so let me simply say that you should avoid use of `--network=host` if a primary goal of your use of containerization is improved security.)

[abssock]:    https://www.man7.org/linux/man-pages/man7/unix.7.html
[Distrobox]:  https://distrobox.it/
[ipcns]:      https://www.man7.org/linux/man-pages/man7/ipc_namespaces.7.html
[namespaces]: https://www.man7.org/linux/man-pages/man7/namespaces.7.html
[pidns]:      https://www.man7.org/linux/man-pages/man7/pid_namespaces.7.html
[subgid]:     https://www.man7.org/linux/man-pages/man5/subgid.5.html
[subuid]:     https://www.man7.org/linux/man-pages/man5/subuid.5.html

Z 3db9e9d31b0c53d922683b1f4852324a