To avoid symlink loops to /host in nested chrootenvs we need to remove
one level of indirection. This is also what's generally expected of
/host contents.
* Remove unused argument from pivot_root;
* Factor out tmpdir creation into a separate function;
* Remove unused fstype from bind mount;
* Use unlink instead of a treewalk to remove empty temporary directory.
The problem with stacking chrootenv before was that CLONE_NEWUSER cannot
be used when a child uses chroot. So instead of that we use pivot_root
which replaces root in the whole namespace. This requires our new root
to be an actual fs so we mount tmpfs.
glibc 2.27 (and possibly other versions) can't handle an `nopenfd` value larger than 2^19 in `ntfw`, which is problematic if you've set the maximum number of fds per process to a value higher than that.
Changes:
* doesn't handle root user separately
* doesn't chdir("/") which makes using it seamless
* only bind mounts, doesn't symlink (i.e. files)
Incidentally, fixes#33106.
It's about two times shorter than the previous version, and much
easier to read/follow through. It uses GLib quite heavily, along with
RAII (available in GCC/Clang).
* Wrap LEN macro in parantheses
* Drop env_filter in favor of stateful environ_blacklist_filter,
use execvp instead of execvpe, don't explicitly use environ
* Add argument error logging wherever it makes sense
* Drop strjoin in favor of asprintf
* char* -> const char* where appropriate
* Handle stat errors
* Print user messages with fputs, not errorf
* Abstract away is_str_in (previously bind_blacklisted)
* Cleanup temporary directory on error
* Some minor syntactic and naming changes
Thanks to Jörg Thalheim and Tuomas Tynkkynen for the code review!
The `DISPLAY` environment variable is propagated into chroots built with
`buildFHSUserEnv`, but currently the `XAUTHORITY` variable is not. When
the latter is set, its value is usually necessary in order to connect to
the X server identified by the former.
This matters for users running gdm3, for example, who have `XAUTHORITY`
set to something like `/run/user/1000/gdm/Xauthority` instead of the X
default of `~/.Xauthority`, which doesn't exist in that setup.
Fixes#21532.
This takes another approach at binding FHS directory structure. We
now bind-mount all the root filesystem to directory "/host" in the target tree.
From that we symlink all the directories into the tree if they do not already
exist in FHS structure.
This probably makes `CHROOTENV_EXTRA_BINDS` unnecessary -- its main usecase was
to add bound directories from the host to the sandbox, and we not just symlink
all of them. I plan to get some feedback on its usage and maybe deprecate it.
This also drops old `buildFHSChrootEnv` infrastructure. The main problem with it
is it's very difficult to unmount a recursive-bound directory when mount is not
sandboxed. This problem is a bug even without these changes -- if
you have for example `/home/alice` mounted to somewhere, you wouldn't see
it in `buildFHSChrootEnv` now. With the new directory structure, it's
impossible to use regular bind at all. After some tackling with this I realized
that the fix would be brittle and dangerous (if you don't unmount everything
clearly and proceed to removing the temporary directory, bye-bye fs!). It also
probably doesn't worth it because I haven't heard that someone actually uses it
for a long time, and `buildFHSUserEnv` should cover most cases while being much
more maintainable and safe for the end-user.
Login mode can cause hidden problems, e.g. #12406. Generally we don't want
to read user's .bash_profile when we don't start an interactive shell inside
a chroot.
Previously is was assumed that bash was in the path when calling the
environment setup script. This changes all of the references of bash to
be absolute paths so that the user doesn't have to worry about the
environment they call it with.