
A Solution to Fixing Containers when lxcfs Crashes #583

Open
deleriux opened this issue Jan 18, 2023 · 6 comments
Labels: Feature (New feature, not a bug)

deleriux commented Jan 18, 2023

Hello all,

I briefly mentioned last week that I had a solution to the "Transport endpoint is not connected" errors in containers when lxcfs crashes, one that avoids having to restart every affected container.

I've uploaded the code I have, as-is:
https://github.com/deleriux/lxcfs-reattach

I've tested it on Ubuntu 22 and Ubuntu 16 (with an updated kernel).

The way this works is by utilizing the system calls introduced into the kernel post-5.2 that split the mount operation into multiple steps, see:

https://lwn.net/Articles/759499/

You can leverage this step-by-step approach to open the source path while in the host's mount namespace, then switch to the container's mount namespace and attach it at the target path there.

The algorithm is basically as follows:

  1. Enter the mount namespace lxcfs is running in.
  2. Locate the particular bind mount you are interested in fixing, e.g. /var/lib/lxcfs/proc/meminfo.
  3. Call open_tree() on the path to obtain a mount fd representing this mount point.
  4. Enter the container's mount namespace (you've now snatched the mount fd out of the host's namespace!).
  5. Call umount2() on the container's path to /proc/meminfo.
  6. Call move_mount() against /proc/meminfo to reattach this mountpoint to the container's VFS.

The code for this part is kept in https://github.com/deleriux/lxcfs-reattach/blob/main/container.c#L145.
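
For anyone skimming, here is a condensed, illustrative sketch of those six steps. This is not the code from container.c: the namespace fds are assumed to have been opened from /proc/\<pid\>/ns/mnt beforehand, error handling is minimal, and the fallback syscall numbers are the x86_64 ones (guarded for headers that lack them).

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <sys/mount.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef SYS_open_tree           /* x86_64 syscall numbers */
#define SYS_open_tree 428
#define SYS_move_mount 429
#endif
#ifndef OPEN_TREE_CLONE
#define OPEN_TREE_CLONE 1
#define MOVE_MOUNT_F_EMPTY_PATH 0x00000004
#endif

/* host_ns and ctr_ns are fds opened from /proc/<pid>/ns/mnt. */
static int rebind_one(int host_ns, int ctr_ns,
                      const char *src,  /* e.g. /var/lib/lxcfs/proc/meminfo */
                      const char *dst)  /* e.g. /proc/meminfo */
{
    /* Steps 1-3: enter the lxcfs mount namespace and detach a clone
     * of the bind mount as a floating mount fd. */
    if (setns(host_ns, CLONE_NEWNS) < 0)
        return -1;
    int mfd = syscall(SYS_open_tree, AT_FDCWD, src, OPEN_TREE_CLONE);
    if (mfd < 0)
        return -1;

    /* Step 4: switch into the container's mount namespace; the mount
     * fd we snatched survives the namespace change. */
    if (setns(ctr_ns, CLONE_NEWNS) < 0)
        goto fail;

    /* Step 5: lazily drop the dead mount (its FUSE connection is gone). */
    umount2(dst, MNT_DETACH);

    /* Step 6: graft the snatched mount onto the container's VFS. */
    if (syscall(SYS_move_mount, mfd, "", AT_FDCWD, dst,
                MOVE_MOUNT_F_EMPTY_PATH) < 0)
        goto fail;

    close(mfd);
    return 0;

fail:
    close(mfd);
    return -1;
}
```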

The remaining code is mostly dedicated to heuristics for finding containers to fix and mountpoints to monitor.
I'm pretty sure it's littered with stupid bugs, but it works.

The process supports a monitor mode that uses epoll() against every discovered /proc/\<pid\>/mounts file to watch mounts come and go. If a qualifying mountpoint is unmounted and then remounted (such as when lxcfs gets restarted), the process detects it and issues a request to test, then rebind, mountpoints that no longer work.
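
A rough sketch of that watch loop, assuming the usual procfs behaviour where a change to a mounts file is signalled as EPOLLERR|EPOLLPRI rather than EPOLLIN (the pid and path here are placeholders, not the tool's real discovery logic):

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/epoll.h>
#include <unistd.h>

int main(void)
{
    int epfd = epoll_create1(EPOLL_CLOEXEC);
    /* One such fd per discovered container; pid 1234 is hypothetical. */
    int mfd = open("/proc/1234/mounts", O_RDONLY);
    if (epfd < 0 || mfd < 0)
        return 1;

    struct epoll_event ev = {
        .events = EPOLLERR | EPOLLPRI,  /* how mount-table churn shows up */
        .data.fd = mfd,
    };
    epoll_ctl(epfd, EPOLL_CTL_ADD, mfd, &ev);

    struct epoll_event out;
    while (epoll_wait(epfd, &out, 1, -1) > 0) {
        /* A mount came or went in that namespace: re-read the mounts
         * file and test whether the lxcfs entries still answer reads. */
        printf("mount table changed (fd %d)\n", out.data.fd);
    }
    return 0;
}
```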

If lxcfs crashes and is not restarted, it can't help there, but as soon as a new instance comes up it should rebind the mountpoints pretty quickly.

My code doesn't (and can't) distinguish which lxcfs process to use when rebinding mountpoints; it merely selects the 'best/first' working one and runs with it. This is particularly noticeable with LXD installed as a snap, which tends to run its own lxcfs alongside the system's lxcfs, which can also be running.
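
For what it's worth, the "working" test can be as simple as probing a known lxcfs path and seeing whether reads still succeed; a crashed lxcfs leaves its mounts failing with ENOTCONN ("Transport endpoint is not connected"). A hedged sketch, with a placeholder helper name:

```c
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Returns true if the lxcfs file at `path` still answers reads. */
static bool lxcfs_mount_alive(const char *path)
{
    char buf[64];
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return false;  /* typically errno == ENOTCONN for a dead mount */
    ssize_t n = read(fd, buf, sizeof(buf));
    close(fd);
    return n >= 0;
}
```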

I'm not suggesting this is the best or only solution to this problem (or that my code, in its current form, is suitable for this project), but the algorithm for fixing running containers is pretty straightforward and tends to work flawlessly without being too disruptive.

mihalicyn (Member)

Hi @deleriux

That's a good idea. As I said before, we are currently working on an internal lxcfs mechanism to recover from crashes, but this is a good solution for cases where rebooting all containers is problematic.

stgraber (Member)

We'll need to be very, very careful when doing something like this, as root in the container can mess with the mount namespace.
So we may be tricked into traversing symlinks, get locked up by hitting an intentionally broken FUSE mount, ...

That's the reason we never invested too much effort into injecting LXCFS mounts into an existing instance.
Even prior to the new mount API, we had a workaround using mount propagation to add/remove mounts from containers, but that still had the same security concerns attached to it.

I certainly feel a lot better about the current plan from @mihalicyn to allow recovering from a lxcfs crash by re-attaching to the existing FUSE mounts.

stgraber added the Feature (New feature, not a bug) label on Sep 29, 2023
stgraber (Member)

@mihalicyn we can re-use this one to track the FUSE re-attach work

zhoushuke

> @mihalicyn we can re-use this one to track the FUSE re-attach work

@stgraber so, would this fix be added to the next release?

mihalicyn (Member)

> @stgraber so, would this fix be added to the next release?

I would say it won't be addressed in the next release; we need to make some changes in the Linux kernel as part of this work. But it will definitely be implemented in LXCFS.

Do you have any issues with LXCFS right now?

zhoushuke

@mihalicyn any update?
