Not able to restore a small code in user namespace #2597

Open
chamber909 opened this issue Feb 13, 2025 · 5 comments · May be fixed by #2600


chamber909 commented Feb 13, 2025

Code I wrote:

[chamber@cognitive ~]$ cat main.c 
#include <stdio.h>
#include <unistd.h>

int main() {
    printf("Process started in user namespace (PID: %d)\n", getpid());
    while (1) {
        printf("Running...\n");
        sleep(1);
    }
    return 0;
}

I ran it in a new user namespace:

[chamber@cognitive ~]$ unshare --user --map-root-user ./exec
Process started in user namespace (PID: 43804)
Running...
Running...
Running...
Running...
Running...
Running...
Running...

Dump command:

sudo ./criu/scripts/criu-ns dump -t 43804 -D checkpoint/ -j -v4

Dump log:

(00.012483) 
(00.012484) Dumping pstree (pid: 43804)
(00.012485) ----------------------------------------
(00.012486) Process: 43804(43804)
(00.012498) ----------------------------------------
(00.012504) Dumping 43804(43804)'s namespaces
(00.012505) Namespaces dump complete
(00.012520) cg: Dumping 1 sets
(00.012522) cg:    `- Dumping  of /user.slice/user-1000.slice/user@1000.service/app.slice/app-org.gnome.Terminal.slice/vte-spawn-472df527-db2e-428c-b76d-068cd9bef18e.scope
(00.012524) cg: Writing CG image
(00.012537) unix: Dumping external sockets
(00.012576) tty: Unpaired slave 1
(00.012579) Writing image inventory (version 1)
(00.012623) Running post-dump scripts
(00.012625) Unfreezing tasks into 2
(00.012626) 	Unseizing 43804 into 2
(00.012740) Writing stats
(00.012758) Dumping finished successfully

Restore command:

sudo ./criu/scripts/criu-ns restore -D checkpoint/ -v4

Restore log:

['./criu/scripts/criu-ns', 'restore', '-D', 'checkpoint/', '-v4']
(00.000002) Version: 3.16.1 (gitid 0)
(00.000011) Running on cognitive.local Linux 6.8.0-52-generic #53~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Wed Jan 15 19:18:46 UTC 2 x86_64
(00.000018) Loaded kdat cache from /run/criu.kdat
(00.000025) rlimit: RLIMIT_NOFILE unlimited for self
(00.000080) cpu: x86_family 6 x86_vendor_id GenuineIntel x86_model_id 11th Gen Intel(R) Core(TM) i3-1125G4 @ 2.00GHz
(00.000083) cpu: fpu: xfeatures_mask 0x2e5 xsave_size 2696 xsave_size_max 2696 xsaves_size 2456
(00.000088) cpu: fpu: x87 floating point registers     xstate_offsets      0 / 0      xstate_sizes    160 / 160   
(00.000090) cpu: fpu: AVX registers                    xstate_offsets    576 / 576    xstate_sizes    256 / 256   
(00.000093) cpu: fpu: AVX-512 opmask                   xstate_offsets   1088 / 832    xstate_sizes     64 / 64    
(00.000094) cpu: fpu: AVX-512 Hi256                    xstate_offsets   1152 / 896    xstate_sizes    512 / 512   
(00.000097) cpu: fpu: AVX-512 ZMM_Hi256                xstate_offsets   1664 / 1408   xstate_sizes   1024 / 1024  
(00.000099) cpu: fpu: Protection Keys User registers   xstate_offsets   2688 / 2432   xstate_sizes      8 / 8     
(00.000102) cpu: fpu:1 fxsr:1 xsave:1 xsaveopt:1 xsavec:1 xgetbv1:1 xsaves:1
(00.000126) kernel pid_max=4194304
(00.000128) Reading image tree
(00.000142) Add mnt ns 6 pid 43804
(00.000145) Add net ns 2 pid 43804
(00.000146) Add pid ns 1 pid 43804
(00.000148) pstree pid_max=43804
(00.000151) Will restore in 10000000 namespaces
(00.000153) NS mask to use 10000000
(00.000170) Collecting 51/56 (flags 3)
(00.000173) No memfd.img image
(00.000175)  `- ... done
(00.000176) Collecting 40/54 (flags 2)
(00.000182) Collected [home/chamber/exec] ID 0x1
(00.000185) Collected [usr/lib/x86_64-linux-gnu/libc.so.6] ID 0x2
(00.000188) Collected [usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2] ID 0x3
(00.000190) Collected [dev/pts/1] ID 0x5
(00.000194) Collected [home/chamber] ID 0x6
(00.000196) Collected [.] ID 0x7
(00.000198)  `- ... done
(00.000200) Collecting 46/68 (flags 0)
(00.000202) No remap-fpath.img image
(00.000204)  `- ... done
(00.000207) No apparmor.img image
(00.000217) cg: Preparing cgroups yard (cgroups restore mode 0x4)
(00.000312) cg: Opening .criu.cgyard.uUTaL7 as cg yard
(00.000319) cg: 	Making controller dir .criu.cgyard.uUTaL7/unifie ()
(00.000337) cg: Determined cgroup dir unifie/user.slice/user-1000.slice/user@1000.service/app.slice/app-org.gnome.Terminal.slice/vte-spawn-472df527-db2e-428c-b76d-068cd9bef18e.scope already exist
(00.000339) cg: Skip restoring properties on cgroup dir unifie/user.slice/user-1000.slice/user@1000.service/app.slice/app-org.gnome.Terminal.slice/vte-spawn-472df527-db2e-428c-b76d-068cd9bef18e.scope
(00.000346) Running pre-restore scripts
(00.000520) No pidns-1.img image
(00.000557) uns: Daemon started
(00.000581) Forking task with 43804 pid (flags 0x10000000)
(00.000583) Creating process using clone3()
(00.000714) PID: real 43804 virt 43804
(00.000776) Wait until namespaces are created
(00.000883)  43804: timens: monotonic -1145 939127350
(00.000892)  43804: timens: boottime -1145 939117920
(00.000933) Running setup-namespaces scripts
(00.000954)  43804: cg: Move into 2
(00.000957)  43804: cg:   `-> unifie//user.slice/user-1000.slice/user@1000.service/app.slice/app-org.gnome.Terminal.slice/vte-spawn-472df527-db2e-428c-b76d-068cd9bef18e.scope/cgroup.procs
(00.000961)  43804: uns: calling userns_move (-1, 0)
(00.001004) uns: daemon calls 0x5925d3ed97e0 (43804, -1, 0)
(00.010497)  43804: Calling restore_sid() for init
(00.010540)  43804: Error (criu/util.c:1392): Unable to open the proc file system: Operation not permitted
(00.010616) uns: calling exit_usernsd (-1, 1)
(00.010636) uns: daemon calls 0x5925d3f14b00 (1, -1, 1)
(00.010644) uns: `- daemon exits w/ 0
(00.011349) Error (criu/cr-restore.c:1480): 43804 killed by signal 9: Killed
(00.011362) uns: daemon stopped
(00.011365) Error (criu/cr-restore.c:2447): Restoring FAILED.

Why does it show "Restoring FAILED"?

@adrianreber
Member

What OS do you use? We have seen problems with Ubuntu 24.04 because of this:

# Ubuntu has set up AppArmor in 24.04 so that it blocks use of user
# namespaces by unprivileged users. We need this for some of our tests.
sysctl kernel.apparmor_restrict_unprivileged_userns=0 || :

Not sure that this is related. Probably not. But it is worth a try.

Also, is that CRIU from Ubuntu? The 3.16 that Ubuntu shipped was broken and they refused to fix it. Try with the latest version that we provide (either PPA or OBS, see the wiki for a way to get binaries).

Why are you using criu-ns? What are you trying to do?

@chamber909
Author

[chamber@cognitive codes]$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.5 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.5 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy

This CRIU is built from a clone of this repo (the latest one). I am trying to make a simple program that only outputs 'Running...':

[chamber@cognitive ~]$ cat main.c 
#include <stdio.h>
#include <unistd.h>

int main() {
    printf("Process started in user namespace (PID: %d)\n", getpid());
    while (1) {
        printf("Running...\n");
        sleep(1);
    }
    return 0;
}

Then I run it in a different user namespace, dump that process (the executable compiled from main.c), and try to restore it, which is where the problem occurs.

The reason I am using criu-ns is this doc: https://www.criu.org/CR_in_namespace

If I use the regular binary (./criu/criu/criu), I get the same output.

@avagin
Member

avagin commented Feb 14, 2025

(00.010540) 43804: Error (criu/util.c:1392): Unable to open the proc file system: Operation not permitted

You are dumping a process in a separate userns but it is in the current mount namespace. I think we never consider that case.

CRIU is trying to mount the proc file system from the restored user ns, and it fails because it doesn't have the required capabilities in the host user namespace, which owns the mount namespace. First, we need to check why it is trying to mount the proc file system. If it is really required, we need to mount it from the host userns. If I remember right, usernsd can be used for that.

@ToolmanP

ToolmanP commented Feb 16, 2025

@avagin Hi, I'm considering participating in GSoC 2025 for the CRIU project, so I'm currently learning this codebase. I happened to stumble upon this issue and am working to solve it. Here are some insights; I'm planning to fix this issue.

(00.010540) 43804: Error (criu/util.c:1392): Unable to open the proc file system: Operation not permitted

You are dumping a process in a separate userns but it is in the current mount namespace. I think we never consider that case.

Somehow the cflags (the clone flags) are propagated into fork_with_pid, and a new user namespace is created while keeping the same mount namespace as the host.

(00.000151) Will restore in 10000000 namespaces

So I guess it does not have the privilege to do the proc mount.

CRIU is trying to mount the proc file system from the restored user ns, and it fails because it doesn't have the required capabilities in the host user namespace, which owns the mount namespace. First, we need to check why it is trying to mount the proc file system. If it is really required, we need to mount it from the host userns. If I remember right, usernsd can be used for that.

Yes, I believe usernsd can do this job because it runs on the host side. We can leverage it to perform mount_proc for the forked child, and the restore should then work, but I'm not sure whether other privileged calls are affected by this corner case. I tried a trivial fix, clearing the NEWUSER flag in the clone flags, but the kernel still seems to reject other actions. Are there other privileged calls that I should look into? I don't think this issue is trivial to fix, so I'm planning to do more research on the code base, also for my potential GSoC participation.

@ToolmanP

@chamber909 @avagin This should be fixed in #2600. Please check whether there is something I might have overlooked. Thanks :-)
