Rocky Linux images resulting in PAM sudo error #56

Open
geerlingguy opened this issue Jan 13, 2025 · 9 comments
Labels
type: bug Something isn't working

Comments

@geerlingguy

When I run my geerlingguy/docker-rockylinux9-ansible containers in CI on GitHub Actions to test my Ansible projects, I see the following error whenever a task runs with sudo/`become`:

  TASK [Gathering Facts] *********************************************************
  fatal: [instance]: FAILED! => {"ansible_facts": {}, "changed": false, "failed_modules": {"ansible.legacy.setup": {"ansible_facts": {"discovered_interpreter_python": "/usr/bin/python3.9"}, "failed": true, "module_stderr": "sudo: PAM account management error: Authentication service cannot retrieve authentication info\nsudo: a password is required\n", "module_stdout": "", "msg": "MODULE FAILURE: No start of json char found\nSee stdout/stderr for the exact error", "rc": 1}}, "msg": "The following modules failed to execute: ansible.legacy.setup\n"}

Other users have reported the same, for both Rocky Linux 8 and 9, for the past few weeks. For example: geerlingguy/docker-rockylinux9-ansible#6

[root@25c3908841c3 /]# sudo "hello world"
sudo: PAM account management error: Authentication service cannot retrieve authentication info
sudo: a password is required

This error is not reproducible on a Mac running Docker Desktop, but it is in instances running docker-ce or on GitHub Actions. We use sudo in the container because it is testing/verifying playbooks that are run against instances where sudo may be required.
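
Since the failure only appears on some hosts, a CI job can look for the exact error string quoted above instead of relying on a generic non-zero exit. The following is a minimal sketch, not part of the original report: `has_pam_account_error` and `check_sudo` are hypothetical helper names, and the sketch assumes `sudo -n` (non-interactive mode) is acceptable for the check.

```shell
#!/bin/sh
# Sketch: detect the PAM failure signature quoted in this issue.
# Both function names are hypothetical helpers, not from the thread.

# Return 0 if the given text contains the PAM account management error.
has_pam_account_error() {
    printf '%s\n' "$1" | grep -q 'PAM account management error'
}

# Run a harmless command through sudo without prompting and inspect
# its stderr. Intended to be called inside the container under test.
check_sudo() {
    # 2>&1 sends stderr into the capture; >/dev/null discards stdout.
    err=$(sudo -n true 2>&1 >/dev/null)
    if has_pam_account_error "$err"; then
        echo "sudo hit the PAM account management error"
        return 1
    fi
    return 0
}
```

Calling `check_sudo` as an early CI step would make the failure show up before any Ansible task runs.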

In the past this was never an issue. It could also be related to the yum install sudo command that I run in my Ansible/Docker project, which (perhaps?) updates PAM: https://github.com/geerlingguy/docker-rockylinux9-ansible/blob/master/Dockerfile#L22

Is there something that's changed in Rocky Linux lately that could be causing this?

@NeilHanlon
Member

Heya Jeff - Thanks for the detailed report.

I'll check this out -- nothing springs to mind but it's totally possible something has changed due to our use of kiwi to build the container root filesystems since 9.4.

NeilHanlon self-assigned this Jan 13, 2025
NeilHanlon added the "type: bug" (Something isn't working) label Jan 13, 2025
@NeilHanlon
Member

Thanks for your patience. We had an outage with our powerpc cluster yesterday that I had to work on.

The error here indicates that pam isn't able to resolve the user running the container to anything in its database(s). Would it be possible to get the contents of /etc/shadow and /etc/passwd on an affected instance, as well as the output of id -u?

@artis3n

artis3n commented Jan 15, 2025

This has been happening for a few weeks to a couple of months but...I can no longer reproduce this with any of the test cases I put into geerlingguy's repo (geerlingguy/docker-rockylinux9-ansible#6 (comment)) besides a GitHub Actions runner. Going to go pull out that troubleshooting info you asked for, but wanted to add that new nuance.

Trying on an Ubuntu 22.04 cloud instance, leaving notes:

- Don't see PAM errors in the rocky linux container now.
- I see docker-ce released a new minor version 2 days ago (27.5.0). But pinning back to the older docker-ce (5:27.4.1-1~ubuntu.24.04~noble and 5:27.4.0-1~ubuntu.24.04~noble) doesn't result in the error anymore either.
- The rockylinux image hasn't been updated since its last tag 2 months ago.
- I don't see new versions for pam, sudo, or the other dependencies (geerlingguy/docker-rockylinux9-ansible#6 (comment)) previously listed, unless the version numbers were overwritten with new changes.

@artis3n

artis3n commented Jan 15, 2025

ansible/molecule#4365 is reporting this same error running on GitHub Actions with registry.access.redhat.com/ubi9/ubi-init:latest

@artis3n

artis3n commented Jan 15, 2025

Looks like the Action runners use Docker-CE 26.x. My suspicion is this is from a kernel/syscall error on the Docker end.

(https://github.com/artis3n/ansible-role-tailscale/actions/runs/12797444601/job/35679327742?pr=532#step:7:15)

Client: Docker Engine - Community
 Version:           26.1.3
 API version:       1.45
 Go version:        go1.21.10
 Git commit:        b72abbb
 Built:             Thu May 16 08:33:35 2024
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          26.1.3
  API version:      1.45 (minimum version 1.24)
  Go version:       go1.21.10
  Git commit:       8e96db1
  Built:            Thu May 16 08:33:35 2024
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.7.24
  GitCommit:        88bf19b2105c8b17560993bee28a01ddc2f97182
 runc:
  Version:          1.2.2
  GitCommit:        v1.2.2-0-g7cb3632
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

> Would it be possible to get the contents of /etc/shadow and /etc/passwd on an affected instance, as well as the output of id -u?

(https://github.com/artis3n/ansible-role-tailscale/actions/runs/12797497586/job/35679504452?pr=532#step:6:207)

id -u
cat /etc/shadow
cat /etc/passwd
0


root:!locked::0:99999:7:::
bin:*:19469:0:99999:7:::
daemon:*:19469:0:99999:7:::
adm:*:19469:0:99999:7:::
lp:*:19469:0:99999:7:::
sync:*:19469:0:99999:7:::
shutdown:*:19469:0:99999:7:::
halt:*:19469:0:99999:7:::
mail:*:19469:0:99999:7:::
operator:*:19469:0:99999:7:::
games:*:19469:0:99999:7:::
ftp:*:19469:0:99999:7:::
nobody:*:19469:0:99999:7:::
tss:!!:19680::::::
systemd-coredump:!!:20101::::::
dbus:!!:20101::::::


root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
nobody:x:65534:65534:Kernel Overflow User:/:/sbin/nologin
tss:x:59:59:Account used for TPM access:/:/usr/sbin/nologin
systemd-coredump:x:999:997:systemd Core Dumper:/:/sbin/nologin
dbus:x:81:81:System message bus:/:/sbin/nologin
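
Given the earlier suggestion that PAM cannot resolve the running user in its databases, one quick check is whether the account appears in both files at all. This is a rough sketch under that assumption: `has_entry` is a hypothetical helper, and it only matches the account name in the first colon-separated field.

```shell
#!/bin/sh
# Sketch: check that an account name appears in a passwd-style or
# shadow-style file (colon-separated, account name in field 1).
# has_entry is a hypothetical helper, not something from this thread.
has_entry() {
    # $1 = account name, $2 = file to search
    awk -F: -v u="$1" '$1 == u { found = 1 } END { exit found ? 0 : 1 }' "$2"
}

# Usage inside the affected container:
#   user=$(id -un)
#   has_entry "$user" /etc/passwd || echo "$user not found in /etc/passwd"
#   has_entry "$user" /etc/shadow || echo "$user not found in /etc/shadow"
```

In the dump above, root is present in both files, so a non-zero result for /etc/shadow would more likely mean the file could not be read than that the entry is missing.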

@RanabirChakraborty

RanabirChakraborty commented Jan 16, 2025

I'm not sure why, but we are seeing the same issue even when running the GitHub Action with Podman instead of Docker.

@NeilHanlon
Member

Heya folks - Have not forgotten about this but looks like it's just moving into other areas and doesn't feel deterministic.

For example, in the OpenStack-Ansible project, we've been having failures like this due to AppArmor when running CentOS Stream 9 containers on Ubuntu hosts, but not Rocky.

Has anyone seen any root cause analysis on this yet? I'm struggling to see common threads to look down.

@geerlingguy
Author

I have not seen anything more, unfortunately :(

I haven't had time to dig any deeper.

@artis3n

artis3n commented Feb 22, 2025

It looks like /etc/shadow is being created without root read permissions.

(ansible task)

    - name: Debugging
      changed_when: false
      register: thing
      ansible.builtin.shell:
        cmd: |
          echo ""
          ls -l /etc/shadow
          ls -l /etc/passwd
          echo ""

    - name: Print debug
      ansible.builtin.debug:
        var: thing.stdout

@andtra realized that in geerlingguy/docker-rockylinux9-ansible#6 (comment)

I can reproduce that as the source issue as well

FWIW this also seems to affect RHEL and Oracle Linux, but not AlmaLinux.
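
The permission check from the Ansible task above can also be run directly in a shell inside the container. This is a minimal sketch, assuming GNU coreutils `stat` (the `-c '%a'` form); `owner_can_read` is a hypothetical helper, and it only inspects the owner read bit rather than testing actual read access.

```shell
#!/bin/sh
# Sketch: report whether the owner read bit is set on a file.
# owner_can_read is a hypothetical helper; stat -c is the GNU coreutils form.
owner_can_read() {
    # %a prints the octal mode without padding, e.g. 0, 400, 640.
    mode=$(stat -c '%a' "$1") || return 2
    # Pad to four digits so the owner digit is always the second character.
    owner_digit=$(printf '%04d' "$mode" | cut -c2)
    case "$owner_digit" in
        4|5|6|7) return 0 ;;   # read bit (value 4) is set
        *)       return 1 ;;
    esac
}

for f in /etc/shadow /etc/passwd; do
    if owner_can_read "$f"; then
        echo "$f: owner read bit set"
    else
        echo "$f: owner read bit NOT set (or stat failed)"
    fi
done
```

Comparing this output between an affected GitHub Actions run and an unaffected host may help confirm whether the file mode is the only difference.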
