Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run Docker container as unprivileged user, allow PUID/PGID selection #722

Open
wants to merge 14 commits into
base: master
Choose a base branch
from

Conversation

lkubb
Copy link

@lkubb lkubb commented Jun 25, 2022

Currently, the container process runs with root privileges. This cannot be changed, even by specifying PUID/PGID (as suggested in docker-compose.yml).

This PR migrates to running as an unprivileged user and makes it possible to specify PUID/PGID environment variables to choose UID/GID. Migration of existing data owned by root is handled transparently.

A side-effect is that the /app directory is read-only to the python process, I'm not sure if that is a problem since I'm not familiar with the architecture of changedetection.

I also took the liberty to clean up the Dockerfile (and apt cache) a bit, hope that's not a problem. :) Please feel free to suggest changes or deny the PR, if it does not fit your vision.

Fixes #565.

@lkubb lkubb marked this pull request as ready for review June 25, 2022 13:28

set -eu

# If the first argument looks like a flag, assume we want to run changedetection
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm what is the problem you're trying to solve here?

Copy link
Author

@lkubb lkubb Jul 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming you're referring to the line below: If the container is e.g. run with docker-compose run changedetection -c, this would run python ./changedetection.py -d /datastore -c inside the container. It's mostly a convention, nothing that's necessary, just nice to have. If you prefer it removed, no worries. :)

docker-entrypoint.sh Outdated Show resolved Hide resolved
Copy link
Owner

@dgtlmoon dgtlmoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see PR comments - mainly about being sure it's actually /datastore ?

Dockerfile Show resolved Hide resolved
Dockerfile Outdated
libxslt-dev \
zlib1g-dev && \
rm -rf /var/lib/apt/lists/*; \
useradd -u 911 -U -d /datastore -s /bin/false changedetection && \
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why 911 ? :) Is it better to choose a UID > 1000 ?

Copy link
Author

@lkubb lkubb Jul 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the standard one used by linuxserver.io and makes it a system user, but a higher default one like 3000 is fine as well imho. Would you prefer the default changed? It can be chosen freely anyways (docker-entrypoint.sh:L16-20).

docker-entrypoint.sh Outdated Show resolved Hide resolved
@dgtlmoon
Copy link
Owner

master now tests the application inside the fully built docker container

also need to test this on an existing install which used root.root

Dockerfile Outdated Show resolved Hide resolved
Copy link
Owner

@dgtlmoon dgtlmoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unsure

@dgtlmoon
Copy link
Owner

@lkubb biggest one so far, gosu not found

@dgtlmoon dgtlmoon self-requested a review December 22, 2022 10:32
@lkubb lkubb force-pushed the docker-unprivileged branch 2 times, most recently from 0f6a310 to e26de1f Compare December 22, 2022 10:55
@dgtlmoon
Copy link
Owner

hmm when it's running as test, it needs to use /datastore but only when run in github/automated, because often people test locally when developing too

@dgtlmoon
Copy link
Owner

datastore_path = "./test-datastore"

somehow needs a var passed to pytest like pytest -d /datastore or.. unsure

@lkubb
Copy link
Author

lkubb commented Dec 22, 2022

a) Introduce a new env var DATASTORE_TESTING, which defaults to test-datastore, but override it in the github workflow
b) Just mkdir the test directory in the docker build script. Not very clean but it shouldn't hurt.

@dgtlmoon
Copy link
Owner

b) Just mkdir the test directory in the docker build script. Not very clean but it shouldn't hurt.

This is fine if theres a comment above like
# creating test-dir for pytest to run in

@lkubb
Copy link
Author

lkubb commented Dec 22, 2022

I added mkdir earlier, but you will need to approve the workflow.

Dockerfile Outdated Show resolved Hide resolved
@dgtlmoon
Copy link
Owner

dgtlmoon commented Dec 22, 2022

needs /app/changedetectionio/.pytest_cache

but why not just chown all the /app to the same UID/GUID that we run the container as?

@lkubb
Copy link
Author

lkubb commented Dec 23, 2022

Because usually, there is no good reason for an application to be able to modify itself. We could of course disable all that with a build arg specifically for the workflow, but I would be very hesitant to chown the app folder for production.

@dgtlmoon
Copy link
Owner

haha I love tests, what looked like a simple PR always shows up something

FAILED tests/test_backup.py::test_backup - PermissionError: [Errno 13] Permis...

so /datastore directory isnt writable (it creates the backup ZIP there)

@lkubb
Copy link
Author

lkubb commented Dec 23, 2022

so /datastore directory isnt writable (it creates the backup ZIP there)

From what I can tell, it creates it as /app/changedetectionio/download.zip, right?

============================= test session starts ==============================
platform linux -- Python 3.10.9, pytest-6.2.5, py-1.11.0, pluggy-1.0.0
rootdir: /app/changedetectionio, configfile: pytest.ini
plugins: flask-1.2.0
collected 1 item

tests/test_backup.py::test_backup 
[...]
>       with open("download.zip", 'wb') as f:
E       PermissionError: [Errno 13] Permission denied: 'download.zip'

@dgtlmoon
Copy link
Owner

Oops right you are, that test needs fixing so its using the test-datastore directory

@dgtlmoon
Copy link
Owner

Or better is that it doesnt even write the file to disk and just stores it memory

@dgtlmoon dgtlmoon self-requested a review December 23, 2022 09:45
@lkubb
Copy link
Author

lkubb commented Nov 8, 2024

@duketuxem

This PR looked very promising—any chance we might see it merged soon?

Honestly, I got tired of explaining at some point and decided to run the container in Podman rootless with userns remapping.

I rebased on master once again, let's see if the tests still work (@dgtlmoon needs to approve them though). It has been a while since I created this PR, hope I didn't break anything during the rebase.

@dgtlmoon
Copy link
Owner

dgtlmoon commented Nov 8, 2024

thanks - running again and will test manually

@lkubb
Copy link
Author

lkubb commented Nov 8, 2024

It seems another test tries to write to the app directory, need to have a look

@dgtlmoon It's this fixture:

@pytest.fixture(scope='function')
def measure_memory_usage(request):
    memory_usage = {"peak": 0, "stop": False}
    tracker_thread = Thread(target=track_memory, args=(memory_usage,))
    tracker_thread.start()

    yield

    memory_usage["stop"] = True
    tracker_thread.join()

    # Note: ru_maxrss is in kilobytes on Unix-based systems
    max_memory_used = memory_usage["peak"] / 1024  # Convert to MB
    s = f"Peak memory used by the test {request.node.fspath} - '{request.node.name}': {max_memory_used:.2f} MB"
    logger.debug(s)

    with open("test-memory.log", 'a') as f:
        f.write(f"{s}\n")

I can't find a reference to test-memory.log in the whole repository, is this still in use? Can it be replaced with a log message?

Edit: If you prefer to leave it as-is, we always the option of setting KEEP_PRIVILEGES=1 in the test suite, but it might mean that some issues are missed.

@dgtlmoon
Copy link
Owner

dgtlmoon commented Nov 8, 2024

to be honest the whole memory test thing is not needed.. you can remove it, or on another PR, it didnt really add anything interesting :)

@lkubb
Copy link
Author

lkubb commented Nov 8, 2024

Alright, I chose to just remove writing the file since the rest of the code does not interfere.

@dgtlmoon
Copy link
Owner

dgtlmoon commented Nov 8, 2024

agreed, no problems, some things seem like a nice idea - but just get in the way

@dgtlmoon
Copy link
Owner

dgtlmoon commented Nov 8, 2024

sometimes there are "random" fails, i'll keep an eye on it

@lkubb
Copy link
Author

lkubb commented Nov 8, 2024

sometimes there are "random" fails, i'll keep an eye on it

👍 It's still complaining that /app/changedetectionio/.pytest_cache is not writable, might need to account for that somehow, maybe create the directory in the entrypoint if it detects we're running pytest.

@dgtlmoon
Copy link
Owner

dgtlmoon commented Nov 8, 2024

@lkubb
Copy link
Author

lkubb commented Nov 8, 2024

The SMTP tests were failing because aiosmtp could not be installed since the changedetection user does not have a home directory. I added one, the SMTP container should start fine now.

@lkubb
Copy link
Author

lkubb commented Nov 8, 2024

Alright, all tests are passing now. For most setups, the entrypoint script should automatically chown the datastore to the correct user. The one exception is in case it is a network share mount, where it might fail. In this case, admins must ensure the correct PUID/PGID are set in the container so the script does not attempt to chown any files.

In case this logic causes any problems, it can be skipped completely by setting the KEEP_PRIVILEGES environment variable to a non-empty value. This results in the previous behavior.

I'd really like to wrap this PR up soon since it has been going for nearly two and a half years. :)

@dgtlmoon
Copy link
Owner

dgtlmoon commented Nov 8, 2024

amazing thanks, will have a play with it - yeah its one of those "might break something at some time for somewhere" commits but it NEEDS to be in the codebase

@dgtlmoon
Copy link
Owner

dgtlmoon commented Nov 8, 2024

can you think of some test cases to try? like i already have a datastore owned by some user and...

@lkubb
Copy link
Author

lkubb commented Nov 8, 2024

Unless otherwise noted, everything assumes 0755/0644 dir/file permissions (or 0 for others, just not 7/6).

First, I would verify everything works as before when KEEP_PRIVILEGES is set to provide an immediate bandaid to any problems.

Then, with a root-owned (populated) local datastore:

  1. Starting the container changes the ownership to 911:911 and everything works fine.
  2. Starting the container with PUID=1000 changes the ownership to 1000:911
  3. Starting the container with PGID=1000 changes the ownership to 911:1000
  4. Starting the container with PGID=1000 PGID=1000 changes the ownership to 1000:1000

Then chown -R the datastore to 1234:5678 and do 1 again.
Then chown -R the datastore to 911:0, start the container and verify nothing was changed.
Then chown -R the datastore to 0:911 and do 1 again.
Then chown -R the datastore to 0:911, chmod -R g+w, start the container and verify nothing was changed.

You should check that running the container rootless (still?) works with 1 (/4) (so root here is actually the local user account, PUID/PGID are relative to the user namespace).

I'd probably check that correctly setup NFS/SAMBA mounts work.

Tbh I'm not familiar with the application architecture, so I'm not sure if there are any other paths that should be tested.

Edit: I misremembered what the entrypoint script does. It chowns the user only, not the group. This could be a problem in case someone has mounted a network share intending to use group write permissions (the amount of setups like this is probably very low).

I'm currently considering if it would be a better idea to just check if a specific uid:gid has write permissions for the data dir and if not, chown -R it once.

Edit2: The entrypoint now only checks if the datastore is writable and tries to chown -R it if it's not.

@dgtlmoon
Copy link
Owner

dgtlmoon commented Nov 8, 2024

I'm currently considering if it would be a better idea to just check if a specific uid:gid has write permissions for the data dir and if not, chown -R it once.

better than doing it every time, some datasets are pretty big

This got lost during a rebase, it's not necessary anymore since the env
var is set inside the container by default and changedetection respects it.
@lkubb
Copy link
Author

lkubb commented Nov 9, 2024

Alright, the entrypoint now checks writability of the datastore by the designated PUID:PGID combination and only intervenes if that fails. I'll adjust the proposed test cases above.

# On Windows, create and use a default path.
if os.name == 'nt':
elif os.name == 'nt':
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldnt this stay as if ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it would then not respect the DATASTORE_PATH defined by the user (or the default one set during container build time).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Folders and files still owned by root even with PUID and PGID specified
3 participants