Notes on hardening quadlets #4

@ElectricTea

Hello! I found this repo from your post on this discussion: containers/podman#13728 (comment)

I'm excited to see this repo. My rootless quadlets are structured similarly to yours, using one pod per dedicated host user.

Here are some notes from my quadlets. I hope you may find some of it useful.

# {{ APP_NAME }}.container

[Container]

# Make the container's root filesystem read-only (volumes are unaffected).
# Causes problems in many apps that write outside their volumes, so test before enabling.
#ReadOnly=true

# Additional Tmpfs mounts (beyond the default /tmp) are sometimes needed to make read-only work for containers that write to other cache paths.
# For example, Next.js apps may write to /app/apps/web/.next/cache
#Tmpfs=/app/apps/web/.next/cache

# Prevent privilege escalation after startup
NoNewPrivileges=true

# Drop all Linux capabilities to reduce attack surface.
# This will likely drop capabilities the app needs. Those must be added back through trial and error.
# https://man7.org/linux/man-pages/man7/capabilities.7.html
DropCapability=ALL

# Allows changing file ownership. Needed when containers modify file/directory ownership (e.g., during package installation or configuration).
# High risk, but commonly required. Can be omitted for read-only or stateless apps.
AddCapability=CHOWN

# Lets the container's root bypass permission checks on operations that normally require the process to own the file (e.g., chmod, utime), so it can change permissions on files owned by other users. Changing ownership itself is covered by CHOWN.
# High risk, but commonly required. Can be omitted for read-only or stateless apps.
AddCapability=FOWNER

# Allows changing user IDs. Critical for services that start as root and drop to a non-privileged user.
# Commonly required for Docker containers intended to start as root. However, if the app doesn't switch users, it's unnecessary and increases risk. Can be leveraged to break unprivileged user namespaces and attack the kernel (CVE-2023-0386).
AddCapability=SETUID

# Allows changing group IDs. Necessary for applications that switch user/group contexts (e.g., daemons dropping privileges).
# Commonly required for Docker containers intended to start as root. Carries similar high risks to SETUID. Can be omitted for simple apps.
AddCapability=SETGID

# Prevents clearing the set-user-ID and set-group-ID bits when modifying a file. Ensures proper permission inheritance. Needed for correct permission handling in some software.
# Low risk by itself, higher risk with FOWNER and SETUID, commonly required. Can be omitted for read-only or stateless apps.
AddCapability=FSETID

# Allows sending signals to processes owned by other users. Rarely needed; left disabled here.
#AddCapability=KILL

# Allows binding to ports below 1024 (e.g., HTTP/80, HTTPS/443).
# Required only if the application listens on a privileged port (< 1024) inside the container. Otherwise, unnecessary.
AddCapability=NET_BIND_SERVICE

# Security Note: Avoid adding capabilities like CAP_SYS_ADMIN (CVE-2018-18955, CVE-2022-0185, CVE-2022-0492), CAP_NET_ADMIN (CVE-2017-7184, CVE-2024-1086), CAP_NET_RAW (CVE-2017-7308), or CAP_DAC_OVERRIDE (CVE-2022-0492, CVE-2023-0386) unless strictly necessary, as they expose more of the Linux kernel's attack surface.
# CHOWN, FOWNER, SETUID, SETGID, and FSETID are commonly required for containers to function, but should be dropped whenever the app works without them.

# Allows a process to bypass file read, write, and execute permission checks.
# High risk, but sometimes required. Many PHP web applications (like those using PHP-FPM) require it to create or access Unix sockets.
#AddCapability=DAC_OVERRIDE

# Allows a process to change its root directory (and that of its child processes) via chroot(2). Used by the Chromium sandbox.
# High risk, but not known to be exploited in isolation. Required for the Chrome browser sandbox.
#AddCapability=SYS_CHROOT
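
As a rough sketch of where these notes lead for a stateless app, the stricter settings can be combined with no capabilities added back. The file name and the `Image=` value are hypothetical placeholders, and the Tmpfs path is just the Next.js example from above. This assumes the app listens on a port at or above 1024 and never switches users.

```ini
# example-stateless.container (hypothetical file name)

[Container]
# Hypothetical image reference; substitute your own
Image=localhost/example-app:latest

# Read-only root filesystem, plus a writable tmpfs for the one cache path the app needs
ReadOnly=true
Tmpfs=/app/apps/web/.next/cache

# No privilege escalation and no capabilities at all
NoNewPrivileges=true
DropCapability=ALL
```

If the container fails to start like this, add capabilities back one at a time until it works.
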

# {{ APP_NAME }}-internet.network

[Network]
NetworkName={{ APP_NAME }}-internet

# true  = containers on this network have no internet access
# false = containers on this network can reach the internet
# (false,true)
Internal=false

# Prevent containers on network A from reaching containers on network B via direct IP connections. This requires both networks to have this option set to true/strict.
# In the past, true didn't work due to an iptables bug, but strict worked fine.
# (false,true,strict)
Options=isolate=strict
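
For completeness, a container has to opt in to this network. Here is a minimal sketch of the matching line in the container unit, assuming the container is not part of a pod (with a pod, I believe the same `Network=` key belongs in the `.pod` unit instead):

```ini
# {{ APP_NAME }}.container

[Container]
# Join the network defined in {{ APP_NAME }}-internet.network.
# Using the .network file name lets Quadlet order the container after the generated network unit.
Network={{ APP_NAME }}-internet.network
```
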
