This is an exploratory repo with information on how to use Linux capabilities(7) in AKS.
- After Kubernetes 1.21, it looks like capabilities(7) only work when runAsUser is set to 0, meaning root. This is the code that prevents any user other than root from having capabilities. It was added by the commit referenced here:

    // Clear all ambient capabilities. The implication of non-root + caps
    // is not clearly defined in Kubernetes.
    // See https://github.com/kubernetes/kubernetes/issues/56374
    // Keep docker's behavior for now.
    specOpts = append(specOpts,
        customopts.WithoutAmbientCaps,
        customopts.WithSelinuxLabels(processLabel, mountLabel),
    )

- Building on the previous note, we can add capabilities to root or drop them from it, which lets us remove many of the superpowers that root has by default (e.g. cap_net_admin). In the example below we grant cap_ipc_lock to the running user (root) and nothing else.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: gbbapp
      name: gbbapp
      namespace: ns-gbb
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: gbbapp
      template:
        metadata:
          labels:
            app: gbbapp
        spec:
          containers:
          - name: gbbapp
            image: gbbapp/k8s:cfa
            command: ["/bin/bash"]
            args: ["-c", "sleep 3600"]
            securityContext:
              runAsUser: 0
              capabilities:
                drop: ["ALL"]
                add: ["IPC_LOCK"]

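To confirm that the container ended up with only cap_ipc_lock, one option is to read the effective capability mask that the kernel reports in /proc/<pid>/status and test individual bits. The sketch below is a minimal shell example under a few assumptions: the helper names cap_field and has_cap are hypothetical, the kubectl invocation in the comments assumes the Deployment above is running and that the image provides cat, and the bit numbers (14 for CAP_IPC_LOCK, 12 for CAP_NET_ADMIN) are the values defined in the kernel's linux/capability.h.

```shell
# cap_field NAME: read /proc/<pid>/status-style text on stdin and
# print the hex mask of the named capability set (e.g. CapEff).
cap_field() {
  awk -v name="$1:" '$1 == name { print $2 }'
}

# has_cap MASK BIT: succeed if capability number BIT is set in the hex MASK.
has_cap() {
  [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# Possible usage against the Deployment above (not run here):
#   mask=$(kubectl exec -n ns-gbb deploy/gbbapp -- cat /proc/1/status | cap_field CapEff)
#   has_cap "$mask" 14 && echo "cap_ipc_lock is effective"
#   has_cap "$mask" 12 || echo "cap_net_admin is not effective"
```

With drop: ["ALL"] and add: ["IPC_LOCK"], the effective mask should have only bit 14 set, which corresponds to 0000000000004000.
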
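It is possible to add capabilities during the image build with docker/podman. With this approach you can remove the runAsUser parameter altogether. File capabilities set at build time persist when the image runs as a container. The capability must still be present in the container's bounding set (for example through capabilities.add in the pod spec) for the kernel to grant it when the binary is executed. The following example adds cap_ipc_lock to python3.8:
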
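- Create a Dockerfile:

    # Assumption: ubuntu:20.04 is pinned and python3.8 is installed explicitly,
    # because the base ubuntu image does not ship /usr/bin/python3.8.
    FROM ubuntu:20.04
    RUN apt-get update && apt-get install -y libcap2-bin python3.8 && \
        setcap cap_ipc_lock+eip /usr/bin/python3.8
    CMD ["/bin/bash"]
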
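- Create an ACR instance:

    RESOURCE_GROUP_NAME=rg-setcomp
    LOCATION=westus
    ACR_NAME=myacrname
    az group create --name ${RESOURCE_GROUP_NAME} --location ${LOCATION}
    # az acr create requires a resource group and a SKU; Basic is an assumption here
    az acr create -g ${RESOURCE_GROUP_NAME} -n ${ACR_NAME} -l ${LOCATION} --sku Basic
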
- Build the image and push it to ACR:

    # -r takes the registry name; the image name and tag go in -t
    az acr build -r ${ACR_NAME} -t setcomp:{{.Run.ID}} .

- Exec into the container and verify that the capability was added by running the getcap command against a binary, which in this case is python3.8:

    $ getcap /usr/bin/python3.8
    /usr/bin/python3.8 = cap_ipc_lock+eip
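
The check above can also be scripted, for example as a smoke test after the image is built. This is only a sketch: the helper name binary_has_cap is hypothetical, and it matches on the capability name rather than on the whole line because the exact layout of getcap output may differ between libcap versions.

```shell
# binary_has_cap CAP: read getcap output on stdin and succeed
# if the capability name CAP appears as a whole word.
binary_has_cap() {
  grep -qw -- "$1"
}

# Possible usage inside the container (not run here):
#   getcap /usr/bin/python3.8 | binary_has_cap cap_ipc_lock \
#     && echo "python3.8 carries cap_ipc_lock"
```
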