Remove the Docker elastic-agent user from the root group #4087
Comments
Pinging @elastic/elastic-agent (Team:Elastic-Agent) |
Should it be the new default, or should it be optional, the same way we are handling the unprivileged option? |
After chatting with Craig we should probably ship two different docker images, the one we already have today and a new one compliant with this request. |
I think we need to establish why this exists first, that is not entirely clear to me. If we consider removing our user from the root group a breaking change, then yes we'll probably need to publish a separate image without it. For Docker ideally the unprivileged one would be the default but I'm not sure if this is possible depending on what it breaks. |
I don't think we should publish two Docker containers. A possible solution is to add a container option that drops the root group from the elastic-agent user in the Docker entrypoint script before it spawns the Elastic Agent. That would allow the Elastic Agent to run without root privileges. |
Making it a runtime option may still see us trigger container security scans if those are only looking at the layers in the container before it is run. That is pretty much my only concern with that approach, that we will be perpetually explaining this for security scan results. |
That can easily be explained with documentation, better than supporting and releasing another container image. |
I am more than happy shipping a unique container though 👍🏼 |
Files are owned by the root group to follow Openshift recommendations, the change in Beats was done in elastic/beats#18873, more info in the linked issues. On this change we also added the default user to the root group. The referred recommendations can be found in https://docs.openshift.com/container-platform/3.11/creating_images/guidelines.html#openshift-specific-guidelines, in the section about supporting arbitrary user IDs:
IIRC, without these changes our images cannot be used in some hardened environments, like Openshift, without giving them full privileges. What I am not sure about now is whether adding the default user to the root group was strictly needed. I think this was needed to avoid issues when running in more vanilla environments where this user is used, but I'm not sure; we'd need to test again. |
I would also prefer to release single images. If we decide to remove the user from the root group, there are some things we can play with:
In any case we must keep the files owned by the root group, for the images to work on Openshift and so on. |
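As a rough illustration of the constraint above: the OpenShift guidance on arbitrary user IDs amounts to making files group-owned by GID 0 and giving the group the same permission bits as the owner (the effect of `chgrp -R 0` plus `chmod -R g=u`). The following is a minimal sketch of a check for that property, not something taken from the agent's packaging; the function names are hypothetical.

```python
import os
import stat
from typing import List


def group_matches_owner(mode: int) -> bool:
    """True when the group permission bits equal the owner bits (chmod g=u)."""
    owner_bits = (mode & stat.S_IRWXU) >> 6
    group_bits = (mode & stat.S_IRWXG) >> 3
    return owner_bits == group_bits


def usable_by_arbitrary_uid(mode: int, gid: int) -> bool:
    """True when a file is owned by group 0 and the group mirrors the owner."""
    return gid == 0 and group_matches_owner(mode)


def find_violations(root: str) -> List[str]:
    """Walk entries below `root` and list paths that fail the check.

    Symlinks are skipped because their own mode bits are not meaningful.
    The `root` directory itself is not checked.
    """
    bad = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if stat.S_ISLNK(st.st_mode):
                continue
            if not usable_by_arbitrary_uid(st.st_mode, st.st_gid):
                bad.append(path)
    return bad
```

Running `find_violations` against the agent's install directory inside an image would list any files that an arbitrary UID in group 0 could not use the way the image's default user can.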
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane) |
hello 👋 today I had a discussion with @cmacknz and @strawgate about something initially irrelevant but with what I showed them it became relevant. Full disclosure, I wasn't aware of this issue as Craig only mentioned it to me today, but I kinda tackled the same issue. So what I did to support rootless agent is the following:
With all the above I have managed this
and when I want agent to still run as rootless but be able to read logs of other containers, all I have to do is specify "DAC_READ_SEARCH" capability in k8s manifest, make sure that the volumemount of the hostpath is readonly (the readOnly is another layer of defense) and I get something like this
this approach can easily be incorporated in the existing single agent container, this is how I have done it. Sorry for the long comment but do let me know how the above sounds to you 🙂 PS: 1000 uid is not mandatory for the above to work, it can play with any UID as long as the proper capabilities are given to the binary and specified as allowed in the k8s manifest |
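For readers who want to picture the manifest side of this, here is a hedged sketch of the relevant pod spec fields, built as a Python dictionary and dumped as JSON. It is not the manifest used in the comment above: the container name, volume name, and host path are hypothetical, and any further capabilities the described setup needs are deliberately left out. Only the `DAC_READ_SEARCH` capability, the non-root UID, and the read-only hostPath mount come from the discussion.

```python
import json


def rootless_pod_spec_fragment(uid: int = 1000, host_log_path: str = "/var/log") -> dict:
    """Build a pod spec fragment for a non-root agent that reads host logs.

    Assumptions: "elastic-agent" and "hostlogs" are illustrative names only.
    """
    return {
        "containers": [
            {
                "name": "elastic-agent",  # hypothetical container name
                "securityContext": {
                    "runAsUser": uid,
                    "runAsNonRoot": True,
                    # Allows the capability to be raised; the thread explains
                    # that a non-root process does not start with it effective.
                    "capabilities": {"add": ["DAC_READ_SEARCH"]},
                },
                "volumeMounts": [
                    {
                        "name": "hostlogs",  # hypothetical volume name
                        "mountPath": host_log_path,
                        "readOnly": True,  # second layer of defense
                    }
                ],
            }
        ],
        "volumes": [{"name": "hostlogs", "hostPath": {"path": host_log_path}}],
    }


if __name__ == "__main__":
    print(json.dumps(rootless_pod_spec_fragment(), indent=2))
```

The UID is a parameter because, as noted above, 1000 is not mandatory for the approach to work.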
@pkoutsovasilis one doubt I have about your proposal: how does it address the case where the agent/beats needs to read root owned files like for OpenShift? Another one I have, why does it need another binary? |
it chowns them first? it can chown anything since it has the CAP_CHOWN capability. Now this fits the bill for inside-container files, for host mount paths (that are not solely used by agent such as the state) where we can't arbitrarily chown files we will add the "DAC_READ_SEARCH" capability in the manifest yml and mount it as readonly. What am I missing?
I am gonna quote my original comment
Because of the above, when you specify a capability in the manifest it means that your process can raise its capabilities to get it, not that it runs with them from the start. The only way for a non-root process to raise its capabilities is through file capabilities set at the binary/file level. Makes sense now?
I think that the extra binary can be incorporated into the agent code so that you have a single binary. However, during my experimentation I didn't want to rebuild the whole agent image, and I kinda prefer to have this segregated from the agent code as an easy thing to discard if necessary |
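One way to see the distinction between a capability being available and being in effect is to read the capability sets Linux exposes in `/proc/<pid>/status` (`CapPrm`, `CapEff`, `CapBnd`, each a hex bitmask). The sketch below decodes a few of them; it is an illustration only, and the helper names are hypothetical. The bit numbers are the ones defined in the Linux capability header.

```python
# Bit positions from linux/capability.h for the capabilities named in this thread.
CAP_BITS = {
    "CAP_CHOWN": 0,
    "CAP_DAC_OVERRIDE": 1,
    "CAP_DAC_READ_SEARCH": 2,
    "CAP_SETFCAP": 31,
}

CAP_FIELDS = ("CapInh", "CapPrm", "CapEff", "CapBnd", "CapAmb")


def parse_cap_sets(status_text: str) -> dict:
    """Extract the capability bitmasks from the text of /proc/<pid>/status."""
    sets = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in CAP_FIELDS:
            sets[key] = int(value.strip(), 16)
    return sets


def has_cap(mask: int, name: str) -> bool:
    """True when the named capability bit is set in the mask."""
    return bool((mask >> CAP_BITS[name]) & 1)


def permitted_but_not_effective(sets: dict, name: str) -> bool:
    """True when the process may raise the capability but has not done so yet."""
    return has_cap(sets["CapPrm"], name) and not has_cap(sets["CapEff"], name)


if __name__ == "__main__":
    with open("/proc/self/status") as f:
        current = parse_cap_sets(f.read())
    for cap in CAP_BITS:
        print(cap, "effective:", has_cap(current["CapEff"], cap))
```

Running this inside the container as the non-root user, before and after the entrypoint logic, would show which sets a given capability actually lands in.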
Drilling into this, are there implications to this approach for when we need to store agent state in a writable hostPathVolume in the node file system? This is what we default to today so that we can store integration state (file offsets, API cursors) and the Fleet enrollment state. Was this in the setup you tested? elastic-agent/deploy/kubernetes/elastic-agent-managed/elastic-agent-managed-daemonset.yaml Lines 156 to 161 in ca3d1f2
|
@pkoutsovasilis It would be great to see the manifest and the code to help me better understand what is being done and the manifest attributes so there is no confusion. I think the technical aspect of the capabilities and the adjustment of the bounding set and effective set make sense, but I question the need for an extra binary. Being that the Elastic Agent already has a sub-command
I have a few more questions:
|
I am gonna forward them to you 🙂
sure we can give the single-binary approach a try; one thing to keep in mind with this approach is that elastic-agent binary now will have to have initially the capabilities at a file level "cap_chmod" and "cap_setfcap" which means that you can't even run
No particular reason I just saw that the existing agent image is setting this in its entrypoint so I just thought of adding it 😁
yes the agent state hostpath was my main motivation to have the cap_chown from the beginning. It just happened that Craig also mentioned this particular issue and I extended it to also chown inner-container files. In simple words, yes we can chown anything we want and it's not read only 😄 |
That might be a good reason to keep it separate then. Just for more clarity on this, are you saying that if a user were to
Okay.
Good! |
if PS: when I say two binaries, I mean keep the elastic-agent binary as is and make another one that does all this logic of chowning and setting process capabilities. This second one is the compiled equivalent of the existing docker-entrypoint.sh |
If it won't fail, commands like |
hi 👋 the respective PR that initially aimed to remove elastic-user from group 0 is merged. However, the plan has changed a bit and this issue will be fulfilled by the new wolfi-based agent container images. As a result, maybe this issue should now move under @rdner, who is leading the wolfi images initiative? |
Yes in the Wolfi base image the elastic-agent user is no longer in group 0 by default: elastic-agent/dev-tools/packaging/templates/docker/Dockerfile.elastic-agent.tmpl Lines 152 to 158 in b7e95fe
Separately #4925 has enabled changing the user+group that runs the agent container, so that the ubuntu container can be removed from the root group as needed. Closing. |
The non-root elastic-agent user with UID 1000 is a member of the root group 0 in our Dockerfile.
This is configured here with the --gid 0 argument to useradd: elastic-agent/dev-tools/packaging/templates/docker/Dockerfile.elastic-agent.tmpl Lines 115 to 118 in 17f0480
The root group on an Ubuntu container does not have any particular privileges besides the ability to read files owned by the root group. Being a member of the root group does occasionally cause security scans to consider our user as "running as root". This is confusing now that we are developing an Elastic Agent that can be installed as a non-root user (including both UID and GID). The container should also follow this convention. Today our container's elastic-agent user can read files that an agent installed with elastic-agent install --unprivileged cannot.
We should remove the default addition to the root group in our container. This may cause problems in situations where directories or volumes mounted into the container from the host system for monitoring require membership in the root group to read the files in the volume. Possible examples of this situation are using the elastic-agent container to monitor node metrics or system logs on Kubernetes by mounting the file system of the node into the elastic-agent container.
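To make the "running as root" scan finding concrete, here is a minimal sketch of the kind of static check a scanner could run against an image's /etc/passwd and /etc/group contents, without starting the container. It is an assumption about how such a check might work, not a description of any particular scanner, and the function names and sample home directory are hypothetical.

```python
from typing import Set


def gids_for_user(user: str, passwd_text: str, group_text: str) -> Set[int]:
    """Collect the primary and supplementary GIDs of `user`.

    passwd lines: name:password:uid:gid:gecos:home:shell
    group lines:  name:password:gid:member1,member2,...
    """
    gids = set()
    for line in passwd_text.splitlines():
        fields = line.split(":")
        if len(fields) >= 4 and fields[0] == user:
            gids.add(int(fields[3]))  # primary group, e.g. set by useradd --gid
    for line in group_text.splitlines():
        fields = line.split(":")
        if len(fields) >= 4 and user in fields[3].split(","):
            gids.add(int(fields[2]))  # supplementary membership
    return gids


def in_root_group(user: str, passwd_text: str, group_text: str) -> bool:
    """True when the user belongs to GID 0 as primary or supplementary group."""
    return 0 in gids_for_user(user, passwd_text, group_text)


if __name__ == "__main__":
    # Sample data mirroring useradd --gid 0; the home directory is made up.
    passwd = "elastic-agent:x:1000:0::/home/elastic-agent:/bin/bash\n"
    group = "root:x:0:\n"
    print(in_root_group("elastic-agent", passwd, group))
```

Because a check like this only looks at the image layers, a runtime option that drops the group in the entrypoint would not change its result.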