Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(RHEL-9322) cgroup: drastically simplify caching of cgroups members mask #295

Open
wants to merge 8 commits into
base: rhel-8.10.0
Choose a base branch
from

Conversation

dtardon
Copy link
Member

@dtardon dtardon commented Jun 22, 2022

Previously we tried to be smart: when a new unit appeared and it only
added controllers to the cgroup mask we'd update the cached members mask
in all parents by ORing in the controller flags in their cached values.
Unfortunately this was quite broken, as we missed some conditions when
this cache had to be reset (for example, when a unit got unloaded),
moreover the optimization doesn't work when a controller is removed
anyway (as in that case there's no other way for the parent to iterate
though all children if any other, remaining child unit still needs it).
Hence, let's simplify the logic substantially: instead of updating the
cache on the right events (which we didn't get right), let's simply
invalidate the cache, and generate it lazily when we encounter it later.
This should actually result in better behaviour as we don't have to
calculate the new members mask for a whole subtree whever we have the
suspicion something changed, but can delay it to the point where we
actually need the members mask.

This allows us to simplify things quite a bit, which is good, since
validating this cache for correctness is hard enough.

Fixes: #9512
(cherry picked from commit 5af8805)

Resolves: #2096371

@mergify mergify bot added the pr/needs-ci Formerly needs-ci label Jun 22, 2022
@systemd-rhel-bot systemd-rhel-bot added the pr/needs-review Formerly needs-review label Jun 22, 2022
@systemd-rhel-bot systemd-rhel-bot changed the title cgroup: drastically simplify caching of cgroups members mask (#2096371) cgroup: drastically simplify caching of cgroups members mask Jun 22, 2022
@systemd-rhel-bot systemd-rhel-bot added the tracker/unapproved Formerly needs-acks label Jun 22, 2022
@mergify mergify bot removed the pr/needs-ci Formerly needs-ci label Jun 22, 2022
@systemd-rhel-bot systemd-rhel-bot removed the tracker/unapproved Formerly needs-acks label Jul 11, 2022
Copy link
Member

@msekletar msekletar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we shouldn't just cherry pick the single commit from the lengthy series in systemd/systemd#10894 and hope for the best. I have a gut feeling this will introduce regressions in our handling of cgroups. At very least we shouldn't be introducing dead code.

src/core/unit.h Outdated
CGroupMask cgroup_members_mask;
CGroupMask cgroup_realized_mask; /* In which hierarchies does this unit's cgroup exist? (only relevant on cgroupsv1) */
CGroupMask cgroup_enabled_mask; /* Which controllers are enabled (or more correctly: enabled for the children) for this unit's cgroup? (only relevant on cgroupsv2) */
CGroupMask cgroup_invalidated_mask; /* A mask specifiying controllers which shall be considered invalidated, and require re-realization */
Copy link
Member

@msekletar msekletar Jul 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cgroup_invalidated_mask isn't used anywhere. Also a comment contains typo in "specifiying".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I attempted to backport the whole series at first, but abandoned that approach in the end. The PR depends on previous PR(s). It would likely take several days to sort it out and backport everything needed.

@systemd-rhel-bot systemd-rhel-bot removed the pr/needs-review Formerly needs-review label Jul 12, 2022
@systemd-rhel-bot systemd-rhel-bot added tracker/unapproved Formerly needs-acks and removed tracker/unapproved Formerly needs-acks labels Aug 11, 2022
@systemd-rhel-bot systemd-rhel-bot changed the title (#2096371) cgroup: drastically simplify caching of cgroups members mask (#2096371) (#2096371) cgroup: drastically simplify caching of cgroups members mask Aug 21, 2022
@mergify mergify bot added pr/needs-ci Formerly needs-ci and removed pr/needs-ci Formerly needs-ci labels Aug 21, 2022
@jamacku jamacku added this to the RHEL-8.8 milestone Aug 24, 2022
@systemd-rhel-bot systemd-rhel-bot added tracker/unapproved Formerly needs-acks and removed tracker/unapproved Formerly needs-acks labels Aug 31, 2022
@mergify mergify bot added the pr/needs-ci Formerly needs-ci label Sep 21, 2022
@systemd-rhel-bot systemd-rhel-bot changed the title (#2096371) (#2096371) cgroup: drastically simplify caching of cgroups members mask (#2096371) (#2096371) (#2096371) cgroup: drastically simplify caching of cgroups members mask Oct 22, 2022
@systemd-rhel-bot systemd-rhel-bot added pr/needs-review Formerly needs-review and removed pr/needs-review Formerly needs-review labels Nov 19, 2022
@jamacku jamacku changed the title (#2096371) (#2096371) (#2096371) cgroup: drastically simplify caching of cgroups members mask (#2096371) cgroup: drastically simplify caching of cgroups members mask Dec 7, 2022
@systemd-rhel-bot systemd-rhel-bot removed the tracker/unapproved Formerly needs-acks label Dec 8, 2022
@systemd-rhel-bot systemd-rhel-bot changed the title (#2096371) cgroup: drastically simplify caching of cgroups members mask (#2096371) (#2096371) cgroup: drastically simplify caching of cgroups members mask Dec 16, 2022
@jamacku jamacku changed the title (#2096371) (#2096371) cgroup: drastically simplify caching of cgroups members mask (#2096371) cgroup: drastically simplify caching of cgroups members mask Jan 6, 2023
@systemd-rhel-bot systemd-rhel-bot added tracker/unapproved Formerly needs-acks and removed tracker/unapproved Formerly needs-acks labels Jan 13, 2023
…y cgroupsv1 vs. cgroupsv2

(cherry picked from commit 4edd65e)

Related: RHEL-9322
(cherry picked from commit 0adf88b)

Related: RHEL-9322
… controllers being pulled in

This shouldn't make much difference in real life, but is a bit cleaner.

(cherry picked from commit 442ce77)

Related: RHEL-9322
…on a cgroup

This changes cg_enable_everywhere() to return which controllers are
enabled for the specified cgroup. This information is then used to
correctly track the enablement mask currently in effect for a unit.
Moreover, when we try to turn off a controller, and this works, then
this is indicates that the parent unit might succesfully turn it off
now, too as our unit might have kept it busy.

So far, when realizing cgroups, i.e. when syncing up the kernel
representation of relevant cgroups with our own idea we would strictly
work from the root to the leaves. This is generally a good approach, as
when controllers are enabled this has to happen in root-to-leaves order.
However, when controllers are disabled this has to happen in the
opposite order: in leaves-to-root order (this is because controllers can
only be enabled in a child if it is already enabled in the parent, and
if it shall be disabled in the parent then it has to be disabled in the
child first, otherwise it is considered busy when it is attempted to
remove it in the parent).

To make things complicated when invalidating a unit's cgroup membershup
systemd can actually turn off some controllers previously turned on at
the very same time as it turns on other controllers previously turned
off. In such a case we have to work up leaves-to-root *and*
root-to-leaves right after each other. With this patch this is
implemented: we still generally operate root-to-leaves, but as soon as
we noticed we successfully turned off a controller previously turned on
for a cgroup we'll re-enqueue the cgroup realization for all parents of
a unit, thus implementing leaves-to-root where necessary.

(cherry picked from commit 27adcc9)

Related: RHEL-9322
After creating a cgroup we need to initialize its
"cgroup.subtree_control" file with the controllers its children want to
use. Currently we do so whenever the mkdir() on the cgroup succeeded,
i.e. when we know the cgroup is "fresh". Let's update the condition
slightly that we also do so when internally we assume a cgroup doesn't
exist yet, even if it already does (maybe left-over from a previous
run).

This shouldn't change anything IRL but make things a bit more robust.

(cherry picked from commit 1fd3a10)

Related: RHEL-9322
Previously we tried to be smart: when a new unit appeared and it only
added controllers to the cgroup mask we'd update the cached members mask
in all parents by ORing in the controller flags in their cached values.
Unfortunately this was quite broken, as we missed some conditions when
this cache had to be reset (for example, when a unit got unloaded),
moreover the optimization doesn't work when a controller is removed
anyway (as in that case there's no other way for the parent to iterate
though all children if any other, remaining child unit still needs it).
Hence, let's simplify the logic substantially: instead of updating the
cache on the right events (which we didn't get right), let's simply
invalidate the cache, and generate it lazily when we encounter it later.
This should actually result in better behaviour as we don't have to
calculate the new members mask for a whole subtree whever we have the
suspicion something changed, but can delay it to the point where we
actually need the members mask.

This allows us to simplify things quite a bit, which is good, since
validating this cache for correctness is hard enough.

Fixes: #9512
(cherry picked from commit 5af8805)

Resolves: RHEL-9322
This way we can corectly ensure that when a unit that requires some
controller goes away, we propagate the removal of it all the way up, so
that the controller is turned off in all the parents too.

(cherry picked from commit b8b6f32)

Related: RHEL-9322
@github-actions github-actions bot changed the title (#2096371) cgroup: drastically simplify caching of cgroups members mask (RHEL-9322) cgroup: drastically simplify caching of cgroups members mask Apr 9, 2024
@github-actions github-actions bot added tracker/missing Formerly needs-bz and removed tracker/missing Formerly needs-bz labels Apr 12, 2024
@msekletar
Copy link
Member

CI failures looked related but to be sure they are not just flukes @lnykryn force-pushed the feature branch to rerun the checks.

@jamacku
Copy link
Member

jamacku commented May 22, 2024

Unit tests seem to be failing because of #434, But I'm not sure about CentOS CI.

@msekletar
Copy link
Member

CentOS CI failure is concerning. @dtardon can you please have a look?

@github-actions github-actions bot removed the tracker/unapproved Formerly needs-acks label May 22, 2024
@msekletar msekletar closed this Jun 11, 2024
@msekletar msekletar reopened this Jun 11, 2024
@msekletar msekletar closed this Jun 25, 2024
@msekletar msekletar reopened this Jun 25, 2024
@mrc0mmand
Copy link
Member

Oh well, the CentOS CI for this repo is currently FUBAR, since CentOS Stream 8 is no more. I'll need to figure something out to make it working again...

@msekletar
Copy link
Member

@mrc0mmand Maybe we could arrange "Redhat developer subscription for teams", i.e. self-service subscription and then we could generate activation key that we would store as GH secret and then use that to enable all needed repos in UBI RHEL container image (e.g. CRB).

@mrc0mmand
Copy link
Member

@mrc0mmand Maybe we could arrange "Redhat developer subscription for teams", i.e. self-service subscription and then we could generate activation key that we would store as GH secret and then use that to enable all needed repos in UBI RHEL container image (e.g. CRB).

I mean, yeah, that would work for the container stuff (and I plan to actually do that), but this wouldn't work for CentOS CI, since, as the name suggests, it only supports CentOS. So I'll move the C8S job to C9S, which should work just fine. The environment will be different, but it should be enough for making sure we haven't royally screwed something up (we run the same tests internally on actual RHEL 8 anyway).

@mrc0mmand
Copy link
Member

CentOS CI should be back (albeit with some compromises, see systemd/systemd-centos-ci@f415a75). I'll look into the container stuff next.

@mrc0mmand
Copy link
Member

A quick workaround for the current C8S containers is in #440.

@github-actions github-actions bot added tracker/missing Formerly needs-bz and removed tracker/missing Formerly needs-bz labels Jun 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants