Tuning of individual kernel threads #628

Open
wants to merge 5 commits into
base: master


adriaan42
Contributor

@adriaan42 adriaan42 commented Apr 17, 2024

In combination with #596 and #580, this PR implements the third feature needed to dynamically tune all relevant aspects of a realtime application using a dedicated HW device (typically a NIC).

Two things might need some discussion:

  • In my implementation I kept the basic idea that one instance can cover several different groups of threads. That makes it easy to migrate from the current scheduler plugin, but means we still need _has_dynamic_options, which is marked as a hack in plugins/base.py. The alternative would be to have one plugin instance per "group", which would make the profiles much longer. For example,
    group.ktimers=0:f:2:*:^\[ktimers
    would become something like
    [kthread_ktimers]
    type=kthread
    regex=^ktimers
    policy=fifo
    sched_prio=2
    affinity=*
  • I copied the approach of using perf to monitor for creation of new threads. That means that when running both the scheduler plugin and the kthread plugin, we'd have two threads doing the same thing. For my applications that's not a problem, because I no longer use the scheduler plugin at all:
    • scheduler handles three things: IRQ affinities, kernel threads, and userland threads
    • For IRQ affinities I can use the irq plugin
    • For kernel threads I can use kthread
    • For userland threads I use systemd and cgroupv2, and I don't want TuneD to touch them
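To make the compact group syntax concrete, here is a minimal sketch of how one such option could be split into its fields. It assumes the five-field layout used by the scheduler plugin (rule priority, policy, scheduler priority, affinity, regex); `GroupRule` and `parse_group_option` are hypothetical names for illustration, not the PR's code.

```python
import re
from typing import NamedTuple


class GroupRule(NamedTuple):
    """One parsed 'group.<name>=...' option (hypothetical helper type)."""
    name: str
    rule_prio: int          # order in which rules are evaluated
    policy: str             # e.g. "f" for FIFO, "*" for "leave unchanged"
    sched_prio: str         # kept as text so "*" can pass through
    affinity: str           # CPU list, or "*" for "leave unchanged"
    regex: re.Pattern       # compiled pattern matched against thread names


def parse_group_option(key: str, value: str) -> GroupRule:
    """Split '<rule_prio>:<policy>:<sched_prio>:<affinity>:<regex>'."""
    if not key.startswith("group."):
        raise ValueError("not a group option: %r" % key)
    name = key[len("group."):]
    # maxsplit=4 because the regex comes last and may itself contain ':'
    fields = value.split(":", 4)
    if len(fields) != 5:
        raise ValueError("expected 5 ':'-separated fields in %r" % value)
    rule_prio, policy, sched_prio, affinity, regex = fields
    return GroupRule(name, int(rule_prio), policy, sched_prio,
                     affinity, re.compile(regex))


rule = parse_group_option("group.ktimers", r"0:f:2:*:^\[ktimers")
print(rule.name, rule.policy, rule.sched_prio, rule.affinity)
print(rule.regex.match("[ktimers/3]") is not None)
```

Splitting with a maximum of four separators keeps any colon inside the regex intact, which is one reason to put the regex last.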

@yarda
Contributor

yarda commented May 23, 2024

but means we still need _has_dynamic_options, which is marked as a hack in plugins/base.py.

That's OK with me. In the long term it's a candidate for a rewrite/refactor, but other plugins use it as well. We will probably keep the idea, and if we change the implementation, all affected plugins could then be updated the same way.

@yarda
Contributor

yarda commented May 23, 2024

Regarding the cgroups, there is support for cgroups v1 in the scheduler plugin and we would also like to add support for the v2 for completeness. It could be useful for somebody.

It's OK if you are not using some plugin. We even wanted to add a global configuration option that allows selectively disabling specific plugins in the stock profiles.

@adriaan42
Contributor Author

Regarding the cgroups, there is support for cgroups v1 in the scheduler plugin and we would also like to add support for the v2 for completeness. It could be useful for somebody.

I found the whole cgroup topic to be rather tricky, because in modern systems, SystemD is the "cgroup manager", and it owns (by convention) the cgroup tree. So any creation of new cgroups should happen via SystemD, and can then use Delegation to create further sub-groups.

I've had some success with:

  • Set AllowedCPUs on all the default slices (system.slice, user.slice, init.scope) to restrict all "normal" processes. To some extent this replaces the isolcpus= kernel option.
  • Create an isolated.slice using SystemD, with access to the desired CPUs, and then use Slice=isolated in my service file (or systemd-run --slice=isolated when launching from a shell) to gain access to the isolated CPUs.

But simply having TuneD move processes around seems like it could have unwanted side-effects, and should be handled with care...

Comment on lines 276 to 288
#def _instance_apply_static(self, instance):
# if self._instance_count == 0:
# # scan for kthreads that have appeared since plugin initialization
# self._kthread_scan(initial=False)
# self._perf_monitor_start()
# self._instance_count += 1
# super(KthreadPlugin, self)._instance_apply_static(instance)

#def _instance_unapply_static(self, instance, rollback):
# super(KthreadPlugin, self)._instance_unapply_static(instance, rollback)
# self._instance_count -= 1
# if self._instance_count == 0:
# self._perf_monitor_shutdown()

Check notice (Code scanning / CodeQL): Commented-out code

This comment appears to contain commented-out code.
@adriaan42
Contributor Author

Update:

@yarda did you already have a chance to review this PR?

@yarda
Contributor

yarda commented Oct 14, 2024

Sorry for the delay, I am back on it.

The 5d28337 LGTM.

Regarding the kthread plugin, is there a specific need for the proposed syntax:

group.ksoftirqd=0:f:2:*:^\[ksoftirqd

Wouldn't it be better to use one group per instance? I.e. the same affinity/sched_opts setting for each individual instance, e.g.:

[ksoftirqd]
type=kthread
devices_udev_regex=^\[ksoftirqd
setting=0:f:2:*

Or:

[ksoftirqd]
type=kthread
devices_udev_regex=^\[ksoftirqd
schedopts=SCHED_FIFO
affinity=*

Then, for the priority, the built-in instance priority option could be used. The affinity, if unset, could default to *.

You could also specify multiple regexes per instance:

[ksoftirqd_and_ksmd]
type=kthread
devices_udev_regex=^\[(ksoftirqd|ksmd)
schedopts=SCHED_FIFO
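When several thread names share one regex, grouping matters: alternation binds more loosely than the anchor and the literal bracket, so an ungrouped `|` splits the whole pattern in two. The sketch below compares two spellings with Python's `re` module. The thread names are made up for illustration, and whether the plugin would use `search` or `match` is an assumption.

```python
import re

# Made-up thread names as they might appear with brackets added.
names = ["[ksoftirqd/0]", "[ksmd]", "not-a-[ksmd]-thread"]

# Ungrouped: parsed as  ^\[(ksoftirqd)   OR   (ksmd)
loose = re.compile(r"^\[(ksoftirqd)|(ksmd)")
# Grouped: anchor and bracket apply to both alternatives.
grouped = re.compile(r"^\[(ksoftirqd|ksmd)")


def hits(pattern, candidates):
    """Return the candidates in which the pattern is found anywhere."""
    return [c for c in candidates if pattern.search(c)]


loose_hits = hits(loose, names)      # second alternative is unanchored
grouped_hits = hits(grouped, names)  # only names starting with "[ksoftirqd" or "[ksmd"

print(loose_hits)
print(grouped_hits)
# With match() the ungrouped form goes wrong the other way:
# "(ksmd)" has to start at position 0, so "[ksmd]" is not matched at all.
print(loose.match("[ksmd]") is None)
```

With `search`, the ungrouped pattern accepts any name containing `ksmd`; with `match`, it rejects `[ksmd]` altogether. The grouped form behaves the same under both.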

@adriaan42
Contributor Author

Wouldn't it be better to use one group per instance? I.e. the same affinity/sched_opts setting for each individual instance, e.g.:

I thought about this when I wrote the plugin, and didn't like it because

  • I like the "default" of having only one instance per plugin in the profile, unless doing some specific optimizations. That instance has the name of the plugin, so type= is not needed. In the current profiles only a few very specific cases need multiple instances (e.g. ThunderX).
  • The approach with multiple instances would make the common case (what's currently used in the realtime profiles) much more complex to express.

I find the current format in the scheduler plugin quite clear, except for minor points I've changed:

  • unify policy and priority into one option (f50 instead of f:50) because setting those separately makes no sense to me
  • remove the brackets [] from thread names (the scheduler plugin adds them to identify kernel threads, but here we have only kernel threads, so there is no need for this)
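A combined policy/priority token such as f50 could be parsed roughly as follows. This is a sketch under assumptions: the letter-to-policy mapping is modeled on the scheduler plugin's single-letter codes, `parse_sched` is a hypothetical helper, and the PR's actual parser may differ. The priority ranges are the usual Linux ones (1 to 99 for SCHED_FIFO and SCHED_RR, 0 for the other policies).

```python
import re
from typing import Tuple

# Assumed mapping, modeled on the scheduler plugin's one-letter policy codes.
POLICIES = {
    "f": "SCHED_FIFO",
    "r": "SCHED_RR",
    "o": "SCHED_OTHER",
    "b": "SCHED_BATCH",
}
_TOKEN = re.compile(r"^([frob])(\d+)?$")


def parse_sched(token: str) -> Tuple[str, int]:
    """Turn 'f50' into ('SCHED_FIFO', 50); raise ValueError if invalid."""
    m = _TOKEN.match(token)
    if m is None:
        raise ValueError("malformed scheduling token: %r" % token)
    policy = POLICIES[m.group(1)]
    prio = int(m.group(2)) if m.group(2) is not None else 0
    if policy in ("SCHED_FIFO", "SCHED_RR"):
        # Realtime policies need a static priority between 1 and 99.
        if not 1 <= prio <= 99:
            raise ValueError("realtime priority out of range: %r" % token)
    elif prio != 0:
        # Non-realtime policies only accept static priority 0.
        raise ValueError("priority must be 0 for %s" % policy)
    return policy, prio


print(parse_sched("f50"))
print(parse_sched("o"))
```

Keeping policy and priority in one token lets the parser reject combinations that the kernel would refuse anyway, such as a FIFO policy without a priority.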
[ksoftirqd]
type=kthread
devices_udev_regex=^\[ksoftirqd
setting=0:f:2:*

Also re-using the existing devices_udev_regex is not very clean, as we're not dealing with udev Devices here. So the setting we're using in the profiles would be "wrong", and internally we'd need to create pyudev.Device objects just to use the existing mechanism.

@yarda
Contributor

yarda commented Nov 6, 2024

Here, one instance can contain multiple "devices" with different tuning. I don't like that this breaks the established concept that all "devices" in an instance get the same tuning. I am afraid that it could lead to multiple problems later (e.g. when moving "devices" between instances through the API).

But maybe it's just my personal preference, @zacikpa, @jmencak what's your opinion?

@zacikpa
Contributor

zacikpa commented Nov 7, 2024

I'm personally fine with the implementation as it is now (since it's inspired by the groups.* implementation in the scheduler plugin), but I would not refer to the kthreads as devices anywhere in the code/comments. In other words, the new plugin would not support "devices" in the TuneD sense, similarly to the scheduler plugin.

@yarda
Contributor

yarda commented Nov 7, 2024

As a non-device plugin, like e.g. the sysfs plugin, this may work.

@adriaan42
Contributor Author

As a non-device plugin, like e.g. the sysfs plugin, this may work.

One of the main points of the new plugin is to allow dynamic changes of the tuning (through the instance_[create|destroy] dbus calls). That only works on device-plugins (based on hotplug.Plugin).

@zacikpa
Contributor

zacikpa commented Nov 11, 2024

Huh, taking back what I said here, I say we should adjust instance_create and instance_destroy to work with any plugin, not just child classes of hotplug.Plugin.

IIUC, you can't and don't plan to use instance_acquire_devices with this plugin, am I right? (Because it does not support what TuneD calls "devices").

@adriaan42
Contributor Author

Huh, taking back what I said here, I say we should adjust instance_create and instance_destroy to work with any plugin, not just child classes of hotplug.Plugin.

The change itself is probably simple, but I'm not sure what would happen for any of the existing non-hotplug Plugins if multiple instances were created. I expect those instances would just interfere with each other and break things.

IIUC, you can't and don't plan to use instance_acquire_devices with this plugin, am I right? (Because it does not support what TuneD calls "devices").

instance_acquire_devices could be used, but it does not make much sense in this case. (One could force the transfer of a "device" to an instance that does not match it, but then no tuning would be applied).

Treating kthreads as "devices" lets me reuse a lot of nice infrastructure. But it would of course be possible to do this all within the plugin.

Signed-off-by: Adriaan Schmidt <adriaan.schmidt@siemens.com>
Calls to our _instance_[un]apply_static() are unreliable [1], so we
can't use them to keep track of the active-instance count, to
start/stop the perf monitor thread on demand. Instead, this keeps
the thread running continuously.

[1] redhat-performance#662

Signed-off-by: Adriaan Schmidt <adriaan.schmidt@siemens.com>
@adriaan42
Contributor Author

@zacikpa I just pushed a draft that bases the plugin on base.Plugin instead of hotplug.Plugin. The changes in controller.py probably need some more work.
