Interrupt abstractions #71

alsrdn · 2022-01-25T18:23:01Z

This PR addresses requirements from #68 and adds interrupt abstractions to vm-device allowing devices to be developed independently from the vmm and to use a common interface for configuring and triggering interrupts.

This PR is a simplified version of #23 and #21 that allows for defining of custom behaviour in the VMM for different interrupt types and also interrupt groups and also allows for easy extension and creation of other interrupt types in the future by combining Interrupt super traits and configuration types.

I've included a small example on the usage of the traits.
Another example on how it might be used in Firecracker can be found in these draft commits: alsrdn/firecracker@e1d6b7b and alsrdn/firecracker@d59bd41

Cargo.toml

README.md

src/interrupt/legacy.rs

src/interrupt/msi.rs

src/interrupt/mod.rs

src/interrupt/trigger/mod.rs

src/interrupt/mod.rs

andreeaflorescu · 2022-01-26T15:50:03Z

src/interrupt/legacy.rs

+use crate::interrupt::ConfigurableInterrupt;
+
+/// Definition for PCI INTx pins.
+#[derive(Copy, Clone, Debug)]


I've just been reading about some good practices in Rust, and was thinking to start applying it going forward. One suggestion from there is to eagerly implement common traits. Do you mind adding at least the ones that you can derive from the list? PartialEq is super useful with testing for example. This applies to all the other structure definitions.

Added common traits where applicable.

Btw, the rust-vmm crates all violate the snake case naming scheme proposed by https://rust-lang.github.io/api-guidelines/naming.html. I guess this is not fixable anymore?

src/interrupt/trigger/eventfd.rs

examples/interrupts/device.rs

src/interrupt/mod.rs

gsserge · 2022-02-01T21:41:14Z

Please adjust test coverage to make CI happy, otherwise LGTM.

gsserge · 2022-02-02T15:28:23Z

@jiangliu @studychao could you please also take a look at the PR, thanks!

blitz · 2022-02-03T14:30:02Z

src/interrupt/mod.rs

+    }
+
+    /// Disable generation of interrupts from this interrupt source.
+    fn disable(&self) -> Result<()> {


This trait mixes up multiple aspects of an interrupt:

The part that is seen by a device (trigger)

A super-trait of Interrupt where the underlying notifier (eventfd) can be extracted (notifier)

The part that is operated by interrupt controllers (enable, disable, ack?)

These should be separate traits to simplify implementation and re-usability.

Yes, I think separating the notifier related functionality would improve the interface at least for the cases where the notifiers are not available.
For enable/disable I think moving them to another trait has diminishing returns. It's true that those methods are related to interrupt control but that can come either from VMM code or from a device crate code (e.g. a PCI crate which handles the parsing of MSI configuration).

I've moved the notifier methods to a different trait. Please take a look.

For enable/disable I think moving them to another trait has diminishing returns. It's true that those methods are related to interrupt control but that can come either from VMM code or from a device crate code (e.g. a PCI crate which handles the parsing of MSI configuration).

I see this from the point of what I have to implement for example to unit test a device that toggles interrupts. A simple device would take an interrupt pin as an input, which can be anything that implements the trigger method. When I use a mock of this for unit testing with the current trait, I have to implement lots of unrelated things that the device should not even be able to call at all.

So in my mind it's very important to separate the two sides: One set of functions that the device is supposed to use. One set of functions that the interrupt controller is supposed to use. These should be two traits that also have no relation to each other. That you can implement them on the same struct is an implementation detail.

Or to put it a different way: The device should only be able to use interfaces that it is actually supposed to be using. This prevents people who may not fully understand the interrupt logic from using the interfaces the wrong way, which is a big win.

Maybe as clarification: We are currently using rust-vmm code for parts of our virtualization stack at Cyberus Technology internally. We are interested in making it more generic and thus a bit more friendly to hypervisors that don't necessarily work like KVM. :)

This is also a reason why I'm a bit concerned about mixing up concerns in trait implementations and keeping it simple, straightforward and without guess work for the implementor of traits, because this has already caused some churn for us when we onboarded with vm-memory.

Gotcha. I think it's good to have the separation for the use cases you mentioned. My simplifying assumption was that most implementations would mix both pure emulation functionality and some interrupt control code so it would be more easier to pass Interrupt objects that have more optional functionality in one larger interface.
But I'm definitely not against splitting them up. Here's what I propose (also merging the discussion about level-interrupts below):

Interrupt traits used by device emulation to generate notifications:

pub trait EdgeInterrupt { fn trigger(&self) -> Result<()>; } pub trait LevelInterrupt { fn assert(&self) -> Result<()>; fn de_assert(&self) -> Result<()>; }

Interrupt control traits used by interrupt controllers and interrupt configuration code:

pub trait Interrupt { /// Enable generation of interrupts from this interrupt source. fn enable(&self) -> Result<()>; /// Disable generation of interrupts from this interrupt source. fn disable(&self) -> Result<()>; } pub trait ConfigurableInterrupt: Interrupt { /// Type describing the configuration spec of the interrupt. type Cfg; /// Update configuration of the interrupt. fn update(&self, config: &Self::Cfg) -> Result<()>; /// Returns the current configuration of the interrupt. fn get_config(&self) -> Result<Self::Cfg>; } /// Trait for interrupts that can be masked or unmasked. pub trait MaskableInterrupt: Interrupt { fn mask(&self) -> Result<()>; fn unmask(&self) -> Result<()>; } /// Trait implemented by devices through which a resampled interrupt that was cleared /// can determine if the device still requires service pub trait InterruptStatusChecker { fn is_active(&self) -> bool; } /// Trait for Interrupts that can be automatically retriggered when resampled /// Uses InterruptStatusChecker pub trait AutoRetriggerInterrupt: Interrupt { fn set_status_checker(&self, Arc<dyn InterruptStatusChecker>) -> Result<()>; } pub trait InterruptSourceGroup: Send { /// [...] }

Conversion traits to get the underlying notifiers (for the irqfd use-case)

pub trait AsRefTriggerNotifier { type NotifierType; fn trigger_notifier(&self) -> &Self::NotifierType; } pub trait AsRefResampleNotifier { type NotifierType; fn resample_notifier(&self) -> &Self::NotifierType; }

@blitz WDYT?

I'm not sure I understand the difference between enable/disable and mask/unmask, but in general I think this is pretty good. :)

This suggestion looks good to me. The only thing I'd like to change is the de_assert() method name. Maybe unassert() or clear()?

I've updated the PR with the changes discussed here. @gsserge @blitz PTAL

blitz · 2022-02-03T14:31:01Z

src/interrupt/mod.rs

+        None
+    }
+
+    /// Called back when the CPU acknowledges the interrupt.


What does this mean? Is it called when the interrupt is logged into the LAPIC IRR?

Need to update the doc a bit here. It's actually an EOI callback and it would be used for shared level-sensitive interrupts in order to inject the interrupt if the device still requires it (e.g. for INTx interrupts). It's equivalent to the ack_notifier() the difference being that the latter would enable the resampled semnatics with external components like vfio.

For level-triggered interrupts the trigger method seems not to be the right abstraction to me. In this case, the device would need a mechanism to assert/de-assert the interrupt line for correct emulation. Optimizations to avoid assert/deassert messages when the interrupt cannot be injected anyway would be a second step.

In any case, the flow for handling level triggered interrupts should be documented. I wouldn't have seen how this is supposed to work from the code alone.

Add a series of Traits and structs for defining the interface for interrupts. The goal is to allow individual development of device crates that can work with different types of interrupt mechanisms. The interrupt mechanism can be implemented in the VMM based on the platform or the available hardware (e.g. CPU, Interrupt controller). These set of traits allow devices to use a simple interface through which they can request, release, configure interrupts and also signal events to the VMM. Signed-off-by: Alexandru-Cezar Sardan <alsardan@amazon.com>

Add a brief description of the design of the Interrupt abstractions. Signed-off-by: Alexandru-Cezar Sardan <alsardan@amazon.com>

Interrupt abstractions currently don't have unit tests so coverage is decreased. Signed-off-by: Alexandru-Cezar Sardan <alsardan@amazon.com>

blitz · 2022-02-24T14:16:53Z

src/interrupt/msi.rs

+#[derive(Copy, Clone, Debug, Default, Eq, PartialEq)]
+pub struct MsiIrqConfig {
+    /// High address to delivery message signaled interrupt.
+    pub high_addr: u32,


Why not pub addr: u64?

This follows the PCI spec. See sections 6.8.1.4, 6.8.1.5 from here https://www.cl.cam.ac.uk/~djm202/pdf/specifications/pci/PCI_LB3.0_CB-2-6-04.pdf

The spec describes a 64-bit Message Address that is represented in MSI-X as hi/lo 32-bit parts. It's not really important, but it's weird to represent it here as 32-bit hi/lo parts as well, when you can just as well make it the 64-bit address that it is.

MSIs are normal data writes on the bus. So when they talk of "address" they literally mean an address where the device will write what's in the data field.

blitz

Everything looks documented, but I still don't feel like I understand the usecases that drive the complexity.

For devices that want to trigger MSIs, how does the interface look for them. Is there an example device somewhere?

blitz · 2022-02-24T14:25:52Z

src/interrupt/mod.rs

+/// The group allows a device to request and release interrupts and perform actions on the
+/// whole collection of interrupts like enable and disable for cases where enabling or disabling
+/// a single interrupt in the group does not make sense. For example, PCI MSI interrupts must be
+/// enabled as a group.


Is this a KVM limitation or something that comes as a requirement from PCI passthrough?

MSI or MSI-x is enabled by setting the enable bit in the message control struct. This refers to the whole set of MSI interrupts. You can mask each individual interrupt but you can't enable/disable them individually.

Is there a functional difference between masked and disabled MSI-X vectors from the interrupt subsystems perspective?

Some controllers may have different registers to distinguish masked vs enabled interrupts. Take a look here: https://docs.microsoft.com/en-us/windows-hardware/drivers/gpio/gpio-interrupt-masks

There may also be functional differences in what happens when a masked interrupt is asserted. In some cases interrupt is not delivered when asserted until it is unmasked, whereas asserting a disabled interrupt should not keep the pending state.

This document describes behavior of a (GPIO) interrupt controller. So what you are saying is that the implementation of this trait needs to know what kind of interrupt controller it is connected to. This is fine, I'm just trying to wrap my head around things. :)

blitz · 2022-02-24T14:28:01Z

src/interrupt/mod.rs

+    fn allocate_interrupts(&mut self, size: usize) -> Result<()>;
+
+    /// Release all interrupts within this group.
+    fn free_interrupts(&mut self) -> Result<()>;


When would allocate and free be called? Am I allowed to call any of the other methods when the group has not been allocated or already freed?

allocate_interrupts and free_interrupts should be called from interrupt setup code to reserve/release interrupts.
Typically you would return an error when enabled is called on an empty group in the trait implementation.

should be called from interrupt setup code

What setup code are you referring to here? The whole "flow" of this API is not really clear to me yet.

Take a look here: https://github.com/alsrdn/firecracker/blob/ab0c9642162f8c749c516d02b43b144a91baa33e/src/vmm/src/device_manager/legacy.rs#L84

Also here: https://github.com/alsrdn/firecracker/blob/ab0c9642162f8c749c516d02b43b144a91baa33e/src/vmm/src/builder.rs#L883

This is from the virtio code you linked to:

let mut interrupt_group = vmm.interrupt_manager.get_new_legacy_group().unwrap(); interrupt_group.allocate_interrupts(1).unwrap(); let irq = interrupt_group.get(0).unwrap(); irq.update(&LegacyIrqConfig { interrupt_line: None, interrupt_pin: None, }) .unwrap(); let irq_config = irq.get_config().unwrap(); interrupt_group.enable().unwrap();

This is frankly a terrible interface for legacy interrupt pins and should not end up in code that is intended for use in other VMMs. It has multi-step construction, it violates the "simple things should be simple" goal and the code is completely non-obvious:

Why do you need to "allocate" after getting the "group" (for one interrupt?) from the manager?

Why do need to "update" the group with basically no information right after creating it?

Why do you need to get_config the config right after sticking the empty config into it?

Why do you need to "enable" the interrupt?

Why is it ok to unwrap all the results?

My advice here is to check out the interrupt setup in existing emulators/VMMs, for example qemu. It is very readable and this should especially be the goal for re-usable code.

Maybe the way forward is to merge this and iterate on it, but from my experience interfaces rarely become cleaner over time. I don't think I can contribute more here, unfortunately.

blitz · 2022-02-24T14:29:22Z

src/interrupt/msi.rs

+pub trait MsiInterrupt: ConfigurableInterrupt<Cfg = MsiIrqConfig> + MaskableInterrupt {}
+
+/// Blanket implementation for Interrupts that use a MsiIrqConfig.
+impl<T> MsiInterrupt for T where T: ConfigurableInterrupt<Cfg = MsiIrqConfig> + MaskableInterrupt {}


I have really no idea how this would be used by a device. Is there an example device to look at somewhere?

This is used for convenience. It becomes tedious to pass around a long list of trait bounds for well known interrupt types.

alsrdn · 2022-02-24T16:17:24Z

Everything looks documented, but I still don't feel like I understand the usecases that drive the complexity.

For devices that want to trigger MSIs, how does the interface look for them. Is there an example device somewhere?

I currently removed the example to keep this PR more focused following the discussion here #71 (comment).
In order to trigger MSIs in a device emulation, configuration code should allocate, configure and enable the interrupts and pass an object with the EdgeInterrupt trait to the emulation code that will use trigger() to trigger the interrupt.
Here's an example of legacy interrupts in firecracker: https://github.com/alsrdn/firecracker/blob/ab0c9642162f8c749c516d02b43b144a91baa33e/src/vmm/src/device_manager/legacy.rs#L84 (it's outdated right now since it was using the old traits).

blitz · 2022-02-25T10:37:10Z

Everything looks documented, but I still don't feel like I understand the usecases that drive the complexity.
For devices that want to trigger MSIs, how does the interface look for them. Is there an example device somewhere?

I currently removed the example to keep this PR more focused following the discussion here #71 (comment). In order to trigger MSIs in a device emulation, configuration code should allocate, configure and enable the interrupts and pass an object with the EdgeInterrupt trait to the emulation code that will use trigger() to trigger the interrupt. Here's an example of legacy interrupts in firecracker: https://github.com/alsrdn/firecracker/blob/ab0c9642162f8c749c516d02b43b144a91baa33e/src/vmm/src/device_manager/legacy.rs#L84 (it's outdated right now since it was using the old traits).

I think it's fine to keep the example out of this PR to increase focus, but there should be a matching device somewhere to understand how these interfaces are supposed to be used. For the pin-based interrupts everything is reasonably simple, I would still like to understand how the MSI interfaces are meant to be used though.

blitz · 2022-02-25T10:41:11Z

src/interrupt/legacy.rs

+pub struct LegacyIrqConfig {
+    /// Input of the system interrupt controllers the device's interrupt pin is connected to.
+    /// Implemented by any device that makes use of an interrupt pin.
+    pub interrupt_line: Option<u32>,


Would this be used by PCI devices to set their interrupt line field or are there other uses? Typically, the interrupt line field in the config space is just a normal R/W field that is written during boot by firmware. So device emulation should not be concerned with this value.

For a PCI device using an INTx interrupt interrupt_line may not be of significance for device emulation. But this value is of significance for setting up other components like the MPtable.
Also you would use interrupt_line to configure a specific line that you want the interrupt assigned to (see https://github.com/alsrdn/firecracker/blob/ab0c9642162f8c749c516d02b43b144a91baa33e/src/vmm/src/device_manager/legacy.rs#L89).

alsrdn · 2022-02-25T12:56:12Z

Everything looks documented, but I still don't feel like I understand the usecases that drive the complexity.
For devices that want to trigger MSIs, how does the interface look for them. Is there an example device somewhere?

I currently removed the example to keep this PR more focused following the discussion here #71 (comment). In order to trigger MSIs in a device emulation, configuration code should allocate, configure and enable the interrupts and pass an object with the EdgeInterrupt trait to the emulation code that will use trigger() to trigger the interrupt. Here's an example of legacy interrupts in firecracker: https://github.com/alsrdn/firecracker/blob/ab0c9642162f8c749c516d02b43b144a91baa33e/src/vmm/src/device_manager/legacy.rs#L84 (it's outdated right now since it was using the old traits).

I think it's fine to keep the example out of this PR to increase focus, but there should be a matching device somewhere to understand how these interfaces are supposed to be used. For the pin-based interrupts everything is reasonably simple, I would still like to understand how the MSI interfaces are meant to be used though.

For MSI it works the same way:

Create a MSI interrupt group in your VMM
Pass the group to the crate that is managing your device
Create interrupts within this group
Configure each interrupt
Enable the group if required
Pass each interrupt to the corresponding code emulating the queues
Trigger interrupts as needed

andreeaflorescu

At this point it's better to merge the PR, and iterate on the interfaces/responsibilities in a future release to unblock the future work.

alsrdn requested review from alexandruag, gsserge and jiangliu as code owners January 25, 2022 18:23

gsserge reviewed Jan 25, 2022

View reviewed changes

Cargo.toml Outdated Show resolved Hide resolved

gsserge reviewed Jan 25, 2022

View reviewed changes

README.md Outdated Show resolved Hide resolved