Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

aarch64: support FIQs and use them for TLB shootdown IPIs #1039

Merged
merged 24 commits into from
Sep 24, 2023

Conversation

NathanRoyer
Copy link
Member

Sometimes when we tried spawning an interactive console in theseus on aarch64, by sending keystrokes through the UART port, theseus hanged.

We then found out that a complicated scenario was taking place:

  • the console task was spawned and scheduled but the sub-tasks it spawned in turn were not running
  • in fact two of the four cores were stopped, so they were ghosting the tasks randomly assigned to them...
  • they were stopped because they tried to broadcast an interrupt to other cores, which is an NMI on x86 but I had implemented it as a regular interrupt on aarch64, early in the port process, without leaving a clear TODO/warning... big mistake: the two missing cores were silently clashing on this
  • I reimplemented this using fast interrupts, which highlighted some other bugs but ultimately fixed the issue

This brings the following changes:

  • The interrupt controller API only contains cross-platform features, other features are not part of the trait. This made it easier to support FIQs on aarch64, but it could become a separate trait: Aarch64LocalInterruptController.
  • Aarch64-specific methods of the LocalInterruptController allow configuring/handling fast interrupts (FIQs).
  • The interrupt controller now allows enabling/disabling SPIs too (this was missing).
  • The GIC driver was updated with Group 0 (used by FIQs) interrupt support.
  • A bug in memory_aarch64 was fixed (a shift operation was missing).
  • A silent bug recently introduced in context_switch_regular was fixed (incorrect structure size assumption, fixed by padding the structure, which makes the assembly and memory layout easier to understand).
  • The interrupts crate was modified to handle FIQs.
  • The TLB Shootdown crate now uses FIQs on aarch64 as an NMI alternative.
  • nano_core, captain and ap_start were modified to set FIQ masks correctly.

This depend on a PR in irq_safety, so it doesn't build at this moment.

Copy link
Member

@kevinaboos kevinaboos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dealing with FIQs is kind of annoying. I'm wondering if we should just create a separate/dedicated struct or series of functions for handling FIQs that doesn't go through the same LocalInterruptController API that regular IRQs do.

kernel/context_switch_regular/src/aarch64.rs Outdated Show resolved Hide resolved
kernel/gic/src/gic/cpu_interface_gicv2.rs Outdated Show resolved Hide resolved
kernel/gic/src/gic/cpu_interface_gicv3.rs Outdated Show resolved Hide resolved
kernel/gic/src/gic/mod.rs Outdated Show resolved Hide resolved
kernel/gic/src/gic/mod.rs Show resolved Hide resolved
kernel/interrupts/src/aarch64/table.s Show resolved Hide resolved
kernel/memory_aarch64/src/lib.rs Outdated Show resolved Hide resolved
kernel/memory_aarch64/src/lib.rs Outdated Show resolved Hide resolved
kernel/memory_aarch64/src/lib.rs Outdated Show resolved Hide resolved
kernel/tlb_shootdown/src/lib.rs Outdated Show resolved Hide resolved
@NathanRoyer
Copy link
Member Author

Dealing with FIQs is kind of annoying. I'm wondering if we should just create a separate/dedicated struct or series of functions for handling FIQs that doesn't go through the same LocalInterruptController API that regular IRQs do.

If there was only GICv3 we could, but as GICv2 uses MMIO for FIQ management I think we should keep this "Use the normal structures but bypass their sync primitives" design: it's pretty simple to understand, unsafe code is very small, and it retains GIC compatibility via the GICv2/GICv3 abstraction.

@kevinaboos
Copy link
Member

Dealing with FIQs is kind of annoying. I'm wondering if we should just create a separate/dedicated struct or series of functions for handling FIQs that doesn't go through the same LocalInterruptController API that regular IRQs do.

If there was only GICv3 we could, but as GICv2 uses MMIO for FIQ management I think we should keep this "Use the normal structures but bypass their sync primitives" design: it's pretty simple to understand, unsafe code is very small, and it retains GIC compatibility via the GICv2/GICv3 abstraction.

Fair enough, I agree.

Left a few new comments, should be easy to address pretty quickly.

Copy link
Member

@kevinaboos kevinaboos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've accepted this PR, thanks! A couple of follow-up changes I'd like to see:

  • InterruptDestination seems redundant with IpiTargetCpu...?

  • We can use type wrappers to ensure that the acknowledge_fast_interrupt and end_of_fast_interrupt functions are only invokable by current_elx_fiq. By that I mean create a new type ExceptionContextFiq that is a simple wrapper around ExceptionContext, and then require a reference to that type to be passed into both of the above fast interrupt functions. Then, we can make those functions safe on the outside, and just use unsafe on the inside.

    #[transparent]
    #[non_exhaustive]
    pub struct ExceptionContextFiq(ExceptionContext);

    This design is better imo since there's no real way to ensure that the fast interrupt functions are actually called safely.

  • It's a little bit odd to use the type *const InterruptHandler, since InterruptHandler is already a function pointer. We should be able to remove the leading *const part and just deal with the InterruptHandler directly. But do let me know if that is actually requried somewhere else.

@kevinaboos kevinaboos changed the title Fix TLB Shootdown issue aarch64: support FIQs and use them for TLB shootdown IPIs Sep 24, 2023
@kevinaboos kevinaboos merged commit ac51fc1 into theseus-os:theseus_main Sep 24, 2023
3 checks passed
github-actions bot pushed a commit that referenced this pull request Sep 24, 2023
* Add support for fast interrupts on aarch64, aka FIQs.
  FIQs are designed to be fast and can thus interrupt regular interrupts (IRQs)
  that are in the process of being handled. They are similar to NMIs
  on x86_64 in this regard, but can also be explicitly enabled/disabled.
  * Updated the GIC driver to support both Group 0 (FIQs) and Group 1 (IRQs).
  * `nano_core`, `captain`, and `ap_start` now enable/disable FIQs.

* Broadcasting TLB shootdown IPIs now uses FIQs to ensure that
  TLB shootdowns occur instantly even if regular interrupts are disabled
  on one or more other CPUs.

* Add a separate trait `Aarch64LocalInterruptController` for arch-specific
  features, which keeps the `LocalInterruptController` arch-agnostic.
  * This trait is primarily for configuring/handling fast interrupts (FIQs),
    but also for acknowledging interrupts, which x86_64 does not require.
  * The interrupt controller now allows enabling/disabling SPIs too.

---------

Co-authored-by: Kevin Boos <kevinaboos@gmail.com> ac51fc1
github-actions bot pushed a commit to tsoutsman/Theseus that referenced this pull request Sep 25, 2023
…#1039)

* Add support for fast interrupts on aarch64, aka FIQs.
  FIQs are designed to be fast and can thus interrupt regular interrupts (IRQs)
  that are in the process of being handled. They are similar to NMIs
  on x86_64 in this regard, but can also be explicitly enabled/disabled.
  * Updated the GIC driver to support both Group 0 (FIQs) and Group 1 (IRQs).
  * `nano_core`, `captain`, and `ap_start` now enable/disable FIQs.

* Broadcasting TLB shootdown IPIs now uses FIQs to ensure that
  TLB shootdowns occur instantly even if regular interrupts are disabled
  on one or more other CPUs.

* Add a separate trait `Aarch64LocalInterruptController` for arch-specific
  features, which keeps the `LocalInterruptController` arch-agnostic.
  * This trait is primarily for configuring/handling fast interrupts (FIQs),
    but also for acknowledging interrupts, which x86_64 does not require.
  * The interrupt controller now allows enabling/disabling SPIs too.

---------

Co-authored-by: Kevin Boos <kevinaboos@gmail.com> ac51fc1
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants