sensei-hacker commented Feb 9, 2026

User description

Summary

  • Add timeout guards to unbounded busy-wait loops in the SD card SPI and SDIO drivers
  • A problematic SD card could cause sdcardSpi_deselect() to spin forever on busIsBusy(), freezing the entire FC since blackbox runs inline with the PID loop via processBlackbox() in fc_core.c
  • This is a safety-critical fix: peripheral failures must never take down the flight controller

Changes

sdcard_spi.c - sdcardSpi_deselect() (called ~15 times after every SPI transaction):

  • Replace unbounded while (busIsBusy) { __NOP(); } with 100K-iteration timeout (at least ~0.6ms at 168MHz, since each iteration costs one or more clock cycles)
  • On timeout: increment failureCount, disable card at threshold (8), always release SPI CS line
  • Matches the existing timeout pattern in sdcardSpi_init() (line 864)
  • Transient recovery: failureCount resets to 0 on any successful read/write
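
The behaviour described in this list can be sketched roughly as follows. This is a minimal host-side simulation, not the code from the PR: `fakeCard_t`, `fakeBusIsBusy()`, `fakeDeselect()` and `fakeNoteSuccess()` are hypothetical stand-ins for the driver state, `busIsBusy()`, `sdcardSpi_deselect()` and the read/write success path. Only the iteration budget (100K) and the threshold (8) are taken from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

/* All names here are hypothetical stand-ins, not the firmware's real API. */
#define FAKE_DESELECT_MAX_SPINS 100000u /* iteration budget from the PR text */
#define FAKE_FAILURE_THRESHOLD  8u      /* disable threshold from the PR text */

typedef struct {
    uint32_t busyPolls;    /* simulated bus stays busy for this many polls;
                              UINT32_MAX means "never goes idle" */
    bool     csAsserted;   /* simulated chip-select state */
    uint8_t  failureCount;
    bool     disabled;
} fakeCard_t;

/* Simulated busIsBusy(): busy until the poll budget is used up. */
static bool fakeBusIsBusy(fakeCard_t *card)
{
    if (card->busyPolls == 0) {
        return false;
    }
    if (card->busyPolls != UINT32_MAX) {
        card->busyPolls--;
    }
    return true;
}

/* Bounded deselect: returns true if the bus went idle within the budget. */
static bool fakeDeselect(fakeCard_t *card)
{
    bool idle = false;

    for (uint32_t spins = 0; spins < FAKE_DESELECT_MAX_SPINS; spins++) {
        if (!fakeBusIsBusy(card)) {
            idle = true;
            break;
        }
    }

    if (!idle) {
        /* Timeout: count the failure and disable the card at the threshold. */
        if (card->failureCount < UINT8_MAX) {
            card->failureCount++;
        }
        if (card->failureCount >= FAKE_FAILURE_THRESHOLD) {
            card->disabled = true;
        }
    }

    card->csAsserted = false; /* CS is released on both paths */
    return idle;
}

/* Called after any successful read/write: transient failures are forgiven. */
static void fakeNoteSuccess(fakeCard_t *card)
{
    card->failureCount = 0;
}
```

The point of the sketch is that chip select is released on the timeout path too, so a stuck card cannot hold the bus while the rest of the system keeps running.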

sdmmc_sdio_f4xx.c — DMA disable loop:

  • Add 10K-iteration timeout to while (pDMA->CR & DMA_SxCR_EN) (at least ~60µs at 168MHz, at one or more clock cycles per iteration)
  • DMA disable normally completes in a few bus cycles; subsequent SDIO hardware timeout catches failures
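
As a rough illustration of the bounded DMA-disable wait, the sketch below replaces the real stream register with a small simulation. `fakeDmaStream_t`, `fakeDmaEnabled()` and `fakeDmaWaitDisabled()` are hypothetical names; the real loop polls `pDMA->CR & DMA_SxCR_EN` directly. Only the 10K iteration budget comes from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical simulation of a DMA stream; not the STM32 register layout. */
#define FAKE_DMA_DISABLE_MAX_SPINS 10000u /* iteration budget from the PR text */

typedef struct {
    bool     enabled;        /* what the enable bit would read back as */
    uint32_t readsUntilIdle; /* UINT32_MAX means "never clears" */
} fakeDmaStream_t;

/* Simulated read of the enable bit: clears after readsUntilIdle polls. */
static bool fakeDmaEnabled(fakeDmaStream_t *stream)
{
    if (stream->enabled && stream->readsUntilIdle != UINT32_MAX) {
        if (stream->readsUntilIdle == 0) {
            stream->enabled = false;
        } else {
            stream->readsUntilIdle--;
        }
    }
    return stream->enabled;
}

/* Bounded wait: true if the stream reported disabled within the budget. */
static bool fakeDmaWaitDisabled(fakeDmaStream_t *stream)
{
    for (uint32_t spins = 0; spins < FAKE_DMA_DISABLE_MAX_SPINS; spins++) {
        if (!fakeDmaEnabled(stream)) {
            return true;
        }
    }
    return false; /* caller decides how to treat a stream that never stopped */
}
```

Returning a status rather than silently falling through leaves the caller free to either proceed and rely on the later hardware timeout, or abort the transfer.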

sdmmc_sdio_f4xx.c — FIFO read loops (3 locations: SD_HighSpeed, SD_GetCardStatus, SD_FindSCR):

  • Add SD_DATATIMEOUT software iteration counter as defense-in-depth behind hardware SDIO_STA_DTIMEOUT
  • These are init-only code paths (not in PID loop), so the large counter is acceptable
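
The defense-in-depth idea for the FIFO read loops can be sketched as below: the hardware timeout flag is still checked first, and a software iteration budget ends the loop if no flag ever appears. Everything here is a hypothetical simulation. The `FAKE_STA_*` bits, `fakeSdio_t`, `fakeStatus()` and `fakeReadFifo()` are illustrative, and `FAKE_SW_TIMEOUT_SPINS` is an arbitrary small value standing in for `SD_DATATIMEOUT`, whose real value is not given above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status bits and budget; not the real SDIO_STA layout. */
#define FAKE_STA_RXDAVL       (1u << 0) /* data available in FIFO */
#define FAKE_STA_DTIMEOUT     (1u << 1) /* hardware data timeout */
#define FAKE_STA_DBCKEND      (1u << 2) /* data block transfer finished */
#define FAKE_SW_TIMEOUT_SPINS 50000u    /* stand-in for SD_DATATIMEOUT */

typedef enum {
    FAKE_READ_OK,
    FAKE_READ_HW_TIMEOUT,
    FAKE_READ_SW_TIMEOUT
} fakeReadResult_t;

typedef struct {
    uint32_t wordsLeft; /* words still in the simulated FIFO */
    bool     hwTimeout; /* true: hardware reports a data timeout */
    bool     hung;      /* true: status never reports any flag */
} fakeSdio_t;

static uint32_t fakeStatus(const fakeSdio_t *sdio)
{
    if (sdio->hung) {
        return 0;
    }
    if (sdio->hwTimeout) {
        return FAKE_STA_DTIMEOUT;
    }
    return (sdio->wordsLeft > 0) ? FAKE_STA_RXDAVL : FAKE_STA_DBCKEND;
}

/* Drain the FIFO with a software backstop behind the hardware timeout. */
static fakeReadResult_t fakeReadFifo(fakeSdio_t *sdio, uint32_t *dst,
                                     uint32_t dstWords, uint32_t *wordsRead)
{
    uint32_t budget = FAKE_SW_TIMEOUT_SPINS;

    *wordsRead = 0;
    while (budget-- > 0) {
        const uint32_t sta = fakeStatus(sdio);

        if (sta & FAKE_STA_DTIMEOUT) {
            return FAKE_READ_HW_TIMEOUT; /* primary guard: hardware flag */
        }
        if (sta & FAKE_STA_DBCKEND) {
            return FAKE_READ_OK;
        }
        if ((sta & FAKE_STA_RXDAVL) && *wordsRead < dstWords) {
            dst[(*wordsRead)++] = 0xA5A5A5A5u; /* simulated FIFO word */
            sdio->wordsLeft--;
        }
    }
    return FAKE_READ_SW_TIMEOUT; /* backstop: no flag ever appeared */
}
```

Keeping the two timeout results distinct makes it possible to tell a card that timed out normally from a peripheral that stopped reporting status at all.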

Test plan

  • Build for MATEKF405 (F4/SDIO target) — clean
  • Build for OMNIBUSF4V3 (F4/SPI-SD target) — clean
  • Hardware test with known-good SD card (regression)
  • Hardware test with problematic/removed SD card (timeout recovery)

PR Type

Bug fix


Description

  • Add timeout guards to unbounded busy-wait loops in SD card drivers

    • sdcard_spi.c: 100K-iteration timeout in sdcardSpi_deselect() with failure tracking
    • sdmmc_sdio_f4xx.c: 10K-iteration timeout for DMA disable loop
    • sdmmc_sdio_f4xx.c: Software timeout backstops in three FIFO read loops
  • Prevents flight controller lockup from problematic SD cards during blackbox operations


Diagram Walkthrough

flowchart LR
  A["Problematic SD Card"] -->|triggers unbounded loop| B["busIsBusy/DMA/FIFO loops"]
  B -->|without timeout| C["Flight Controller Lockup"]
  B -->|with timeout| D["Graceful Failure Handling"]
  D -->|increment failureCount| E["Disable Card at Threshold"]
  D -->|release resources| F["Continue Operation"]

File Walkthrough

Relevant files
Bug fix
sdcard_spi.c
Add timeout to SPI deselect busy-wait loop                             

src/main/drivers/sdcard/sdcard_spi.c

  • Replace unbounded while (busIsBusy) loop with 100K-iteration timeout
    (at least ~0.6ms at 168MHz, at one or more cycles per iteration)
  • Increment failureCount on timeout and disable card when threshold (8)
    is reached
  • Always release SPI CS line via busDeselectDevice() after timeout
  • Matches existing timeout pattern from sdcardSpi_init()
+11/-1   
sdmmc_sdio_f4xx.c
Add timeouts to DMA and FIFO read loops                                   

src/main/drivers/sdcard/sdmmc_sdio_f4xx.c

  • Add 10K-iteration timeout to DMA disable loop in
    SD_StartBlockTransfert() (at least ~60µs at 168MHz)
  • Add SD_DATATIMEOUT software iteration counter to three FIFO read
    loops: SD_HighSpeed(), SD_GetCardStatus(), and SD_FindSCR()
  • Software timeouts provide defense-in-depth behind hardware
    SDIO_STA_DTIMEOUT
  • Init-only code paths allow larger timeout counters without PID loop
    impact
+14/-1   

Add timeout guards to unbounded busy-wait loops in the SD card drivers.
A problematic SD card could cause sdcardSpi_deselect() to spin forever
on busIsBusy(), freezing the entire FC since blackbox runs inline with
the PID loop.

- sdcard_spi.c: Add 100K-iteration timeout to sdcardSpi_deselect(),
  matching the existing pattern in sdcardSpi_init(). On timeout,
  increment failureCount and disable card at threshold.
- sdmmc_sdio_f4xx.c: Add 10K-iteration timeout to DMA disable loop.
  Add SD_DATATIMEOUT software backstop to three FIFO read loops
  (SD_HighSpeed, SD_GetCardStatus, SD_FindSCR) as defense-in-depth
  behind the hardware SDIO_STA_DTIMEOUT.

qodo-code-review bot commented Feb 9, 2026

PR Compliance Guide 🔍

All compliance sections have been disabled in the configurations.

If the DMA stream remains enabled after the timeout, set
TransferError and return early to avoid configuring a
still-active DMA stream.
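
One way to read this suggestion is sketched below, as a simulation rather than a patch. `fakeSdHandle_t`, `fakeSdStatus_t` and `fakeStartBlockTransfer()` are hypothetical, as are the type of `transferError` and the return value: the suggestion only names `TransferError` and an early return, and the real function's signature is not shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical types; the real driver's handle and error flag may differ. */
typedef enum { FAKE_SD_OK = 0, FAKE_SD_ERROR } fakeSdStatus_t;

typedef struct {
    bool              streamEnabled; /* simulated enable bit readback */
    bool              streamStuck;   /* true: the bit never clears */
    bool              configured;    /* set once the stream is reprogrammed */
    volatile uint32_t transferError; /* stand-in for TransferError */
} fakeSdHandle_t;

static fakeSdStatus_t fakeStartBlockTransfer(fakeSdHandle_t *h)
{
    uint32_t spins = 10000u; /* iteration budget from the PR text */

    /* Bounded wait for the stream to report disabled. */
    while (h->streamEnabled && spins > 0) {
        if (!h->streamStuck) {
            h->streamEnabled = false; /* simulated hardware clearing the bit */
        }
        spins--;
    }

    if (h->streamEnabled) {
        /* Still active after the timeout: flag the error and do not
           touch the stream configuration. */
        h->transferError = 1;
        return FAKE_SD_ERROR;
    }

    h->configured = true; /* safe to program the stream now */
    return FAKE_SD_OK;
}
```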
@sensei-hacker sensei-hacker added this to the 9.1 milestone Feb 9, 2026