
CPU optimization: background thread + decimation #31

Merged
user1303836 merged 1 commit into main from feature/cpu-optimization
Feb 17, 2026


@user1303836 (Owner)

Summary

Reduces CPU usage from ~10% to <1% per plugin instance by addressing the root cause: three 4096-point FFTs running 331 times per second on the audio thread.

Architecture change

  • Pitch analysis moved to a dedicated worker thread via lock-free SPSC FIFO (juce::AbstractFifo)
  • Audio thread per-sample work reduced to: 1 FIFO write + 2 atomic reads
  • Worker thread handles all FFT, YIN, and decimation work off the real-time path
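
The hand-off described above can be sketched as a minimal single-producer/single-consumer queue. This is a standalone illustration built on std::atomic, not the PR's code: the PR uses juce::AbstractFifo, and the class name `SpscFloatFifo` and its `push`/`pop` methods are hypothetical.

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the juce::AbstractFifo-based queue in the PR.
// Safe only with exactly one producer thread and one consumer thread.
class SpscFloatFifo
{
public:
    // One slot is kept empty so that "full" and "empty" can be told apart.
    explicit SpscFloatFifo (std::size_t capacity) : buffer (capacity + 1) {}

    // Producer (audio thread): no locks, no allocation. Returns false if full.
    bool push (float sample) noexcept
    {
        const auto w = writePos.load (std::memory_order_relaxed);
        const auto next = (w + 1 == buffer.size()) ? 0 : w + 1;
        if (next == readPos.load (std::memory_order_acquire))
            return false;                       // full: caller drops the sample
        buffer[w] = sample;
        writePos.store (next, std::memory_order_release);
        return true;
    }

    // Consumer (worker thread): returns false if there is nothing to read.
    bool pop (float& out) noexcept
    {
        const auto r = readPos.load (std::memory_order_relaxed);
        if (r == writePos.load (std::memory_order_acquire))
            return false;                       // empty
        out = buffer[r];
        readPos.store ((r + 1 == buffer.size()) ? 0 : r + 1,
                       std::memory_order_release);
        return true;
    }

private:
    std::vector<float> buffer;
    std::atomic<std::size_t> writePos { 0 };
    std::atomic<std::size_t> readPos { 0 };
};
```

In this sketch the audio thread would call `push` once per sample and the worker would drain with `pop` before each analysis. Results travelling the other way could be published through plain `std::atomic<float>` members, which matches the "2 atomic reads" described above.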

Algorithmic optimizations

  • Hop size: 3ms -> 12ms (~4x fewer analyses/sec, from 331 to ~83)
  • 2x decimation: Halfband FIR filter downsamples 44.1kHz -> 22.05kHz before analysis, halving FFT size from 4096 to 2048
  • Wavetable sine: 2048-entry LUT with linear interpolation replaces std::sin() per sample
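
As a rough illustration of the 2x decimation step, here is a minimal 7-tap halfband decimator. The coefficients ([-1, 0, 9, 16, 9, 0, -1] / 32, a common textbook halfband design) are an assumption, since the PR does not list the ones used in HalfbandDecimator.h, and the class name is hypothetical.

```cpp
#include <array>
#include <cstddef>

// Illustrative 7-tap halfband FIR decimator (coefficients are assumed,
// not taken from the PR). Produces one output for every two inputs.
class HalfbandDecimatorSketch
{
public:
    // Feed one input sample. Returns true and writes `out` on every second call.
    bool process (float in, float& out) noexcept
    {
        // Shift the 7-sample delay line; the newest sample sits at index 0.
        for (std::size_t i = history.size() - 1; i > 0; --i)
            history[i] = history[i - 1];
        history[0] = in;

        phase = ! phase;
        if (phase)
            return false;   // skip every other output: this is the 2x decimation

        // Taps 1 and 5 are zero in a halfband filter, so only 5 terms remain.
        out = kOuter  * (history[0] + history[6])
            + kInner  * (history[2] + history[4])
            + kCentre * history[3];
        return true;
    }

private:
    static constexpr float kOuter  = -1.0f / 32.0f;
    static constexpr float kInner  =  9.0f / 32.0f;
    static constexpr float kCentre =  0.5f;
    std::array<float, 7> history {};
    bool phase = false;
};
```

The taps sum to 1, so a constant input passes through at unity gain. At 44.1kHz input the output rate is 22.05kHz, so an analysis window covering the same time span needs half as many samples, which is where the 4096 -> 2048 FFT size reduction comes from.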

Code-level optimizations

  • Two-memcpy buffer linearization (eliminates per-element modulo in ring buffer read)
  • FloatVectorOperations::clear for SIMD-accelerated FFT buffer zeroing
  • Cached exp2 in PitchSmoother (recomputes only when smoothed value changes)
  • Modulo replaced with branch in ring buffer write
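
Two of the items above, the branch-based wrap on write and the two-memcpy linearization on read, can be sketched together. The struct and member names below are hypothetical and the buffer size is assumed to be non-zero; this is not the PR's ring buffer.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical ring buffer; names are illustrative, not the PR's.
struct RingBufferSketch
{
    explicit RingBufferSketch (std::size_t size) : data (size, 0.0f) {}

    // Write one sample; wrap with a compare-and-reset instead of `% size`.
    void write (float sample) noexcept
    {
        data[writePos] = sample;
        if (++writePos == data.size())
            writePos = 0;
    }

    // Copy the whole buffer, oldest sample first, into `dest`
    // (which must hold data.size() floats) using two memcpy calls
    // instead of a per-element `(start + i) % size` loop.
    void linearize (float* dest) const noexcept
    {
        const std::size_t tail = data.size() - writePos;  // oldest sample .. end of storage
        std::memcpy (dest, data.data() + writePos, tail * sizeof (float));
        std::memcpy (dest + tail, data.data(), writePos * sizeof (float));
    }

    std::vector<float> data;
    std::size_t writePos = 0;
};
```

Once the buffer has been filled at least once, `writePos` points at the oldest sample, so the first copy takes the older segment and the second takes the newer one.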

Files changed

| File | Change |
| --- | --- |
| source/dsp/HalfbandDecimator.h | New: 7-tap halfband FIR decimation filter |
| source/dsp/YinPitchDetector.h | FIFO, thread, atomic members; inline feedSample/getResult |
| source/dsp/YinPitchDetector.cpp | AnalysisThread inner class, decimation, code-level fixes |
| source/dsp/PitchSmoother.h | Cached exp2 result |
| source/dsp/Oscillator.cpp | Wavetable sine lookup |
| tests/TestYinPitchDetector.cpp | flushForTest() for async, updated timing helpers |

PluginProcessor.cpp is unchanged -- the public API (prepare, feedSample, getResult) is preserved.

Test plan

  • All 34 tests pass (441,095 assertions)
  • Full plugin builds (VST3)
  • pluginval passes at strictness 5
  • CI passes on macOS and Windows
  • Manual test: confirm pitch tracking works identically in DAW

Move pitch analysis to a dedicated worker thread via SPSC FIFO
(juce::AbstractFifo). The audio thread now only writes samples to
the FIFO and reads two atomic floats, removing all FFT work from
the real-time path.

Additional optimizations applied:
- Increase hop size from 3ms to 12ms (~4x fewer analyses/sec)
- 2x decimation via halfband FIR filter (44.1kHz -> 22.05kHz),
  halving FFT size from 4096 to 2048
- Two-memcpy buffer linearization (eliminates per-element modulo)
- FloatVectorOperations::clear for FFT buffer zeroing
- Cached exp2 in PitchSmoother (recomputes only when smoothed changes)
- Wavetable sine oscillator (2048-entry LUT with linear interpolation)

Expected CPU reduction: ~10% -> <1% total system.
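
For reference, the wavetable sine item (2048-entry LUT with linear interpolation) could look roughly like the sketch below. The class name, the extra guard entry, and the phase convention (cycles in [0, 1)) are assumptions made for illustration; Oscillator.cpp may differ.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Illustrative wavetable sine; layout and naming are assumptions.
class WavetableSineSketch
{
public:
    WavetableSineSketch()
    {
        const double twoPi = 6.283185307179586;
        // kSize + 1 entries: the last one duplicates the first so that
        // interpolation near the end of the table never has to wrap.
        for (std::size_t i = 0; i <= kSize; ++i)
            table[i] = static_cast<float> (std::sin (twoPi * double (i) / double (kSize)));
    }

    // `phase` is measured in cycles and must lie in [0, 1).
    float lookup (float phase) const noexcept
    {
        const float pos = phase * static_cast<float> (kSize);
        const auto index = static_cast<std::size_t> (pos);
        const float frac = pos - static_cast<float> (index);
        // Linear interpolation between the two neighbouring table entries.
        return table[index] + frac * (table[index + 1] - table[index]);
    }

private:
    static constexpr std::size_t kSize = 2048;
    std::array<float, kSize + 1> table {};
};
```

The table is filled once with std::sin() at construction; each lookup afterwards costs one multiply, one truncation and one interpolation instead of a std::sin() call per sample.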
@user1303836 merged commit 70978c3 into main on Feb 17, 2026
2 checks passed