Fix pitch detection and DSP performance issues#8
Conversation
- Sum L+R to mono for pitch analysis instead of channel 0 only - Add input decimation (2x at <=50kHz, 4x above) to reduce YIN CPU cost - Skip pitch detection entirely in manual mode - Gate wet signal until first valid pitch lock to avoid default 440Hz - Make smoothing alpha sample-rate independent via time constant
user1303836
left a comment
There was a problem hiding this comment.
PR #8 Review: Fix pitch detection and DSP performance issues
Verdict: Looks good. All 5 issues from #7 are correctly addressed. No blockers.
This PR addresses the performance and correctness issues identified in issue #7 with clean, focused changes across 4 files. Each fix is well-scoped and doesn't introduce unnecessary complexity.
Issue-by-issue assessment
| Issue | Fix | Assessment |
|---|---|---|
| 1. Mono sum for pitch analysis | Average L+R channels | Correct. Handles mono (1 channel) gracefully with the numChannels > 1 guard. |
| 2. YIN CPU reduction | Decimation (2x at <=50kHz, 4x above) | Correct. Reduces analysis cost significantly while keeping the same 2048-sample window at the reduced rate. |
| 3. Skip pitch detection in manual mode | feedSample() moved inside mode == 0 branch |
Correct. Zero CPU overhead in manual mode. Simple and clean. |
| 4. Gate wet signal before first valid pitch | hasValidPitch flag, reset in prepareToPlay |
Correct. Prevents unintended 440Hz ring mod before first lock. |
| 5. Sample-rate independent smoothing | alpha = 1 - exp(-1/(sr*tau)) with tau = amount * 200ms |
Correct. Standard one-pole time constant formula. Consistent behavior across sample rates. |
What's done well
pitchSmoother.prepare(sampleRate)is now called inprepareToPlay-- this was missing before and is needed for the new sample-rate dependent alpha- Channel pointer hoisting (from my PR #6 review suggestion) is incorporated
oscFreqinitialization to0.0f(from my PR #6 review suggestion) is incorporated via restructured flow- Atomic stores moved outside the per-sample loop (from my PR #6 review suggestion)
- The restructured
processBlockis cleaner than before -- mode branching is now at the right level
Minor observations (see inline comments)
| Priority | Item | Location |
|---|---|---|
| Low | Decimation without anti-alias filter -- probably fine for guitar, noted the tradeoff | YinPitchDetector.cpp:25 |
| Low | First sample after prepare() is always skipped due to pre-increment |
YinPitchDetector.cpp:25 |
| Low | channelReadPtrs[2] / channelPtrs[2] hardcoded to 2 -- safe given bus layout constraints |
PluginProcessor.cpp:113 |
Merge order
This is the only open PR. Targets main directly. Ready to merge.
|
|
||
| void YinPitchDetector::feedSample(float sample) | ||
| { | ||
| if (++decimationCounter < decimation) |
There was a problem hiding this comment.
Decimation without anti-alias filtering. The decimation skips every Nth sample but doesn't low-pass filter first. This means frequencies above analysisSR / 2 (e.g., above ~11kHz at 44.1kHz with decimation=2) will alias into the analysis window, potentially causing false pitch detections on harmonically rich signals (distorted guitar, for example).
In practice, the YIN algorithm is fairly robust against this because it works on autocorrelation rather than spectral peaks -- aliased energy doesn't create strong periodic patterns at the wrong period. And the 20-5000Hz frequency bounds filter would reject most garbage. So this is probably fine for guitar/bass signals.
But if you want to be thorough, a simple single-pole lowpass at analysisSR / 2 before decimation would eliminate the risk:
// In feedSample, before decimation:
decimationFilter = decimationFilter * 0.5f + sample * 0.5f; // crudeNot blocking -- the pragmatic approach here is valid given the use case.
|
|
||
| void YinPitchDetector::feedSample(float sample) | ||
| { | ||
| if (++decimationCounter < decimation) |
There was a problem hiding this comment.
decimationCounter is pre-incremented, so the first sample after prepare() is always skipped. After prepare() sets decimationCounter = 0, the first call to feedSample increments to 1, which is < decimation (2 or 4), so it returns early. The first sample that actually enters the buffer is sample #2 (or #4).
This is a minor off-by-one that doesn't affect correctness (skipping one extra sample at startup is inaudible), but if you want the first sample to always be processed, initialize decimationCounter = decimation - 1 in prepare(), or use post-increment logic:
buffer[...] = sample;
++writePos;
// ... existing logic ...
decimationCounter = 0; // at the endNot blocking.
| void analyse(); | ||
|
|
||
| double sr = 44100.0; | ||
| double analysisSR = 44100.0; |
There was a problem hiding this comment.
Good: analysisSR clearly communicates that this is the effective sample rate after decimation, not the host sample rate. The naming makes the freq = analysisSR / betterTau calculation self-documenting.
|
|
||
| auto* inputData = buffer.getReadPointer(0); | ||
|
|
||
| const float* channelReadPtrs[2] = {}; |
There was a problem hiding this comment.
Mono sum is correct but assumes max 2 channels. The channelReadPtrs[2] and channelPtrs[2] arrays are hardcoded to size 2. isBusesLayoutSupported only allows mono and stereo, so this is safe. But if bus layout support ever expands, this would silently overflow.
A defensive approach:
auto numChans = juce::jmin(numChannels, 2);before the loop, or use jassert(numChannels <= 2). Current code is fine given the constraints, just noting the assumption.
|
|
||
| pitchDetector.feedSample(monoSample); | ||
|
|
||
| auto result = pitchDetector.getResult(); |
There was a problem hiding this comment.
Good fix. The hasValidPitch gate prevents ring modulation at the oscillator's default 440Hz before the detector has locked onto a real pitch. Once a valid pitch is found, the flag stays true and "hold-last-good" behavior takes over via the smoother.
One edge case: if the user switches from Manual mode (where hasValidPitch is irrelevant) to Pitch Track mode mid-session, hasValidPitch might still be true from a previous pitch track session. This is actually correct behavior -- the oscillator holds whatever frequency it last had, and the detector will update it as soon as it gets a new lock. No issue here.
| @@ -30,6 +39,17 @@ class PitchSmoother | |||
| } | |||
|
|
|||
| private: | |||
There was a problem hiding this comment.
Good: sample-rate independent smoothing. The formula alpha = 1 - exp(-1 / (sr * tau)) with tau = amount * 0.2f (200ms max) is the standard one-pole time constant approach. At amount = 0, tau < 1e-6f triggers alpha = 1.0f (no smoothing, instant tracking). At amount = 1.0, tau = 200ms, which gives very smooth tracking. The behavior is now consistent across sample rates.
The 1e-6f guard prevents division by zero / extreme alpha values. Good.
One thought: recomputeAlpha() is called from both prepare() and setSmoothingAmount(). Since setSmoothingAmount is called every processBlock, and prepare is called rarely, this is fine -- the exp call per block is negligible.
| } | ||
|
|
||
| for (int i = 0; i < numSamples; ++i) | ||
| { |
There was a problem hiding this comment.
Issue 3 (skip pitch detection in manual mode) is correctly addressed. The pitchDetector.feedSample() call is now inside the if (mode == 0) branch, so manual mode incurs zero pitch detection overhead.
The issue description mentioned potentially keeping tracking "warm" in manual mode with reduced cost, but fully skipping is the simpler and cleaner approach. If the user switches back to pitch track mode, the detector just needs to fill its buffer (~46ms at 44.1kHz after decimation) before delivering a result -- a barely perceptible delay.
Addresses all 5 issues from #7.
Changes
Issue 1 - Mono sum for pitch analysis: Averages L+R channels instead of using channel 0 only. Prevents wrong tracking when the right channel carries the dominant signal or stereo content is phase-skewed.
Issue 2 - YIN CPU reduction via decimation: Feeds every 2nd sample (or every 4th at >50kHz) to the pitch detector. Reduces analysis cost by 2-8x depending on sample rate while maintaining the same 20-5000Hz detection range. Window size is now always 2048 at the decimated rate.
Issue 3 - Skip pitch detection in manual mode:
feedSample()is no longer called when mode is Manual, eliminating unnecessary CPU usage.Issue 4 - Gate wet signal before first valid pitch: Added
hasValidPitchflag that prevents ring modulation at the default 440Hz before the detector has locked onto a real pitch. Once a valid pitch is detected, hold-last-good behavior continues as before.Issue 5 - Sample-rate independent smoothing: Smoothing alpha is now computed from a time constant (
tau = amount * 200ms) usingalpha = 1 - exp(-1/(sr*tau)). Behavior is consistent across 44.1k, 48k, 96k, etc.Closes #7