too much time cost for compute FFT on Esp32-C3 #77

Open
blacknull opened this issue Feb 22, 2023 · 7 comments

@blacknull

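Here's my FFT code, using 256 samples at an 8000 Hz sampling frequency:

```cpp
unsigned long timeLoop = 0;   // running estimate of the FFT time, in microseconds
unsigned long mbegin = 0;     // timestamp taken at the start of each loop (declaration assumed)
unsigned long countLoop = 0;  // loop counter used to throttle serial output (declaration assumed)

// FFT and USB_SERIAL are assumed to be set up elsewhere in the sketch.
void loop() {
  mbegin = micros();

  // Compute FFT
  FFT.DCRemoval();
  FFT.Windowing(FFT_WIN_TYP_HAMMING, FFT_FORWARD);
  FFT.Compute(FFT_FORWARD);
  FFT.ComplexToMagnitude();

  // Average of the previous estimate and the newest measurement
  timeLoop = (micros() - mbegin + timeLoop) / 2;
  if (++countLoop % 100 == 0) {  // print once every 100 loops
    USB_SERIAL.println("runFFT cost: " + String(timeLoop) + " micro seconds");
  }
}
```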

It works fine on the ESP8266 (160 MHz), where it takes 20 ms, and on the ESP32, where it takes 10 ms.
But on the ESP32-C3:
runFFT cost: 115538 micro seconds
runFFT cost: 115510 micro seconds
runFFT cost: 115474 micro seconds
runFFT cost: 115286 micro seconds
runFFT cost: 114535 micro seconds

That's unacceptable.
The code is the same on all three boards, and I don't know what's wrong with it. Can anybody help?
Thanks.
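
A note on the reported numbers: the `timeLoop` update above halves the sum of the previous estimate and the newest measurement, which is an exponential moving average with weight 1/2 rather than a mean over all loops. A minimal sketch of that arithmetic in plain C++ follows; the helper names `updateAverage` and `settle` are hypothetical and not part of the original sketch.

```cpp
#include <cstdint>

// Hypothetical helper: same arithmetic as the timeLoop update in loop().
std::uint32_t updateAverage(std::uint32_t previous, std::uint32_t sample) {
    return (previous + sample) / 2;
}

// Feeds the same sample repeatedly, starting from an estimate of 0,
// to show how quickly the estimate approaches the sample value.
std::uint32_t settle(std::uint32_t sample, int iterations) {
    std::uint32_t estimate = 0;
    for (int i = 0; i < iterations; ++i) {
        estimate = updateAverage(estimate, sample);
    }
    return estimate;
}
```

Starting from 0, the estimate reaches half the true value after one loop and is within about 0.1% of it after ten, so the values printed every 100 loops mostly reflect the most recent iterations.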

@kosme
Owner

kosme commented Feb 22, 2023 via email

@blacknull
Author

Thanks for your reply.
I'm sorry for not making myself clear.
The code works fine on the ESP8266 and the ESP32,
but on the ESP32-C3, which has a 160 MHz RISC-V CPU, it is 10 times slower than on the ESP32.
I switched the CPU from 160 MHz to 120 MHz and 80 MHz; that made things worse, not better.
The function FFT.Compute(FFT_FORWARD) takes most of the time.
When I looked at the code in FFT.Compute(), only sqrt() seemed to be expensive,
so I timed 100 calls to sqrt(). They took only 615 us, so sqrt() is not the culprit.
Then what's going wrong?
Thanks.
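
When timing a tight loop of `sqrt()` calls like the test described above, the compiler may drop or hoist calls whose results are never used, which can make the function look cheaper than it is. The following is a minimal host-side sketch of one way to guard against that, assuming `std::chrono::steady_clock` in place of Arduino's `micros()`; the names `timeMicros`, `timeSqrt`, and `g_sink` are hypothetical.

```cpp
#include <chrono>
#include <cmath>

// Hypothetical helper: total wall-clock microseconds spent in `reps` calls of `fn`.
// `fn` receives the iteration index so each call can use a different input.
template <typename Fn>
long long timeMicros(Fn&& fn, int reps) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < reps; ++i) {
        fn(i);
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}

// Results are accumulated into a volatile variable so the compiler
// cannot discard the sqrt() calls as dead code.
volatile double g_sink = 0.0;

// Times `reps` calls to sqrt() on varying inputs.
long long timeSqrt(int reps) {
    return timeMicros(
        [](int i) { g_sink = g_sink + std::sqrt(static_cast<double>(i) + 1.0); },
        reps);
}
```

On the board itself, the same pattern applies with `micros()` around the loop. Timing each FFT stage (windowing, compute, magnitude) separately in this way would show where the time actually goes.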

@HorstBaerbel

Try the develop branch.

@kosme
Owner

kosme commented Jul 25, 2023

@blacknull Is the issue still present on the newer versions?

@kosme
Owner

kosme commented Dec 28, 2023

Since you mention that other esp32 boards work correctly, I would think that this is likely a hardware/esp core problem. As I asked previously, is the issue still present?

@softhack007

softhack007 commented Jan 11, 2024

> on esp32-c3 which has a risc-v 160Mhz cpu, is 10 times slow than esp32.

@blacknull I think what you see is normal behaviour, as the -C3 is very slow, especially when doing floating-point math.

We are using ArduinoFFT inside WLED audioreactive; our performance comparisons point in the same direction as what you observed.

Our explanation is: esp32 (the "classic" one) and ESP32-S3 both have an FPU (floating-point hardware acceleration), while ESP32-S2 and ESP32-C3 lack an FPU and have to use software emulation, which is slow.

The performance drop from esp32 to ESP32-S2 is about 6x, with another 2x drop when using ESP32-C3.

Btw, we are using the develop branch, which allows us to perform everything in float instead of double -> 8x faster!

For a forward FFT with 512 samples, we typically see these execution times:

  • classic esp32, and esp32-S3: 2-3ms
  • esp32-S2: 12-16ms
  • esp32-c3: 26-32ms
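
To compare float and double on a given board, the same FFT code can be instantiated for both scalar types and timed. The sketch below is a minimal, self-contained radix-2 FFT templated on the scalar type; it is an illustrative stand-in and not the ArduinoFFT implementation, and the names `fftRadix2`, `makeSine`, and `peakBin` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// In-place iterative radix-2 FFT on separate real and imaginary arrays.
template <typename T>
void fftRadix2(std::vector<T>& re, std::vector<T>& im) {
    const std::size_t n = re.size();
    assert(im.size() == n);
    assert(n != 0 && (n & (n - 1)) == 0);  // length must be a power of two

    // Bit-reversal permutation.
    for (std::size_t i = 1, j = 0; i < n; ++i) {
        std::size_t bit = n >> 1;
        for (; j & bit; bit >>= 1) {
            j ^= bit;
        }
        j ^= bit;
        if (i < j) {
            std::swap(re[i], re[j]);
            std::swap(im[i], im[j]);
        }
    }

    // Butterfly stages; all arithmetic is done in type T.
    const T pi = static_cast<T>(3.14159265358979323846);
    for (std::size_t len = 2; len <= n; len <<= 1) {
        const T angle = -2 * pi / static_cast<T>(len);
        const T stepRe = std::cos(angle);
        const T stepIm = std::sin(angle);
        for (std::size_t start = 0; start < n; start += len) {
            T wRe = 1, wIm = 0;  // current twiddle factor
            for (std::size_t k = 0; k < len / 2; ++k) {
                const std::size_t a = start + k;
                const std::size_t b = a + len / 2;
                const T tRe = re[b] * wRe - im[b] * wIm;
                const T tIm = re[b] * wIm + im[b] * wRe;
                re[b] = re[a] - tRe;
                im[b] = im[a] - tIm;
                re[a] += tRe;
                im[a] += tIm;
                const T nextRe = wRe * stepRe - wIm * stepIm;
                wIm = wRe * stepIm + wIm * stepRe;
                wRe = nextRe;
            }
        }
    }
}

// Test signal: a unit-amplitude sine with `cycles` whole periods in n samples.
template <typename T>
std::vector<T> makeSine(std::size_t n, std::size_t cycles) {
    std::vector<T> out(n);
    const T twoPi = static_cast<T>(6.28318530717958647692);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = std::sin(twoPi * static_cast<T>(cycles) * static_cast<T>(i) /
                          static_cast<T>(n));
    }
    return out;
}

// Index of the strongest bin among 1 .. n/2 - 1 (DC excluded).
template <typename T>
std::size_t peakBin(const std::vector<T>& re, const std::vector<T>& im) {
    std::size_t best = 1;
    T bestPower = 0;
    for (std::size_t k = 1; k < re.size() / 2; ++k) {
        const T power = re[k] * re[k] + im[k] * im[k];
        if (power > bestPower) {
            bestPower = power;
            best = k;
        }
    }
    return best;
}
```

With 256 samples at 8000 Hz, a 1000 Hz tone spans 32 whole periods, so `makeSine<float>(256, 32)` should peak at bin 32 for either scalar type. Timing `fftRadix2<float>` against `fftRadix2<double>` on the target board (for example with `micros()`) gives a direct measure of what the wider type costs there; on a desktop CPU the gap is usually small, so the host run only checks correctness.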

@robertlipe

Drive-by comment, confirming Softhack007's comment.

Neither the S2 nor the C3 has a hardware FPU. Citing
https://docs.espressif.com/projects/esp-idf/en/stable/esp32c3/api-guides/performance/speed.html
"Avoid using floating point arithmetic float. On ESP32-C3 these calculations are emulated in software and are very slow."
The S2 documentation has the same sentences at
https://docs.espressif.com/projects/esp-idf/en/v5.4/esp32s2/api-guides/performance/speed.html

I'd suggest closing this, as there's little that this library can do to change that. S3 boards are like $4 and their FPU performance is awesome. For the modern RISC-V parts, you have to go up to the P4 (not generally available in mass quantities) to regain the single-precision FPU that the 2016 generation of Xtensa parts had. It's a pretty dumb decision on Espressif's part, but it's unlikely that any reader here buys enough parts to get Espressif to change their mind and build a dual-core RISC-V unit with radios and FPU for $1.50. :-)

Sorry, Blacknull. It sounds like you're just using the wrong parts for whatever you're doing.
