Skip to content

Conversation

Schuwi
Copy link

@Schuwi Schuwi commented Jul 20, 2025

Description

This PR fixes critical bugs in the dsps_fird_f32_aes3 optimized assembly implementation that were causing digital noise/corruption in the FIR filter output.

Issues Fixed:

  1. Incorrect register usage in offset_3 branch: The final accumulation was overwriting f4 instead of using f6, causing partial loss of filter results.

  2. Incorrect instruction ordering in non-offset_0 branches: The order of madd.s and EE.LDF.128.IP instructions in the tight FIR loops was leading to out-of-bounds data being added to FIR results..

  3. Potential out-of-bounds memory access: The optimized implementation could read 4 floats beyond the delay line buffer, which could crash if the buffer was placed at a memory boundary.

Root Cause:

The optimized AES3 implementation handles different memory alignments (offset_0, offset_1, offset_2, offset_3) but contained copy-paste errors and subtle timing issues in the non-aligned cases. The ANSI implementation worked correctly, but the optimized version would intermittently produce digital noise when the delay line pointer wasn't 16-byte aligned.

Changes Made:

  • Fixed register allocation bug in .offset_3 section (line 204: madd.s f4, f3, f14madd.s f6, f3, f14)
  • Corrected instruction ordering in .offset_1, .offset_2, and .offset_3 loops for proper coefficient-data alignment
  • Added documentation comments explaining the complex alignment handling logic
  • Cleaned up redundant register loads and improved code clarity

Documentation Update Needed:

The requirement for delay line buffer to be N + 4 floats (not just N) should be documented for dsps_fird_f32_init, as it's currently only mentioned for dsps_fir_f32_init.

Testing

  • Hardware tested: ESP32-S3 with PDM microphone processing at 48kHz with 64-tap FIR decimation filter
  • Verification method: Direct comparison between ANSI and AES3 implementations - both now produce audibly equivalent results
  • Test cases:
    • Different delay line alignments (offset_0, offset_2 branches confirmed working)
    • Real audio input

The fix has been tested in a real-world PDM audio processing application where the digital noise was clearly audible before the fix and completely eliminated after.


Checklist

Before submitting a Pull Request, please ensure the following:

  • 🚨 This PR does not introduce breaking changes.
  • All CI checks (GH Actions) pass.
  • Documentation is updated as needed.
  • Tests are updated or added as necessary.
  • Code is well-commented, especially in complex areas.
  • Git history is clean — commits are squashed to the minimum necessary.

@github-actions github-actions bot changed the title fix: bugs in dsps_fird_f32_aes3 fix: bugs in dsps_fird_f32_aes3 (DSP-167) Jul 20, 2025
@dmitry1945
Copy link
Collaborator

Hi @Schuwi, thank you very much,

I will merge next week.

Dmitry

@Kristian8606
Copy link

@dmitry1945 Does this apply to dsps_fir_f32_aes3 ? I'm using it to process a high-pass filter, today I updated esp-dsp 1.7.0 as well as made hardware changes and now I'm getting distorted sound and I'm not sure where the problem is.

@dmitry1945
Copy link
Collaborator

@Kristian8606 this update for fird only. Fir should work.
To be sure, just switch to the non optimized (ansi) version in menucunfig.

Dmitry

@dmitry1945
Copy link
Collaborator

@Schuwi
I will include your changes, but I can't include commit directly, it will be squashed.
And, I need to include you as file contributor. Please provide info that you would like to see as file contributor, like here:
https://github.com/espressif/esp-dsp/blob/master/modules/fft/float/dsps_fft4r_fc32_aes3_.S#L3

Thanks

@Schuwi
Copy link
Author

Schuwi commented Aug 13, 2025

@Schuwi I will include your changes, but I can't include commit directly, it will be squashed. And, I need to include you as file contributor. Please provide info that you would like to see as file contributor, like here: https://github.com/espressif/esp-dsp/blob/master/modules/fft/float/dsps_fft4r_fc32_aes3_.S#L3

Thanks

That's okay. As file contributor it is fine if you simply put Schuwi. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants