-
Notifications
You must be signed in to change notification settings - Fork 349
Audio: EQ-FIR: Optimize the eq_fir_2x_sx functions #7499
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
SOFCI TEST |
lgirdwood
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whats the before/after MCPS ?
src/audio/eq_fir/eq_fir_hifi3.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this processing is still based on channel data, i.e, try to handle one channel data in this line 123 for loop, then handle another channel, so:
- Possible handle data based on sample? take 2 channel as example.
handle L1R1L2R2
current L1 L2 --> R1 R2
proposed: L1R1L2R2 --> L3R3L4R4.
you can still handle 2 samples for each channel, this will make load/store fully aligned with 64 bit operation.
another disadvantage with current handle is: once the buffer is big, take deep buffer for example.
the cache miss will cost much more cycles than real, image big buffer have to write back and load new data in cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
each channel can have its own coefficient, and it's delay depends on the statistics in this channel, processing by sample will cost much more cycles.
src/audio/eq_fir/eq_fir_hifi3.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm worrying about change of processing order. A run in real device with long FIR would give more confidence that this is better. Maybe also xt testbench could be enough because it has pipelines and scheduler framework from SOF. You could add a test response where left and right channel have different coefficients and make them long. See the FIR EQ examples generator how it's done.
src/include/sof/math/fir_config.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! I didn't realize that HIFI4 build does not have HIFI3 set while they are mostly compatible. I wonder if there's other similar issues in code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually in our config, we enabled HIFI3 while HIFI4 build. Add || XCHAL_HAVE_HIFI4 == 1 is because I'm not sure if the HIFI3 will still enabled while we have HIFI5 or other newer version(but might HIFI4 is enabled). I can check the other code.
|
I got with TGL HiFi3 build of xt testbench these MCPS numbers. It includes overhead of scheduler and pipeline, but excludes file components. Command is "process_test('eq-fir', 32, 32, 48000, 0, 'xt-run');". |
singalsu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. The performance seems to be similar as with previous patch version while this is quite different now.
|
@andrula-song Can you check the CI fails? Otherwise seems ready to go. |
Optimize the eq_fir_2x_sx functions by remove the duplicated usage of circular buffer. Signed-off-by: Andrula Song <andrula.song@intel.com>
|
Mandatory CIs pass, known fails in https://sof-ci.01.org/sofpr/PR7499/build8482/devicetest/index.html and https://sof-ci.01.org/sofpr/PR7499/build8481/devicetest/index.html. Fuzzer has issues now, please do @marc-hb @cujomalainey @andyross ping me when I should again start gating items for the fuzzer. |
Main branch is clean, still not clear (to me) what to do about stable-v2.2, see discussion in #7675 sof-docs also needs an update. |

Optimize the eq_fir_2x_sx functions by remove the duplicated usage of circular buffer.