@serhiy-katsyuba-intel serhiy-katsyuba-intel commented May 15, 2025

The HDA DMA hardware keeps track of data in the DMA buffer using hardware Read and Write Position registers. The software uses struct audio_stream, with its w_ptr and r_ptr, for a similar purpose. If the software w_ptr and r_ptr do not point to the same locations as the hardware Write and Read Position registers, audio glitches can occur.

Such desynchronization happens upon a pipeline reset. The dma_buffer is freed during a reset in dai-zephyr or host-zephyr and reallocated in prepare() when the pipeline resumes. The reallocated dma_buffer has w_ptr and r_ptr set to NULL, while the hardware HDA DMA Read and Write Position registers retain their values.

If, for example, in the DAI playback case, the difference between the Write Position register and w_ptr is more than one period, the problem can easily go unnoticed because the hardware simply copies older data. When the difference is more than 0 but less than one period, the DMA copies part of the new data and part of the old data, resulting in glitches.

This fix ensures that the software w_ptr and r_ptr stay in sync with the hardware HDA DMA Write and Read Position registers.
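
For illustration only, here is a minimal sketch of the idea, not the code from this PR. `struct ring_buf`, `struct hw_pos`, `ring_buf_sync_to_hw()` and `ring_buf_w_ptr_lag()` are hypothetical names; `struct ring_buf` is a simplified stand-in for struct audio_stream, and the hardware positions are assumed to be byte offsets from the start of the DMA buffer. The sketch also assumes a freshly set-up buffer whose pointers sit at the buffer base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct audio_stream. */
struct ring_buf {
	uint8_t *addr;  /* start of the DMA buffer */
	size_t size;    /* buffer size in bytes */
	uint8_t *w_ptr; /* software write pointer */
	uint8_t *r_ptr; /* software read pointer */
};

/* Hypothetical snapshot of the hardware position registers (byte offsets). */
struct hw_pos {
	size_t write_position;
	size_t read_position;
};

/* Move the software pointers to the locations the hardware reports. */
static void ring_buf_sync_to_hw(struct ring_buf *rb, const struct hw_pos *pos)
{
	assert(rb->size > 0);
	rb->w_ptr = rb->addr + pos->write_position % rb->size;
	rb->r_ptr = rb->addr + pos->read_position % rb->size;
}

/* Bytes by which the hardware write position is ahead of w_ptr, with wrap. */
static size_t ring_buf_w_ptr_lag(const struct ring_buf *rb,
				 size_t hw_write_position)
{
	size_t sw = (size_t)(rb->w_ptr - rb->addr);

	return (hw_write_position % rb->size + rb->size - sw) % rb->size;
}
```

With a 384-byte buffer and a retained hardware write position of 200, the lag is 200 bytes before the sync and 0 afterwards. A non-zero lag smaller than one period corresponds to the glitch case described above.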

@serhiy-katsyuba-intel serhiy-katsyuba-intel marked this pull request as draft May 16, 2025 08:10
serhiy-katsyuba-intel commented May 16, 2025

The CI failures are on MTL and TGL. Zephyr's GPDMA does not return write_position and read_position in dma_status :( Temporarily switching to draft.

Update: Now fixed with new version.

Signed-off-by: Serhiy Katsyuba <serhiy.katsyuba@intel.com>
@serhiy-katsyuba-intel serhiy-katsyuba-intel changed the title zephyr-dma: Fix HW DMA position regs going out of sync with w_ptr/r_ptr. hda-dma: Fix HDA DMA position regs going out of sync with w_ptr/r_ptr May 16, 2025
@serhiy-katsyuba-intel serhiy-katsyuba-intel marked this pull request as ready for review May 16, 2025 12:26
@kv2019i kv2019i left a comment

Did you see this in some existing test case? Do you think we should cherry-pick this to v2.13?

@serhiy-katsyuba-intel
Contributor Author

Did you see this in some existing test case? Do you think we should cherry-pick this to v2.13?

Yes. Bug report: https://hsdes.intel.com/appstore/article/#/18041236600 (sorry, it's an Intel-internal link).
The topology is trivial and I can reproduce the problem easily. The reproduction rate is about 15-20%, depending on how big the difference between the software and HW positions is. If it is more than one period, the issue is hidden because the output is still a clean sine wave; if the difference is less than one period, glitches containing two (or three) shifted sine waves are observed. The issue is related to HDA DMA. I see it on PTL, and the QA team says they also observe it on LNL.

@kv2019i kv2019i merged commit a84870f into thesofproject:main May 19, 2025
40 of 48 checks passed