Collaborator

@kv2019i kv2019i commented Aug 13, 2025

The current code waits until host DMA has more than half of the DMA buffer size worth of data available for transfer. However, it does not check whether link DMA has space for all of the available data yet, so the link DMA write pointer can wrap past the read pointer. This breaks delay reporting and can lead to link xruns if the wrapped write pointer ends up too close to the link DMA read position (which is moved by DMA hardware in the playback case).

Tested-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
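
As a minimal sketch of the idea described above (not the actual SOF code), the size of the first transfer can be clamped to whatever the link side has room for. The helper name `initial_reload_bytes()` is hypothetical; only `host_avail_bytes` and `link_free_bytes` come from this PR.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the SOF sources.
 * Returns the number of bytes that can be handed over on the first
 * reload without moving the write position past the link DMA read
 * position: never more than the host has produced, and never more
 * than the link side has freed.
 */
static inline uint32_t initial_reload_bytes(uint32_t host_avail_bytes,
					    uint32_t link_free_bytes)
{
	return host_avail_bytes < link_free_bytes ?
		host_avail_bytes : link_free_bytes;
}
```

With the values from the register dump later in this thread, a host with 7104 bytes available and a link side with 6016 bytes free would be limited to a 6016-byte transfer.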
Copilot AI review requested due to automatic review settings August 13, 2025 13:32

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Member

@lgirdwood lgirdwood left a comment

Good find.

 ret = dma_reload(cd->chan_link->dma->z_dev,
                  cd->chan_link->index, 0, 0,
-                 half_buff_size);
+                 MIN(host_avail_bytes, link_free_bytes));
Collaborator

What is an "initial reload"? If this is the beginning of streaming, then why wouldn't the "sink side" have enough space for the data? Shouldn't it be empty at that time? And why isn't the same logic needed after the first transfer?

Collaborator Author

@lyakh When dma_start() is called on link DMA, the buffer is full of data (link_avail_bytes=0, no room to write more). It is filled with zeroes when the host sends the CHAIN_DMA IPC to start the DMAs.

The same physical buffer is used for both host and link DMA, so we cannot call host reload until link DMA has moved enough to make room.

The same check is done for the subsequent dma_reload() calls on L268-269 in this function. The only place where we could potentially overrun the link DMA pointer was this initial write. It often didn't fail immediately, which made this harder to debug (the error surfaced some time later).

Collaborator

@kv2019i Ah, so it's referring to the reloading after some space is freed in the initial silence-filled buffer? OK, if my understanding is correct, that makes sense.

Collaborator Author

@lyakh Ack, here's how it looks from DMA register dumps (playback, link dgbrp, host dgbwp):

buffer size 14208 (5ms buffer), link DMA reads from this, host DMA writes to it
0ms   link read   640   host write     0   # zeroes played out, no dma_reload() yet on host
1ms   link read  3136   host write     0   # zeroes played out, no dma_reload() yet on host
2ms   link read  6016   host write  7104   # bug hit, initial 7104 write too much!
3ms   link read  8960   host write  6016   # after xrun handling, write is back to ok distance
4ms   link read 11712   host write  8896   # normal operation, still (should be) playing zeroes
5ms   link read   256   host write 11712   # link DMA wraps around, started playing initial real samples
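
The 2ms row can be checked with simple ring-buffer arithmetic. The sketch below is illustrative only and is not taken from the SOF sources: the function names are hypothetical, and it assumes the read and write positions were sampled at the same instant and that equal positions mean a full buffer (as is the case here, since the buffer starts out filled with zeroes).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helpers for reasoning about the register dump.
 * Free space seen by the writer: the distance from the write position
 * forward to the read position, modulo the buffer size. Equal positions
 * are treated as "full" (0 bytes free), matching a buffer pre-filled
 * with silence.
 */
static uint32_t ring_free_bytes(uint32_t rd, uint32_t wr, uint32_t size)
{
	return (rd + size - wr) % size;
}

/* True if writing len bytes at wr would pass the read position. */
static bool write_overruns(uint32_t rd, uint32_t wr, uint32_t size,
			   uint32_t len)
{
	return len > ring_free_bytes(rd, wr, size);
}
```

At 2ms the link has read 6016 bytes and the host write position is still 0 before the reload, so only 6016 bytes are free. A half-buffer write of 7104 bytes exceeds that by 1088 bytes, whereas a write limited to 6016 bytes would fit.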

Collaborator Author

kv2019i commented Aug 14, 2025

A test run with alsa_conformance_test (Intel internal test plan #56881) shows this PR has fixed all cases of:

[11511.341336] chain_dma: chain_task_run: dma_get_status() link xrun occurred, ret = -32
[11512.590170] chain_dma: chain_task_run: dma_get_status() link xrun occurred, ret = -32
[11512.591168] chain_dma: chain_task_run: dma_get_status() link xrun occurred, ret = -32

These were particularly easy to hit with HDMI sampling rates that are multiples of 44.1 kHz:

aplay -Dhw:0,5 -f S16_LE -c 8 -r44100 -d 30 -t raw /dev/random
aplay -Dhw:0,5 -f S16_LE -c 8 -r88200 -d 30 -t raw /dev/random
aplay -Dhw:0,5 -f S16_LE -c 8 -r176400 -d 30 -t raw /dev/random

@kv2019i kv2019i requested a review from abonislawski August 14, 2025 11:48

@kv2019i kv2019i merged commit 172589b into thesofproject:main Aug 15, 2025
39 of 45 checks passed