-
Notifications
You must be signed in to change notification settings - Fork 349
dai-zephyr: fix use-after-free in dai copy #7660
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
dai-zephyr: fix use-after-free in dai copy #7660
Conversation
The dd->local_buffer may be freed between calls to copy if the peer module is unbound. This can only happen if the peer module is on a different pipeline as runtime changes to pipeline connections are not allowed while pipeline is active. It is however allowed to connect the DAI to sink/sources on another pipeline, so this scenario is still possible. This issue only affects IPC4 builds using copier. In current code, disconnection of local_buffer will lead to heap corruption as dd->local_buffer is freed but DAI keeps accessing the buffer in its copy calls. Fix the issue by implementing an explicit DAI unbind and setting dd->local_buffer to NULL in case the peer module is disconnected. This patch does not add support for dynamic reconfiguration of the DAI to adjust converters and local_buffer to the new connections. If capture DAI is set up with multiple sinks and one of them is stopped, the DAI will stop processing data. Link: thesofproject#7191 Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
In IPC4 mode, the DAI supports writing data to multiple sinks. It is possible one of the sinks to be on a different pipeline, which can be disconnected while the DAI is still running. Depending on the order of pipeline setup, the disconnected sink might have been chosen as the local_buffer. Handle this case by reconfiguring the local buffer after disconnect in the copy function. Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
c6867ee to
0e006ad
Compare
|
V2:
|
|
CI results for V2. Looking at the Intel internal CI system fail at https://sof-ci.01.org/sof-pr-viewer/#/build/PR7660/build12047489 -> not sure of rootcause yet. cAVS2.5 Intel tests look good (all pass), but one fail in https://sof-ci.01.org/sofpr/PR7660/build8330/devicetest . Latter seems a bit odd as there is no error in either dmesg or FW mtrace. arecord returns a non-zero status and multiple-pause-resume test fails, but this would seem like a different case. |
lyakh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just one minor potential improvement
| if (!dd->process) { | ||
| comp_err(dev, "converter function NULL: local fmt %d dma fmt %d\n", | ||
| local_fmt, dma_fmt); | ||
| dd->local_buffer = NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we use a temporary variable in lines 1473-1480 above and only assign dd->local_buffer after this check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lyakh Took a look at this, but actually I'd prefer not to. We know at this point already that the old value of "dd->local_buffer" is invalid and will be soon freed from the heap, so we want to erase in any case. I.e. even if no converter is found, we want to reset dd->local_buffer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, maybe I didn't explain well. This function is only called if (!dd->local_buffer) right? Then it is assigned a nun-NULL value either in line 1474 or in line 1478 above. But then we check if we can find a conversion function and if not - set that pointer to NULL again. Whereas we could simply avoid setting it to something we cannot use anyway completely. That way at no point of time would it be set to something useless. I.e.
struct comp_buffer *buf;
...
if (dev->direction == SOF_IPC_STREAM_PLAYBACK)
buf = list_first_item(&dev->bsource_list,
struct comp_buffer,
sink_list);
else
buf = list_first_item(&dev->bsink_list,
struct comp_buffer,
source_list);
local_buf = buffer_acquire(buf);
local_fmt = audio_stream_get_frm_fmt(&local_buf->stream);
buffer_release(local_buf);
dd->process = pcm_get_conversion_function(local_fmt, dma_fmt);
if (!dd->process)
comp_err(dev, "converter function NULL: local fmt %d dma fmt %d\n",
local_fmt, dma_fmt);
dd->local_buffer = buf;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, now I get your point @lyakh . Let me do a small fixup to this in separate PR given I got a green run on the CI and ok from reviewers and we can't of really need to have this fix asap in the tree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-> #7690
RanderWang
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
V3:
|
|
CI good, only one known case fails in https://sof-ci.01.org/sofpr/PR7660/build8330/devicetest/index.html . I'll proceed with merge but promise to fixup the local_buffer double-setting as per @lyakh 's comment. |
The dd->local_buffer cannot be assumed to be stable across calls to component copy function. Specifically in case DAI is connected to multiple sinks, it is possible that one of the sinks is disconnected while the DAI is still running. E.g. this happens with SOF sof-cavs-nocodec topology with a capture DAI feeding both a smart-amp (in playback pipeline) and a normal capture pipeline. If the playback pipeline with smart-amp is stopped, the dd->local_buffer will point to a freed buffer, leading to heap corruption.
Fix the issue by updating the local_buffer pointer at start of dai_copy().
Link: #7191