Skip to content

Conversation

@kv2019i
Copy link
Collaborator

@kv2019i kv2019i commented May 19, 2023

The dd->local_buffer cannot be assumed to be stable across calls to component copy function. Specifically in case DAI is connected to multiple sinks, it is possible that one of the sinks is disconnected while the DAI is still running. E.g. this happens with SOF sof-cavs-nocodec topology with a capture DAI feeding both a smart-amp (in playback pipeline) and a normal capture pipeline. If the playback pipeline with smart-amp is stopped, the dd->local_buffer will point to a freed buffer, leading to heap corruption.

Fix the issue by updating the local_buffer pointer at start of dai_copy().

Link: #7191

kv2019i added 2 commits May 23, 2023 12:54
The dd->local_buffer may be freed between calls to copy if
the peer module is unbound. This can only happen if the peer
module is on a different pipeline as runtime changes to
pipeline connections are not allowed while pipeline is active.
It is however allowed to connect the DAI to sink/sources
on another pipeline, so this scenario is still possible. This
issue only affects IPC4 builds using copier.

In current code, disconnection of local_buffer will lead
to heap corruption as dd->local_buffer is freed but DAI keeps
accessing the buffer in its copy calls.

Fix the issue by implementing an explicit DAI unbind and
setting dd->local_buffer to NULL in case the peer module is
disconnected.

This patch does not add support for dynamic reconfiguration
of the DAI to adjust converters and local_buffer to the new
connections. If capture DAI is set up with multiple sinks and
one of them is stopped, the DAI will stop processing data.

Link: thesofproject#7191
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
In IPC4 mode, the DAI supports writing data to multiple sinks.
It is possible one of the sinks to be on a different pipeline,
which can be disconnected while the DAI is still running.

Depending on the order of pipeline setup, the disconnected sink
might have been chosen as the local_buffer.

Handle this case by reconfiguring the local buffer after
disconnect in the copy function.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
@kv2019i kv2019i force-pushed the 202305-daizephyr-localbuffer-fix branch from c6867ee to 0e006ad Compare May 23, 2023 09:57
@kv2019i
Copy link
Collaborator Author

kv2019i commented May 23, 2023

V2:

  • implemented unbind and support for dynamic reconfiguration of local_buffer
  • based on @ranj063 's suggestion, also tried to remove local_buffer, but that kept leading to larger rework as local_buffer is also used in dai_zephyr_copy()
  • may conflict with copier: preparation for copier module adapter interface #7598 but we do need to fix this bug first

@kv2019i kv2019i requested review from btian1, lyakh and ranj063 May 23, 2023 10:00
@kv2019i
Copy link
Collaborator Author

kv2019i commented May 23, 2023

CI results for V2. Looking at the Intel internal CI system fail at https://sof-ci.01.org/sof-pr-viewer/#/build/PR7660/build12047489 -> not sure of rootcause yet. cAVS2.5 Intel tests look good (all pass), but one fail in https://sof-ci.01.org/sofpr/PR7660/build8330/devicetest . Latter seems a bit odd as there is no error in either dmesg or FW mtrace. arecord returns a non-zero status and multiple-pause-resume test fails, but this would seem like a different case.

Copy link
Collaborator

@lyakh lyakh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just one minor potential improvement

if (!dd->process) {
comp_err(dev, "converter function NULL: local fmt %d dma fmt %d\n",
local_fmt, dma_fmt);
dd->local_buffer = NULL;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use a temporary variable in lines 1473-1480 above and only assign dd->local_buffer after this check?

Copy link
Collaborator Author

@kv2019i kv2019i May 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lyakh Took a look at this, but actually I'd prefer not to. We know at this point already that the old value of "dd->local_buffer" is invalid and will be soon freed from the heap, so we want to erase in any case. I.e. even if no converter is found, we want to reset dd->local_buffer.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, maybe I didn't explain well. This function is only called if (!dd->local_buffer) right? Then it is assigned a nun-NULL value either in line 1474 or in line 1478 above. But then we check if we can find a conversion function and if not - set that pointer to NULL again. Whereas we could simply avoid setting it to something we cannot use anyway completely. That way at no point of time would it be set to something useless. I.e.

	struct comp_buffer *buf;
	...
	if (dev->direction == SOF_IPC_STREAM_PLAYBACK)
		buf = list_first_item(&dev->bsource_list,
						   struct comp_buffer,
						   sink_list);
	else
		buf = list_first_item(&dev->bsink_list,
						   struct comp_buffer,
						   source_list);

	local_buf = buffer_acquire(buf);
	local_fmt = audio_stream_get_frm_fmt(&local_buf->stream);
	buffer_release(local_buf);

	dd->process = pcm_get_conversion_function(local_fmt, dma_fmt);

	if (!dd->process)
		comp_err(dev, "converter function NULL: local fmt %d dma fmt %d\n",
			 local_fmt, dma_fmt);

	dd->local_buffer = buf;

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, now I get your point @lyakh . Let me do a small fixup to this in separate PR given I got a green run on the CI and ok from reviewers and we can't of really need to have this fix asap in the tree.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-> #7690

Copy link
Collaborator

@RanderWang RanderWang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kv2019i
Copy link
Collaborator Author

kv2019i commented May 24, 2023

V3:

  • rebase only, no changes to code

@kv2019i
Copy link
Collaborator Author

kv2019i commented May 25, 2023

CI good, only one known case fails in https://sof-ci.01.org/sofpr/PR7660/build8330/devicetest/index.html .

I'll proceed with merge but promise to fixup the local_buffer double-setting as per @lyakh 's comment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants