Skip to content

Conversation

@kv2019i
Copy link
Collaborator

@kv2019i kv2019i commented Jan 2, 2023

If mixout is not able to process data, the "assert(sinks_free_frames >= start_frame)" condition is hit.

This is not correct usage of asserts as the condition may be hit due to performance reasons (and is frequently hit in pause-resume stress test with sof-test), and production builds will have asserts disabled, leading to undefined behaviour.

Instead of assert, add a check, emit a warning and truncate the amount of copied frames, to ensure deterministic behaviour in all conditions.

Link: #6896
Signed-off-by: Kai Vehmanen kai.vehmanen@linux.intel.com

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I understand it correctly, that previous the assert() was checking, that in the current sink the number of frames still pending from previous copies wasn't "too large," specifically that it wasn't larger than the number of free frames over all sinks? Which is similar to saying, that no sink is more than half full (assuming they all have the same size buffers)? Now you replace that check with a different one, namely checking that the number of still pending frames isn't larger than what we want to copy now? Which might actually be even easier to hit - if your last transfer is small and you still have a few pending frames from the previous run.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lyakh Correct. "frames_to_copy" already taken into account as "sinks_free_frames = MIN(sinks_free_frames, free_frames - pending_frames);" is done earlier in the function.

So old check assumes "start_frame" cannot ever be larger than sinks_free_frames. This is not correct and you either have undefined behaviour (this currently happens) or assert is hit (when we enable CONFIG_ASSERT=y).

My modified code takes account both how much we have leftovers in source, and how much we have space in output.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kv2019i maybe I misunderstand... Can it not be that start_frame == 10, frames_to_copy == 5 just because we only want to copy 5 frames, because source_avail_frames == 5 and there's plenty of free space in the buffer, but here you decide that this is an error condition and set frames_to_copy to 10?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lyakh Boy, this took a long time. But thanks, you are right, my fix was not correct. Looking at this again, it seems the problem is with the assert and it's actually not needed. If "start_frame == 10" and "frames_to_copy == 5", the "sinks_free_frames" can still be "3", "5" or "10", or are valid values. The "start_frame" is specific to a single mixout buffer, while the other variables are calculcated as a minimum across all mixout buffers. So it's perfectly possible that a single mixout buffer has more already mixed data, than what can be written in this cycle by mixin.

So long story short, the assert is not correct. And as the code can handle this case gracefully, no actions are needed in its place.

Copy link
Collaborator

@lyakh lyakh Feb 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kv2019i mmh, almost. I think sinks_free_frames == 3 would be invalid in your above example, but that's ok because it cannot be that. IIUC in mixin copying method we look over all mixouts, connected to this mixin, check free space in each of them for us and select the minimum over them all so that we copy the same number of frames to all to make sure that the mixin itself has a consistent state - it has pushed out frames_to_copy frames to all mixouts and its number of available for copying frames is reduced by that number. But otherwise yes, that assertion does seem wrong to me too. Maybe it got here by confusion with https://github.com/thesofproject/sof/blob/a88ba51bb4c5cd6b7a984179d39d1b9110e70d94/src/audio/mixin_mixout/mixin_mixout.c#L368 where a similarly named free_frames variable means something different - it should include pending frames and therefore cannot be smaller than those.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right @lyakh , so what I mean that if "sinks_free_frames == 3", the existing code witll truncate frames_to_copy so it will be limited to 3 as well (even if "source_avail_frames == 5").

So we are good to go?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kv2019i well, yeah, we can commit this, but there's a bigger problem with mixin-mixout that I dug out while working on #6864 : #6864 (comment) Any ideas?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that #6864 is interesting. But let's continue over there. I'd rather get this merged soon so we can enable assert and gets the "gates up" in our CI.

Copy link
Member

@lgirdwood lgirdwood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kv2019i do we need to pass anything up the stack for host to take corrective action ?

@kv2019i
Copy link
Collaborator Author

kv2019i commented Jan 4, 2023

@lgirdwood wrote:

@kv2019i do we need to pass anything up the stack for host to take corrective action ?

Not sure how to do this as the module really doesn't know why data flow is abnormal. The assert is mostly hit in pause-resume tests with lot of stream state transitions, so these are likely cases where the stream is anyways terminated. At least no failures are seen with this patch. At least this is better than running the process() with invalid parameters (start_frames > space in sinks).

@ranj063 is working on the #6836 , but does not seem to touch this part specifically.

@lgirdwood
Copy link
Member

@lyakh good for you ?
@kv2019i do we need any other PRs to be merged before this will pass CI or is the failure unexpected ?

@kv2019i kv2019i changed the title mixin-mixout: handle congestion corner case without assert [DNM] mixin-mixout: handle congestion corner case without assert Jan 11, 2023
@kv2019i
Copy link
Collaborator Author

kv2019i commented Jan 11, 2023

@lgirdwood I need to spend more time and respond to @lyakh 's concern. Let's not merge before that.

@gkbldcig
Copy link
Collaborator

Can one of the admins verify this patch?

@kv2019i kv2019i force-pushed the 202301-mixin-assert-fix branch from 18c0009 to a88ba51 Compare February 3, 2023 17:50
@kv2019i kv2019i changed the title [DNM] mixin-mixout: handle congestion corner case without assert mixin-mixout: handle congestion corner case without assert Feb 3, 2023
@kv2019i kv2019i requested a review from lyakh February 3, 2023 17:51
@kv2019i
Copy link
Collaborator Author

kv2019i commented Feb 3, 2023

V2 (Feb 3rd):

  • modified patch based on review feedback
  • it turns out the assert condition is in fact incorrect and can simply be removed

If one mixout (A) is not able to process data, the 'sinks_free_frames'
will be truncated compared other mixout buffers. When the mixin process
is producing data to another mixout buffer (B), the number of already
mixed frames may be larger than 'sinks_free_frames'. Further it is
possible this mixout (B) has pending frames, leading to a larger
'start_frame' value.

This is a valid and possible scenario, but will trigger
  assert(sinks_free_frames >= start_frame)
in current code.

As 'frames_to_copy' is already correctly calculated as limited to
available free frames across all mixouts, no extra logic is needed when
this condition happens. Mixin is not able to proceed until the congested
mixout (A) has room again for further samples.

Link: thesofproject#6896
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
@kv2019i kv2019i force-pushed the 202301-mixin-assert-fix branch from a88ba51 to 5452607 Compare February 7, 2023 11:47
@kv2019i
Copy link
Collaborator Author

kv2019i commented Feb 7, 2023

V3 (Feb 7th):

  • no functional changes, just plain rebase

@kv2019i kv2019i merged commit ad9f659 into thesofproject:main Feb 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants