-
Notifications
You must be signed in to change notification settings - Fork 349
mixin-mixout: handle congestion corner case without assert #6900
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do I understand it correctly, that previous the assert() was checking, that in the current sink the number of frames still pending from previous copies wasn't "too large," specifically that it wasn't larger than the number of free frames over all sinks? Which is similar to saying, that no sink is more than half full (assuming they all have the same size buffers)? Now you replace that check with a different one, namely checking that the number of still pending frames isn't larger than what we want to copy now? Which might actually be even easier to hit - if your last transfer is small and you still have a few pending frames from the previous run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lyakh Correct. "frames_to_copy" already taken into account as "sinks_free_frames = MIN(sinks_free_frames, free_frames - pending_frames);" is done earlier in the function.
So old check assumes "start_frame" cannot ever be larger than sinks_free_frames. This is not correct and you either have undefined behaviour (this currently happens) or assert is hit (when we enable CONFIG_ASSERT=y).
My modified code takes account both how much we have leftovers in source, and how much we have space in output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kv2019i maybe I misunderstand... Can it not be that start_frame == 10, frames_to_copy == 5 just because we only want to copy 5 frames, because source_avail_frames == 5 and there's plenty of free space in the buffer, but here you decide that this is an error condition and set frames_to_copy to 10?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lyakh Boy, this took a long time. But thanks, you are right, my fix was not correct. Looking at this again, it seems the problem is with the assert and it's actually not needed. If "start_frame == 10" and "frames_to_copy == 5", the "sinks_free_frames" can still be "3", "5" or "10", or are valid values. The "start_frame" is specific to a single mixout buffer, while the other variables are calculcated as a minimum across all mixout buffers. So it's perfectly possible that a single mixout buffer has more already mixed data, than what can be written in this cycle by mixin.
So long story short, the assert is not correct. And as the code can handle this case gracefully, no actions are needed in its place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kv2019i mmh, almost. I think sinks_free_frames == 3 would be invalid in your above example, but that's ok because it cannot be that. IIUC in mixin copying method we look over all mixouts, connected to this mixin, check free space in each of them for us and select the minimum over them all so that we copy the same number of frames to all to make sure that the mixin itself has a consistent state - it has pushed out frames_to_copy frames to all mixouts and its number of available for copying frames is reduced by that number. But otherwise yes, that assertion does seem wrong to me too. Maybe it got here by confusion with https://github.com/thesofproject/sof/blob/a88ba51bb4c5cd6b7a984179d39d1b9110e70d94/src/audio/mixin_mixout/mixin_mixout.c#L368 where a similarly named free_frames variable means something different - it should include pending frames and therefore cannot be smaller than those.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right @lyakh , so what I mean that if "sinks_free_frames == 3", the existing code witll truncate frames_to_copy so it will be limited to 3 as well (even if "source_avail_frames == 5").
So we are good to go?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kv2019i well, yeah, we can commit this, but there's a bigger problem with mixin-mixout that I dug out while working on #6864 : #6864 (comment) Any ideas?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh that #6864 is interesting. But let's continue over there. I'd rather get this merged soon so we can enable assert and gets the "gates up" in our CI.
lgirdwood
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kv2019i do we need to pass anything up the stack for host to take corrective action ?
|
@lgirdwood wrote:
Not sure how to do this as the module really doesn't know why data flow is abnormal. The assert is mostly hit in pause-resume tests with lot of stream state transitions, so these are likely cases where the stream is anyways terminated. At least no failures are seen with this patch. At least this is better than running the process() with invalid parameters (start_frames > space in sinks). @ranj063 is working on the #6836 , but does not seem to touch this part specifically. |
|
@lgirdwood I need to spend more time and respond to @lyakh 's concern. Let's not merge before that. |
|
Can one of the admins verify this patch? |
18c0009 to
a88ba51
Compare
|
V2 (Feb 3rd):
|
If one mixout (A) is not able to process data, the 'sinks_free_frames' will be truncated compared other mixout buffers. When the mixin process is producing data to another mixout buffer (B), the number of already mixed frames may be larger than 'sinks_free_frames'. Further it is possible this mixout (B) has pending frames, leading to a larger 'start_frame' value. This is a valid and possible scenario, but will trigger assert(sinks_free_frames >= start_frame) in current code. As 'frames_to_copy' is already correctly calculated as limited to available free frames across all mixouts, no extra logic is needed when this condition happens. Mixin is not able to proceed until the congested mixout (A) has room again for further samples. Link: thesofproject#6896 Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
a88ba51 to
5452607
Compare
|
V3 (Feb 7th):
|
If mixout is not able to process data, the "assert(sinks_free_frames >= start_frame)" condition is hit.
This is not correct usage of asserts as the condition may be hit due to performance reasons (and is frequently hit in pause-resume stress test with sof-test), and production builds will have asserts disabled, leading to undefined behaviour.
Instead of assert, add a check, emit a warning and truncate the amount of copied frames, to ensure deterministic behaviour in all conditions.
Link: #6896
Signed-off-by: Kai Vehmanen kai.vehmanen@linux.intel.com