Skip to content

Conversation

@fredoh9
Copy link
Collaborator

@fredoh9 fredoh9 commented Mar 14, 2019

When CONFIG_SND_SOC_SOF_DEBUG_FORCE_IPC_POSITION is enabled, always provide
pcm status to alsa from IPC context. This will override no_period_wakeup
mode for debugging purpose.

This is the feature description in Kconfig

CONFIG_SND_SOC_SOF_DEBUG_FORCE_IPC_POSITION
This option force to handle stream position update IPCs and run pcm elapse to inform ALSA about that.

And this is original PR for this feature.
#123

@fredoh9
Copy link
Collaborator Author

fredoh9 commented Mar 14, 2019

fixes #616

@plbossart
Copy link
Member

Sorry, I don't think it makes sense to break the contract established with the application. If this field is set, we should disable the no-interrupt mode instead. the net result is the same but the application will not wonder why it receives periodic interrupts when it was agreed there would be no interrupts.

@fredoh9
Copy link
Collaborator Author

fredoh9 commented Mar 14, 2019

@plbossart, OK, disabling no-interrupt mode makes sense and the net result is same with my proposal. Will re-visit my PR.

@fredoh9
Copy link
Collaborator Author

fredoh9 commented Mar 15, 2019

for CI error, i don't see real error. Any way I need to revise the PR. Curious how I can reset CI and rerun.

The job exceeded the maximum time limit for jobs, and has been terminated.

@ranj063
Copy link
Collaborator

ranj063 commented Mar 15, 2019

@plbossart @fredoh9 just out of curiosity. how exactly do we disable no_period_wakeup? Would we set this during pcm_open? we seem to set the hw_params flag SNDRV_PCM_INFO_NO_PERIOD_WAKEUP today. Is this what should be dropped if FORCE_IPC_POSITION is enabled in kconfig?

@fredoh9
Copy link
Collaborator Author

fredoh9 commented Mar 15, 2019

I need to look at player and driver in both side. There is sw_params, like snd_pcm_sw_params_set_period_event() from user space. I saw we set no_period_wakeup of stream from hw_params. I will figure out soon.

@plbossart
Copy link
Member

You want to change the .info where it's advertised that the nowake-up is supported.

@plbossart
Copy link
Member

@plbossart
Copy link
Member

And probably rename the Kconfig option as _DISABLE_NOWAKEUP.

@fredoh9 fredoh9 force-pushed the fix/force_ipc_position_feature branch from deccf1e to 5c2ace0 Compare March 15, 2019 18:52
@fredoh9
Copy link
Collaborator Author

fredoh9 commented Mar 15, 2019

I have no idea about Build failure.

sof-ci/jenkins/pr-build — Kernel Build failed!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fredoh9 this can be condensed a little bit. u can set runtime->hw_info to everything but SNDRV_PCM_INFO_NO_PERIOD_WAKEUP outside the if.
And in the if you can do runtime->hw.info |= SNDRV_PCM_INFO_NO_PERIOD_WAKEUP

Copy link
Collaborator

@ranj063 ranj063 Mar 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and the commits should probably be squashed

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ranj063 Yes! thank you for good comment. Will do that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fredoh9
please remove it from outside if
#if !IS_ENABLE()
runtime->hw.info |= NO_PERIOD_WAKEUP
#endif

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, i missed this comment. I thought removing flag when the debugging feature is enabled also good idea. but setting and clearing in next line may cause confusing also. I agree. Will re-push after fixing this.

@fredoh9
Copy link
Collaborator Author

fredoh9 commented Mar 16, 2019

Found a problem! Due to ci build error(I don't think this is real error), I built and tested again with/without SND_SOC_SOF_DEBUG_DISABLE_NOWAKEUP.
I removed this as I thought this is not required any more. Without this DSP send posn update every time. I need to restore that back.

 -	/* set host_period_bytes to 0 if no IPC position */	
 -	if (hda && hda->no_ipc_position)	
 -		ipc_params->host_period_bytes = 0;

@fredoh9 fredoh9 force-pushed the fix/force_ipc_position_feature branch from 5c2ace0 to 418d6e4 Compare March 16, 2019 01:26
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not related to this patch directly, but I found this a bit confusing. Here ipc_nowakeup is set based on pcm_pointer() availability, but then e.g. in hda_dsp_pcm_pointer() 'ipc_nowakeup' is checked again and there are two codepaths. There seems to be duplicate logic in either of the places. What do you think?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this feature is not clear and some part is confusing. I understand your point.
But as per your question, ipc_nowakeup is 0 when SND_SOC_SOF_DEBUG_DISABLE_NOWAKEUP is enabled. ipc_nowakeup in hda_dsp_pcm_pointer() has to be checked again.

Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good otherwise. Could you update the long help description as well?

@fredoh9 fredoh9 force-pushed the fix/force_ipc_position_feature branch from 418d6e4 to 46ff1e8 Compare March 18, 2019 18:12
@fredoh9 fredoh9 requested a review from kv2019i March 18, 2019 18:19
@fredoh9
Copy link
Collaborator Author

fredoh9 commented Mar 18, 2019

@plbossart @ranj063 @keyonjie @kv2019i can you review again? If you have any comment let me know.

@fredoh9 fredoh9 force-pushed the fix/force_ipc_position_feature branch from 46ff1e8 to e88e1f2 Compare March 18, 2019 19:45
@fredoh9
Copy link
Collaborator Author

fredoh9 commented Mar 18, 2019

Last change I made is in pcm.c to set hw.info. Condition is changed to "!IS_ENABLE()" per Ranj's comment.

@fredoh9
Copy link
Collaborator Author

fredoh9 commented Mar 19, 2019

CI error is timeout when it builds for SST

The job exceeded the maximum time limit for jobs, and has been terminated.

Copy link
Member

@plbossart plbossart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fredoh9 you've renamed things but kept a lot of existing logic that can be simplified further.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see here as well, we have a complicated mix of hda->ipc_nowakeup and stream->no_period_wakeup. This makes no sense, we should really get rid of all the hda->ipc_nowakeup.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a conceptual issue now with having the hda structure carry information about the nowakeup when it should really be a stream operation. the rules should be:
a) the driver does not support nowakeup on any streams
b) the driver supports nowakeup on the streams requested by the application.
This structure based on hda->ipc_nowakeup makes no sense to me.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i replied right after your comment, But it showed pending and gone.
I'm reviewing this feature again. will come up with the points you brought.

kv2019i
kv2019i previously approved these changes Mar 19, 2019
Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems good to me now, thanks!

@kv2019i kv2019i dismissed their stale review March 19, 2019 16:06

Newbie mistake, Kconfig commit is good, I need more time to review the other one (after Pierre's comments).

@fredoh9 fredoh9 force-pushed the fix/force_ipc_position_feature branch 3 times, most recently from b356676 to b54bd53 Compare April 6, 2019 00:04
fredoh9 added 2 commits April 5, 2019 17:17
Setting a stream's info to SNDRV_PCM_INFO_NO_PERIOD_WAKEUP determines
whether the driver supports no period wakeups or not. An application can
request for no_period_wakeup by setting the stream's flags to
SNDRV_PCM_HW_PARAMS_NO_PERIOD_WAKEUP.

In the SOF driver, when both the driver info and app's flags have
the NO_PERIOD_WAKEUP set, the host_period_bytes member of the stream
params ipc structure is set to 0 to indicate to the DSP to not send
period position updates. When either of these are not set, the DSP will be
forced to send periodic position updates.

For SKL+ platforms, no wakeup mode can be set by application. Period
position updates can be forced by setting
the CONFIG_SND_SOC_SOF_DEBUG_FORCE_ENABLE_WAKEUP if HDAC DPIB register
is not ready or posn debugging is required.

For SKL- platforms, DPIB is not available. The App need to enable
wakeup mode when period update is required. This feature helps to debug
the period update without help of app's flag.

Signed-off-by: Fred Oh <fred.oh@linux.intel.com>
SND_SOC_SOF_DEBUG_FORCE_IPC_POSITION didn't represent this feature well.
Rename to SND_SOC_SOF_DEBUG_FORCE_ENABLE_WAKEUP. If this debug feature is
set, it forcely enable wakeup mode. The dsp will send period position
update and the sof driver will invoke snd_pcm_period_elapse() at every
period data elapsed. Enable this when debug wakeup mode or DPIB is not
available.

Signed-off-by: Fred Oh <fred.oh@linux.intel.com>
@fredoh9 fredoh9 force-pushed the fix/force_ipc_position_feature branch from b54bd53 to f98cd03 Compare April 6, 2019 00:22
@keyonjie
Copy link

keyonjie commented Apr 8, 2019

I don't agree with the following:

  1. when apps run in irq mode, driver can choose to invoke snd_pcm_period_elapsed() at every period data elapsed, and, can choose to not invoke that also!
    when apps run in no_irq mode, driver should NOT invoke snd_pcm_period_elapsed() at every period data elapsed!

It should be:
when apps run in irq mode, driver SHALL invoke snd_pcm_period_elapsed() at every period data elapsed

I am not quite sure about this, that's why I listed with "per my understanding", I said that because I thought if we have .pointer() to provide position query at any time, maybe there are already enough information(without period interrupt) for ALSA to handle it correctly?

when apps run in no_irq mode, driver SHALL NOT invoke snd_pcm_period_elapsed() at every period data elapsed!

In addition whether we use no_irq mode or not, the application may request the position at any point of time using e.g. snd_pcm_status(). If the driver is only capable of providing a precise position at the end of a period then it SHALL use the INFO_BATCH flag.

Good to know about this, Thanks @plbossart

@keyonjie
Copy link

keyonjie commented Apr 8, 2019

It should be:
when apps run in irq mode, driver SHALL invoke snd_pcm_period_elapsed() at every period data elapsed
when apps run in no_irq mode, driver SHALL NOT invoke snd_pcm_period_elapsed() at every period data elapsed!

Ack on this. This is a key aspect of ALSA behaviour to the application. In no_irq, the application has some other means to wake-up the CPU and get scheduled (i.e. pulseaudio uses timers). In irq mode, the application is specifically asking ALSA to wake it up and get scheduled at given period intervals (e.g. JACK and many other apps like aplay depend on this).

Good to know, thanks for sharing @kv2019i

@keyonjie
Copy link

keyonjie commented Apr 8, 2019

no_ipc_position

Until here, the config item _FORCE_IPC_POSITION has no direct relationship with no_irq mode.

it does:

	ioc = hda->no_ipc_position ?
	      !stream->no_period_wakeup : 0;

In the case where we have hda->no_ipc_position == 0 and stream->no_period_wakeup==0, you will set ioc to 0, which means no period interrupt when the application expects one.

@plbossart we have 2 places for period interrupt(snd_pcm_period_elapse invoking), one is ioc and its interrupt handling in hda_dsp_stream_threaded_handler(), another one is ipc_period_elapsed(), here with your example, though ioc is set to 0 we don't have the former invoking, but we have the latter one as no_ipc_position is 0, so we still have period interrupt here to meet the application's expectation.

I admit here it is not good to provide position from DPIB but interrupt and elapse from position IPC, that's why I think item #2 is important(not implemented yet) in the to-do list:

2. change in FW side, always update stream position to stream_box(or dsp_box).
then the fallback pointer in the 1st item above can always get the stream position.

with this #2 implemented, we will have chooses for:

  1. position -- from mailbox or DPIB/posbuf;
  2. interrupt & elapse -- from position IPC or ioc.
    we can combine them together(something like "coupled mode"), e.g. _NO_DPIB_IOC, once item is equal to Y, we will use mailbox position and position IPC together, vice versa, use DPIB and ioc.

@keyonjie
Copy link

keyonjie commented Apr 8, 2019

/*

  • set IOC only when no_irq mode is chose, pcm elapse is selected, and position IPC is not used.
    */
    ioc = !stream->no_period_wakeup && !dev->no_pcm_elapse && hda->no_position_ipc;

NAK. The no_period_wakeup is a contract with the application that we SHALL NOT change because of debug options. The DMA interrupt needs to be completely and uniquely controlled by the stream->no_period_wakeup.

The only allowed path is to disable the no_period_wakeup capability by removing the INFO_NOWAKE_UP when the debug option is selected.

As @plbossart and @kv2019i shared above, no_pcm_elapse (the optional item #3 in my to-do list above) is no need, so it should be:

ioc = !stream->no_period_wakeup && hda->no_position_ipc;

@kv2019i
Copy link
Collaborator

kv2019i commented Apr 8, 2019

@keyonjie wrote:

As @plbossart and @kv2019i shared above, no_pcm_elapse (the optional item #3 in my to-do list above) is no need, so it should be:

ioc = !stream->no_period_wakeup && hda->no_position_ipc;

Yes, I think this is right. Thanks Keyon for explaining that with position updates enabled the ALSA elapsed is called from a different place than with IOC=1, so with that this sounds good. I would add a comment about this to the code though, because this is easy to interpret get wrong. But for actual logic, isn't the above equivalent to already in code:
'''
ioc = hda->no_ipc_position ? !stream->no_period_wakeup : 0;
''''

@plbossart
Copy link
Member

@keyonjie I have an alternate proposal. Let's just remove this debug functionality for now, and add a PR where you clearly indicate what you want to do. it feels like this capability was added without a clear intent, changes the contract with the application and assumes a behavior from firmware that's not present. That's just too many opens. It's not fair to ask @fredoh9 to fix something that's not properly defined.

@fredoh9
Copy link
Collaborator Author

fredoh9 commented Apr 8, 2019

@plbossart @keyonjie , i can remove this feature within this PR. I can keep all of my changes except the CONFIG part.

One behavior change with my change is this. no_ipc_position is 0 in SKL- platform no matter what.
hdev->no_ipc_position = sof_ops(sdev)->pcm_pointer ? 1 : 0;
but with this PR, it is replaced with 'no_period_wakeup' which is set by app's flags.
no_period_wakeup

It looks to me right change and right behavior. If no comment further, i can re-submit this PR after removing this debug feature.

@wenqingfu
Copy link

1. after DPIB and IOC enabled later, we have one more choice, and using DPIB/IOC is the preferred mode, so we introduced _FORCE_IPC_POSITION option for (IPC mode) debug purpose, and set it to default N.

After reading through the long conversation, I still don't understand why it's called a DEBUG thing in the first place anyway, i.e. what it tried to debug... Did you see wrong behavior of DPIB/IOC, and thus want to go back to old school IPC position update to debug or workaround it? If not, I feel it's an overcooked "feature". If so, why not just call it something like DISABLE_DPIB to make the logic straight?

@keyonjie
Copy link

keyonjie commented Apr 9, 2019

@keyonjie wrote:

As @plbossart and @kv2019i shared above, no_pcm_elapse (the optional item #3 in my to-do list above) is no need, so it should be:
ioc = !stream->no_period_wakeup && hda->no_position_ipc;

Yes, I think this is right. Thanks Keyon for explaining that with position updates enabled the ALSA elapsed is called from a different place than with IOC=1, so with that this sounds good. I would add a comment about this to the code though, because this is easy to interpret get wrong. But for actual logic, isn't the above equivalent to already in code:
'''
ioc = hda->no_ipc_position ? !stream->no_period_wakeup : 0;
''''

yes, the original logic here is correct.
since #3 in my to-do list is no need(then the same for #4), this simplify for us a lot, only #1 and #2 left.

@keyonjie
Copy link

keyonjie commented Apr 9, 2019

1. after DPIB and IOC enabled later, we have one more choice, and using DPIB/IOC is the preferred mode, so we introduced _FORCE_IPC_POSITION option for (IPC mode) debug purpose, and set it to default N.

After reading through the long conversation, I still don't understand why it's called a DEBUG thing in the first place anyway, i.e. what it tried to debug... Did you see wrong behavior of DPIB/IOC, and thus

position IPC was born earlier(in our byt time even before SOF's born), and on SOF SKL+, we implement DPIB/IOC quite late(#123, at Sep. 2018), so historically DPIB/IOC is a replacer of position IPC in SOF for SKL+, we took steady step at the beginning of the replacement happens with keeping this fallback debug solution as there might(and actually did) be regression with DPIB/IOC.

Though DPIB/IOC looks working quite well now so removing this flag is actually not big deal to me, considering that it will provide consistency for SKL+ platforms with SKL- ones(results on SKL- and SKL+ are both run with position IPC, more comparable results and values), I still hope we can keep it for providing one more choice. If we have issues with this fallback choice, we can reduce the priority of the issue, but should fix it.

want to go back to old school IPC position update to debug or workaround it? If not, I feel it's an overcooked "feature". If so, why not just call it something like DISABLE_DPIB to make the logic straight?

yes, this(renaming it) is actually the similar task in the 1st item of my to-do list:

1. add config item _USE_DIPB_POINTER(default Y), and cleanup the .pointer() callback
in sof_pcm_pointer():

@keyonjie
Copy link

keyonjie commented Apr 9, 2019

@keyonjie I have an alternate proposal. Let's just remove this debug functionality for now, and add a PR where you clearly indicate what you want to do. it feels like this capability was added without a clear intent, changes the contract with the application and assumes a behavior from firmware that's not

um, as I replied to @wenqingfu above, we did have intention for this debug functionality, DPIB/IOC is quite platform specific feature, and mailbox position is the generic solution for all platforms(even non-Intel ones), IMO, reserving this choice of the fallback generic solution on SKL+ platforms and make it works as it do on other platforms has no harm at least.

present. That's just too many opens. It's not fair to ask @fredoh9 to fix something that's not properly defined.

Removing the debug functionality actually might hide some potential issue, so I don't prefer that.

Now actually only 2 items left in to-do list:

  1. rename the config item to something like _DISABLE_DPIB(default N), and cleanup the .pointer() callback
    in sof_pcm_pointer():
    ++ #if !IS_ENABLED(CONFIG_SND_SOC_SOF_DISABLE_DPIB)
    /* if have dsp ops DPIB pointer callback, use that directly */
    if (sof_ops(sdev)->pcm_pointer) //Keyon: maybe better to rename 'pcm_pointer' to 'dpib_pointer'
    return sof_ops(sdev)->pcm_pointer(sdev, substream);
    ++ #endif
    and make sure when DPIB is disabled, don't use IOC and DPIB.

  2. change in FW side, always update stream position to stream_box(or dsp_box).
    then the fallback pointer in the 1st item above can always get the stream position.

The #2 is important, it could benefit to SKL- platform also.

@plbossart
Copy link
Member

plbossart commented Apr 9, 2019 via email

@keyonjie
Copy link

keyonjie commented Apr 10, 2019

On 4/8/19 9:05 PM, Keyon wrote: @keyonjie https://github.com/keyonjie I have an alternate proposal. Let's just remove this debug functionality for now, and add a PR where you clearly indicate what you want to do. it feels like this capability was added without a clear intent, changes the contract with the application and assumes a behavior from firmware that's not um, as I replied to @wenqingfu https://github.com/wenqingfu above, we did have intention for this debug functionality, DPIB/IOC is quite platform specific feature, and mailbox position is the generic solution for all platforms(even non-Intel ones), IMO, reserving this choice of the fallback generic solution on SKL+ platforms and make it works as it do on other platforms has no harm at least.
Actually we know that DPIB doesn't report the accurate position for capture...look at the hdac library, the position is read from a register.

Are you talking about posbuf? we already do it like that in our .pointer().

present. That's just too many opens. It's not fair to ask @fredoh9 https://github.com/fredoh9 to fix something that's not properly defined. Removing the debug functionality actually might hide some potential issue, so I don't prefer that.
we don't enable this debug functionality in products or even in daily builds so what's the issue?
Now actually only 2 items left in to-do list: 1. rename the config item to something like _DISABLE_DPIB(default N), and cleanup the .pointer() callback in sof_pcm_pointer(): ++ #if !IS_ENABLED(CONFIG_SND_SOC_SOF_DISABLE_DPIB) /* if have dsp ops DPIB pointer callback, use that directly */ if (sof_ops(sdev)->pcm_pointer) //Keyon: maybe better to rename 'pcm_pointer' to 'dpib_pointer' return sof_ops(sdev)->pcm_pointer(sdev, substream); ++ #endif and make sure when DPIB is disabled, don't use IOC and DPIB.
NO! IOC has nothing to do with DPIB. IOC should only be set based on the no_wakeup flag.

If no other solution existed, yes, what ALSA really care is pcm_elapse, not the IOC itself.

  1. change in FW side, always update stream position to stream_box(or dsp_box). then the fallback pointer in the 1st item above can always get the stream position. The Enable tone generator - Part 1 #2 <Enable tone generator - Part 1 #2> is important, it could benefit to SKL- platform also.
    You are asking for a firmware change but there is no one working on this, so your recommendation is not a working proposition. If there is no solution there is no problem. Let's remove this.

We can remove it, but please be noticed that the same problem may happen on SKL- platforms where we can use mailbox position only.

@fredoh9
Copy link
Collaborator Author

fredoh9 commented Apr 10, 2019

I am able to test this PR on BYT Minnowboard. Default defconfig is used.
NULL pointer dereference error found. I can take care of SKL- platform as well.

[ 110.031427] sof-audio-acpi 80860F28:01: ipc rx: 0x600a0001: GLB_STREAM_MSG: POSITION
[ 110.031460] sof-audio-acpi 80860F28:01: posn : host 0x3f00 dai 0x7980 wall 0x17b9a1
[ 110.031485] BUG: unable to handle kernel NULL pointer dereference at 000000000000007c
[ 110.031500] #PF error: [normal kernel read fault]

Update: attached full dmesg
after.log

@keyonjie
Copy link

keyonjie commented Apr 10, 2019

I am able to test this PR on BYT Minnowboard. Default defconfig is used.
NULL pointer dereference error found. I can take care of SKL- platform as well.

[ 110.031427] sof-audio-acpi 80860F28:01: ipc rx: 0x600a0001: GLB_STREAM_MSG: POSITION
[ 110.031460] sof-audio-acpi 80860F28:01: posn : host 0x3f00 dai 0x7980 wall 0x17b9a1
[ 110.031485] BUG: unable to handle kernel NULL pointer dereference at 000000000000007c
[ 110.031500] #PF error: [normal kernel read fault]

Good, @fredoh9 thanks for taking care of that also.

@fredoh9 fredoh9 changed the title ASoC: SOF: fix debug ipc position feature [DO NOT MERGE] ASoC: SOF: fix debug ipc position feature Apr 11, 2019
@fredoh9
Copy link
Collaborator Author

fredoh9 commented Apr 11, 2019

After discussing with reviewers, fixing position update is bigger task than initial scope of the fix. There are platform difference as well. The solution should work in all platforms and all directions (playback/capture). We should think about accuracy and latency also. Eventually we need to prepare the documents about this solution.

It requires much background work, hence I will put this [DO NOT MERGE] for now.

@plbossart plbossart added the Unclear No agreement on problem statement and resolution label Jun 11, 2019
dragosht pushed a commit to dragosht/linux-sof that referenced this pull request Oct 10, 2019
Recent changes that removed rtnl dependency from rules update path of tc
also made tcf_block_put() function sleeping. This function is called from
ops->destroy() of several Qdisc implementations, which in turn is called by
qdisc_put(). Some Qdiscs call qdisc_put() while holding sch tree spinlock,
which results sleeping-while-atomic BUG.

Steps to reproduce for htb:

tc qdisc add dev ens1f0 root handle 1: htb default 12
tc class add dev ens1f0 parent 1: classid 1:1 htb rate 100kbps ceil 100kbps
tc qdisc add dev ens1f0 parent 1:1 handle 40: sfq perturb 10
tc class add dev ens1f0 parent 1:1 classid 1:2 htb rate 100kbps ceil 100kbps

Resulting dmesg:

[ 4791.148551] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:909
[ 4791.151354] in_atomic(): 1, irqs_disabled(): 0, pid: 27273, name: tc
[ 4791.152805] INFO: lockdep is turned off.
[ 4791.153605] CPU: 19 PID: 27273 Comm: tc Tainted: G        W         5.3.0-rc8+ thesofproject#721
[ 4791.154336] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
[ 4791.155075] Call Trace:
[ 4791.155803]  dump_stack+0x85/0xc0
[ 4791.156529]  ___might_sleep.cold+0xac/0xbc
[ 4791.157251]  __mutex_lock+0x5b/0x960
[ 4791.157966]  ? console_unlock+0x363/0x5d0
[ 4791.158676]  ? tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0
[ 4791.159395]  ? tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0
[ 4791.160103]  tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0
[ 4791.160815]  tcf_block_put_ext.part.0+0x21/0x50
[ 4791.161530]  tcf_block_put+0x50/0x70
[ 4791.162233]  sfq_destroy+0x15/0x50 [sch_sfq]
[ 4791.162936]  qdisc_destroy+0x5f/0x160
[ 4791.163642]  htb_change_class.cold+0x5df/0x69d [sch_htb]
[ 4791.164505]  tc_ctl_tclass+0x19d/0x480
[ 4791.165360]  rtnetlink_rcv_msg+0x170/0x4b0
[ 4791.166191]  ? netlink_deliver_tap+0x95/0x400
[ 4791.166907]  ? rtnl_dellink+0x2d0/0x2d0
[ 4791.167625]  netlink_rcv_skb+0x49/0x110
[ 4791.168345]  netlink_unicast+0x171/0x200
[ 4791.169058]  netlink_sendmsg+0x224/0x3f0
[ 4791.169771]  sock_sendmsg+0x5e/0x60
[ 4791.170475]  ___sys_sendmsg+0x2ae/0x330
[ 4791.171183]  ? ___sys_recvmsg+0x159/0x1f0
[ 4791.171894]  ? do_wp_page+0x9c/0x790
[ 4791.172595]  ? __handle_mm_fault+0xcd3/0x19e0
[ 4791.173309]  __sys_sendmsg+0x59/0xa0
[ 4791.174024]  do_syscall_64+0x5c/0xb0
[ 4791.174725]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 4791.175435] RIP: 0033:0x7f0aa41497b8
[ 4791.176129] Code: 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 65 8f 0c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 89 5
4
[ 4791.177532] RSP: 002b:00007fff4e37d588 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 4791.178243] RAX: ffffffffffffffda RBX: 000000005d8132f7 RCX: 00007f0aa41497b8
[ 4791.178947] RDX: 0000000000000000 RSI: 00007fff4e37d5f0 RDI: 0000000000000003
[ 4791.179662] RBP: 0000000000000000 R08: 0000000000000001 R09: 00000000020149a0
[ 4791.180382] R10: 0000000000404eda R11: 0000000000000246 R12: 0000000000000001
[ 4791.181100] R13: 000000000047f640 R14: 0000000000000000 R15: 0000000000000000

In htb_change_class() function save parent->leaf.q to local temporary
variable and put reference to it after sch tree lock is released in order
not to call potentially sleeping cls API in atomic section. This is safe to
do because Qdisc has already been reset by qdisc_purge_queue() inside sch
tree lock critical section.

Fixes: c266f64 ("net: sched: protect block state with mutex")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dragosht pushed a commit to dragosht/linux-sof that referenced this pull request Oct 10, 2019
Recent changes that removed rtnl dependency from rules update path of tc
also made tcf_block_put() function sleeping. This function is called from
ops->destroy() of several Qdisc implementations, which in turn is called by
qdisc_put(). Some Qdiscs call qdisc_put() while holding sch tree spinlock,
which results sleeping-while-atomic BUG.

Steps to reproduce for multiq:

tc qdisc add dev ens1f0 root handle 1: multiq
tc qdisc add dev ens1f0 parent 1:10 handle 50: sfq perturb 10
ethtool -L ens1f0 combined 2
tc qdisc change dev ens1f0 root handle 1: multiq

Resulting dmesg:

[ 5539.419344] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:909
[ 5539.420945] in_atomic(): 1, irqs_disabled(): 0, pid: 27658, name: tc
[ 5539.422435] INFO: lockdep is turned off.
[ 5539.423904] CPU: 21 PID: 27658 Comm: tc Tainted: G        W         5.3.0-rc8+ thesofproject#721
[ 5539.425400] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
[ 5539.426911] Call Trace:
[ 5539.428380]  dump_stack+0x85/0xc0
[ 5539.429823]  ___might_sleep.cold+0xac/0xbc
[ 5539.431262]  __mutex_lock+0x5b/0x960
[ 5539.432682]  ? tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0
[ 5539.434103]  ? __nla_validate_parse+0x51/0x840
[ 5539.435493]  ? tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0
[ 5539.436903]  tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0
[ 5539.438327]  tcf_block_put_ext.part.0+0x21/0x50
[ 5539.439752]  tcf_block_put+0x50/0x70
[ 5539.441165]  sfq_destroy+0x15/0x50 [sch_sfq]
[ 5539.442570]  qdisc_destroy+0x5f/0x160
[ 5539.444000]  multiq_tune+0x14a/0x420 [sch_multiq]
[ 5539.445421]  tc_modify_qdisc+0x324/0x840
[ 5539.446841]  rtnetlink_rcv_msg+0x170/0x4b0
[ 5539.448269]  ? netlink_deliver_tap+0x95/0x400
[ 5539.449691]  ? rtnl_dellink+0x2d0/0x2d0
[ 5539.451116]  netlink_rcv_skb+0x49/0x110
[ 5539.452522]  netlink_unicast+0x171/0x200
[ 5539.453914]  netlink_sendmsg+0x224/0x3f0
[ 5539.455304]  sock_sendmsg+0x5e/0x60
[ 5539.456686]  ___sys_sendmsg+0x2ae/0x330
[ 5539.458071]  ? ___sys_recvmsg+0x159/0x1f0
[ 5539.459461]  ? do_wp_page+0x9c/0x790
[ 5539.460846]  ? __handle_mm_fault+0xcd3/0x19e0
[ 5539.462263]  __sys_sendmsg+0x59/0xa0
[ 5539.463661]  do_syscall_64+0x5c/0xb0
[ 5539.465044]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 5539.466454] RIP: 0033:0x7f1fe08177b8
[ 5539.467863] Code: 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 65 8f 0c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 89 5
4
[ 5539.470906] RSP: 002b:00007ffe812de5d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 5539.472483] RAX: ffffffffffffffda RBX: 000000005d8135e3 RCX: 00007f1fe08177b8
[ 5539.474069] RDX: 0000000000000000 RSI: 00007ffe812de640 RDI: 0000000000000003
[ 5539.475655] RBP: 0000000000000000 R08: 0000000000000001 R09: 000000000182e9b0
[ 5539.477203] R10: 0000000000404eda R11: 0000000000000246 R12: 0000000000000001
[ 5539.478699] R13: 000000000047f640 R14: 0000000000000000 R15: 0000000000000000

Rearrange locking in multiq_tune() in following ways:

- In loop that removes Qdiscs from disabled queues, call
  qdisc_purge_queue() instead of qdisc_tree_flush_backlog() on Qdisc that
  is being destroyed. Save the Qdisc in temporary allocated array and call
  qdisc_put() on each element of the array after sch tree lock is released.
  This is safe to do because Qdiscs have already been reset by
  qdisc_purge_queue() inside sch tree lock critical section.

- Do the same change for second loop that initializes Qdiscs for newly
  enabled queues in multiq_tune() function. Since sch tree lock is obtained
  and released on each iteration of this loop, just call qdisc_put()
  directly outside of critical section. Don't verify that old Qdisc is not
  noop_qdisc before releasing reference to it because such check is already
  performed by qdisc_put*() functions.

Fixes: c266f64 ("net: sched: protect block state with mutex")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dragosht pushed a commit to dragosht/linux-sof that referenced this pull request Oct 10, 2019
Recent changes that removed rtnl dependency from rules update path of tc
also made tcf_block_put() function sleeping. This function is called from
ops->destroy() of several Qdisc implementations, which in turn is called by
qdisc_put(). Some Qdiscs call qdisc_put() while holding sch tree spinlock,
which results sleeping-while-atomic BUG.

Steps to reproduce for sfb:

tc qdisc add dev ens1f0 handle 1: root sfb
tc qdisc add dev ens1f0 parent 1:10 handle 50: sfq perturb 10
tc qdisc change dev ens1f0 root handle 1: sfb

Resulting dmesg:

[ 7265.938717] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:909
[ 7265.940152] in_atomic(): 1, irqs_disabled(): 0, pid: 28579, name: tc
[ 7265.941455] INFO: lockdep is turned off.
[ 7265.942744] CPU: 11 PID: 28579 Comm: tc Tainted: G        W         5.3.0-rc8+ thesofproject#721
[ 7265.944065] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
[ 7265.945396] Call Trace:
[ 7265.946709]  dump_stack+0x85/0xc0
[ 7265.947994]  ___might_sleep.cold+0xac/0xbc
[ 7265.949282]  __mutex_lock+0x5b/0x960
[ 7265.950543]  ? tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0
[ 7265.951803]  ? tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0
[ 7265.953022]  tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0
[ 7265.954248]  tcf_block_put_ext.part.0+0x21/0x50
[ 7265.955478]  tcf_block_put+0x50/0x70
[ 7265.956694]  sfq_destroy+0x15/0x50 [sch_sfq]
[ 7265.957898]  qdisc_destroy+0x5f/0x160
[ 7265.959099]  sfb_change+0x175/0x330 [sch_sfb]
[ 7265.960304]  tc_modify_qdisc+0x324/0x840
[ 7265.961503]  rtnetlink_rcv_msg+0x170/0x4b0
[ 7265.962692]  ? netlink_deliver_tap+0x95/0x400
[ 7265.963876]  ? rtnl_dellink+0x2d0/0x2d0
[ 7265.965064]  netlink_rcv_skb+0x49/0x110
[ 7265.966251]  netlink_unicast+0x171/0x200
[ 7265.967427]  netlink_sendmsg+0x224/0x3f0
[ 7265.968595]  sock_sendmsg+0x5e/0x60
[ 7265.969753]  ___sys_sendmsg+0x2ae/0x330
[ 7265.970916]  ? ___sys_recvmsg+0x159/0x1f0
[ 7265.972074]  ? do_wp_page+0x9c/0x790
[ 7265.973233]  ? __handle_mm_fault+0xcd3/0x19e0
[ 7265.974407]  __sys_sendmsg+0x59/0xa0
[ 7265.975591]  do_syscall_64+0x5c/0xb0
[ 7265.976753]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 7265.977938] RIP: 0033:0x7f229069f7b8
[ 7265.979117] Code: 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 65 8f 0c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 89 5
4
[ 7265.981681] RSP: 002b:00007ffd7ed2d158 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 7265.983001] RAX: ffffffffffffffda RBX: 000000005d813ca1 RCX: 00007f229069f7b8
[ 7265.984336] RDX: 0000000000000000 RSI: 00007ffd7ed2d1c0 RDI: 0000000000000003
[ 7265.985682] RBP: 0000000000000000 R08: 0000000000000001 R09: 000000000165c9a0
[ 7265.987021] R10: 0000000000404eda R11: 0000000000000246 R12: 0000000000000001
[ 7265.988309] R13: 000000000047f640 R14: 0000000000000000 R15: 0000000000000000

In sfb_change() function use qdisc_purge_queue() instead of
qdisc_tree_flush_backlog() to properly reset old child Qdisc and save
pointer to it into local temporary variable. Put reference to Qdisc after
sch tree lock is released in order not to call potentially sleeping cls API
in atomic section. This is safe to do because Qdisc has already been reset
by qdisc_purge_queue() inside sch tree lock critical section.

Reported-by: syzbot+ac54455281db908c581e@syzkaller.appspotmail.com
Fixes: c266f64 ("net: sched: protect block state with mutex")
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
@plbossart
Copy link
Member

closing, this has to be redone separately

@plbossart plbossart closed this Nov 18, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Unclear No agreement on problem statement and resolution

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants