@RanderWang
Collaborator

@RanderWang RanderWang commented Sep 8, 2023

Please check #8143 and #8149.

All CI multi-core tests pass on main branch!

@mengdonglin mengdonglin requested a review from fredoh9 September 8, 2023 04:25
@mengdonglin mengdonglin changed the title Port multicore fix to 006 branch [mtl-006] Port multicore fix to mtl-006 branch Sep 8, 2023
@mengdonglin mengdonglin changed the title [mtl-006] Port multicore fix to mtl-006 branch [mtl-006] Port multicore fix to mtl-006-drop-stable branch Sep 8, 2023
@mengdonglin mengdonglin requested a review from tmleman September 8, 2023 04:34
@mengdonglin
Collaborator

@serhiy-katsyuba-intel @tmleman @abonislawski This PR includes #8149 from the main branch to fix issue #8155. Can you please review?

@fredoh9
Contributor

fredoh9 commented Sep 8, 2023

The build failed. Can you fix the build errors? Then I can quickly verify whether this really fixes the multicore problem.

@fredoh9
Contributor

fredoh9 commented Sep 8, 2023

I fixed the build error and am checking with mtl-006-stable-drop:

diff --git a/src/ipc/ipc4/handler.c b/src/ipc/ipc4/handler.c
index 845b71fa6b9c..443e593f0d76 100644
--- a/src/ipc/ipc4/handler.c
+++ b/src/ipc/ipc4/handler.c
@@ -67,7 +67,7 @@ static struct ipc4_msg_data msg_data;
 /* fw sends a fw ipc message to send the status of the last host ipc message */
 static struct ipc_msg msg_reply = {0, 0, 0, 0, LIST_INIT(msg_reply.list)};

-static struct ipc_msg msg_notify = {0, 0, 0, 0, LIST_INIT(msg_notify.list)}};
+static struct ipc_msg msg_notify = {0, 0, 0, 0, LIST_INIT(msg_notify.list)};

 /*
  * Global IPC Operations.

@fredoh9
Contributor

fredoh9 commented Sep 8, 2023

Unfortunately, a different IPC timeout came up on GLB_CREATE_PIPELINE:

[  140.239572] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: COPIER (UUID: 9BA00C83-CA12-4A83-943C-1FA2E82F9DDA): No CPC match in the firmware file's manifest (ibs/obs: 384/192)
[  140.239576] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: Please try to update the firmware.
[  140.239578] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: If the issue persists, file a bug at
[  140.239579] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: https://github.com/thesofproject/sof/issues/
[  140.239581] kernel: snd_sof:sof_ipc4_update_resource_usage: sof-audio-pci-intel-mtl 0000:00:1f.3: host-copier.0.capture: ibs / obs / cpc: 384 / 192 / 0
[  140.239585] kernel: snd_sof:sof_ipc4_widget_setup: sof-audio-pci-intel-mtl 0000:00:1f.3: pipeline: 8 memory pages: 2
[  140.239589] kernel: snd_sof:sof_ipc4_widget_setup: sof-audio-pci-intel-mtl 0000:00:1f.3: Create widget pipeline.8 instance 0 - pipe 8 - core 0
[  140.239594] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx      : 0x11000002|0x0: GLB_CREATE_PIPELINE
[  140.739890] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc timed out for 0x11000002|0x0
[  140.739906] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: IPC timeout
[  140.739925] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: failed to create module pipeline.8
[  140.739931] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: Failed to set up connected widgets
[  140.739941] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: error: failed widget list set up for pcm 0 dir 1
[  140.739946] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ASoC: error at snd_soc_pcm_component_hw_params on 0000:00:1f.3: -110
[  140.739973] kernel:  Port0: ASoC: error at __soc_pcm_hw_params on Port0: -110

dmesg.txt

@RanderWang
Collaborator Author

Let me check 006, I only tested main. Thanks!

@fredoh9
Contributor

fredoh9 commented Sep 8, 2023

With the 2nd push there is no build error, but we are back to the MOD_SET_DX error:

[   60.033925] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx      : 0x47000000|0x0: MOD_SET_DX [data size: 8]
[   60.535638] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc timed out for 0x47000000|0x0
[   60.535653] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: Attempting to prevent DSP from entering D3 state to preserve context
[   60.535657] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ------------[ IPC dump start ]------------
[   60.535671] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: Host IPC initiator: 0x47000000|0x0|0x0, target: 0x67000000|0x0|0x0, ctl: 0x3
[   60.535675] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ------------[ IPC dump end ]------------
[   60.535678] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ------------[ DSP dump start ]------------
[   60.535680] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: IPC timeout
[   60.535683] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: fw_state: SOF_FW_BOOT_COMPLETE (7)
[   60.535692] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ROM status: 0x5, ROM error: 0x0
[   60.535695] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ROM debug status: 0x50000005, ROM debug error: 0x0
[   60.535701] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ROM feature bit enabled
[   60.535704] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ------------[ DSP dump end ]------------
[   60.535722] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: error: failed to enable target core for widget pipeline.3
[   60.535726] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: Failed to set up connected widgets
[   60.535736] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: error: failed widget list set up for pcm 1 dir 0
[   60.535741] kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ASoC: error at snd_soc_pcm_component_hw_params on 0000:00:1f.3: -110
[   60.535763] kernel:  Port1: ASoC: error at __soc_pcm_hw_params on Port1: -110
[   60.535773] kernel:  Port1: ASoC: error at dpcm_fe_dai_hw_params on Port1: -110
[   60.535780] kernel: snd_sof:sof_pcm_hw_free: sof-audio-pci-intel-mtl 0000:00:1f.3: pcm: free stream 1 dir 0
[   60.536233] kernel: snd_sof:sof_pcm_close: sof-audio-pci-intel-mtl 0000:00:1f.3: pcm: close stream 1 dir 0
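
For anyone reading the `ipc tx` words in the two logs, here is a minimal sketch of splitting an IPC4 primary header into its top-level fields. The bit positions (bit 30 = message target, bit 29 = request/reply, bits 28:24 = message type) are an assumption based on the IPC4 header layout used by the Linux SOF driver. The struct and function names are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the three top-level header fields. */
struct ipc4_hdr_fields {
	bool module_target;	/* bit 30: 0 = global message, 1 = module message */
	bool is_reply;		/* bit 29: 0 = request, 1 = reply */
	uint8_t type;		/* bits 28:24: message type within the target group */
};

/* Assumed layout, see the note above; not taken from this PR. */
static struct ipc4_hdr_fields ipc4_decode_primary(uint32_t primary)
{
	struct ipc4_hdr_fields f;

	f.module_target = (primary >> 30) & 0x1;
	f.is_reply = (primary >> 29) & 0x1;
	f.type = (primary >> 24) & 0x1f;
	return f;
}
```

Under that assumption, 0x11000002 decodes to a global request of type 0x11 (logged as GLB_CREATE_PIPELINE), 0x47000000 to a module request of type 0x07 (logged as MOD_SET_DX), and the 0x67000000 in the IPC dump to the matching reply word with the reply bit set.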

@RanderWang
Collaborator Author

I added more debug output and found that the fw hangs after the set_dx that disables the secondary core. create_pipeline always fails after that set_dx.

@lgirdwood
Member

@mwasko @marcinszkudlinski fyi

Contributor

@tmleman tmleman left a comment

LGTM

Member

@lgirdwood lgirdwood left a comment

@mwasko @marcinszkudlinski fyi: this was merged into main last week and the results are good.

@mwasko
Contributor

mwasko commented Sep 13, 2023

@RanderWang there are quite a lot of CI sof-ci/jenkins/pr-device-test/main-ace failures on MTL, can you clarify this?

@RanderWang
Collaborator Author

The FW build failed with the error below, but it has nothing to do with my PR.

/zep_workspace/sof/src/schedule/zephyr_domain.c:26:10: fatal error: zephyr/timeout_q.h: No such file or directory
   26 | #include <zephyr/timeout_q.h>
      |          ^~~~~~~~~~~~~~~~~~~~
compilation terminated.

@RanderWang
Collaborator Author

SOFCI TEST

In the multicore case, an IPC message is dispatched from the primary
core to the secondary core, which sends the reply message to the host.
The primary core does nothing if IPC_TASK_SECONDARY_CORE is set. But in
rare cases the secondary core finishes the reply message and clears this
flag before the ipc thread on the primary core checks it, so the primary
core sends the reply message again. As a result the reply message is
inserted twice into the ipc message list, which causes an infinite loop
when the list is traversed.

We don't need to init the reply message since it is initialized after
being deleted from the ipc list.

Signed-off-by: Rander Wang <rander.wang@intel.com>

Use list_is_empty to check whether the message is already queued. The
notify message is initialized to empty after being deleted from the ipc
msg list. We use the same idea in ipc_msg_send.

Signed-off-by: Rander Wang <rander.wang@intel.com>
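
The two commit messages above describe a double insertion into the ipc message list and a list_is_empty check that prevents it. The sketch below illustrates both on a minimal circular doubly linked list. It is a stand-in written for this illustration, not the SOF list.h code, and `msg_queue_once` and `list_count_bounded` are hypothetical helper names.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list; a stand-in, not the SOF list code. */
struct list_item {
	struct list_item *next;
	struct list_item *prev;
};

static void list_init(struct list_item *item)
{
	item->next = item;
	item->prev = item;
}

/* An initialised, unlinked item points at itself, so it reads as "empty". */
static bool list_is_empty(const struct list_item *item)
{
	return item->next == item;
}

/* Append item at the tail of the list headed by head. */
static void list_item_append(struct list_item *item, struct list_item *head)
{
	struct list_item *tail = head->prev;

	item->next = head;
	item->prev = tail;
	tail->next = item;	/* if tail == item, this makes item->next == item */
	head->prev = item;
}

/* Unlink item and re-initialise it so it reads as "not queued" again. */
static void list_item_del(struct list_item *item)
{
	item->prev->next = item->next;
	item->next->prev = item->prev;
	list_init(item);
}

/* Hypothetical guarded send: queue the message only if it is not queued yet. */
static bool msg_queue_once(struct list_item *msg, struct list_item *queue)
{
	if (!list_is_empty(msg))
		return false;	/* already queued, e.g. by the other core */

	list_item_append(msg, queue);
	return true;
}

/* Count entries, stopping after limit steps so a corrupted list cannot hang. */
static int list_count_bounded(const struct list_item *head, int limit)
{
	const struct list_item *it = head->next;
	int n = 0;

	while (it != head && n < limit) {
		it = it->next;
		n++;
	}
	return n;
}
```

Appending the same item twice with `list_item_append` leaves `item->next` pointing at the item itself, so a walk from the head never gets back to the head. That is the infinite loop described above. With `msg_queue_once` the second attempt is rejected, and the message can be queued again only after `list_item_del` has reset it. In a real multicore setting the check and the append would also have to run under whatever lock protects the list. This sketch does not model that.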
@RanderWang
Collaborator Author

The error only happened on the mtl-sdw device and it is a regression on the 006 branch. @keqiaozhang is working on it.

@abonislawski abonislawski merged commit 3b78ccd into thesofproject:mtl-006-drop-stable Sep 19, 2023
9 participants