@ujfalusi
Collaborator

Hi,

The UPX-i11 failures in the suspend/resume tests are caused by TPM interrupts now being enabled on these machines: the interrupts appear to work at first, but in practice they do not.
The IRQ line is likely floating, but this has not been confirmed.

In any case, disabling IRQ mode and using polling (as was the case before) fixes the issues.

I am also picking a Lenovo patch to avoid future merge issues when we get the patches back from mainline.

snits and others added 3 commits May 25, 2023 11:51
The P360 Tiny suffers from an IRQ storm issue like the T490s, so add
an entry for it to tpm_tis_dmi_table and force polling. The previous
attempt to enable interrupts also produced a report involving a
ThinkPad L490, so add an entry for it as well.

Cc: stable@vger.kernel.org
Reported-by: Peter Zijlstra <peterz@infradead.org> # P360 Tiny
Closes: https://lore.kernel.org/linux-integrity/20230505130731.GO83892@hirez.programming.kicks-ass.net/
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
(cherry picked from commit e7d3e5c)
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>

Interrupts were recently enabled for tpm_tis.

The interrupts initially work on the device, but they stop arriving
after roughly 200 interrupts. On system reboot/shutdown this causes a
long wait (120000 jiffies).

[jarkko@kernel.org: fix a merge conflict and adjust the commit message]
Fixes: e644b2f ("tpm, tpm_tis: Enable interrupt test")
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
(cherry picked from commit 95a9359)
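
For scale, the 120000-jiffy wait mentioned above can be converted to seconds. The tick rate is not stated in the commit message, so the HZ values used here (1000 and 250, both common CONFIG_HZ settings) are assumptions, and `wait_seconds` is a hypothetical userspace helper, not a kernel function.

```c
/* Convert a jiffies count to seconds for a given tick rate.
 * 'hz' stands in for the kernel's CONFIG_HZ; its real value on the
 * affected machines is an assumption here. */
static double wait_seconds(unsigned long jiffies, unsigned int hz)
{
    return (double)jiffies / (double)hz;
}
```

Under these assumptions, `wait_seconds(120000, 1000)` gives 120 seconds and `wait_seconds(120000, 250)` gives 480 seconds, so the wait is in the range of minutes either way.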

The original patch that added the quirk applies to all AAEON machines,
which may or may not be valid.

The issue was discovered on the UPX-i11 (Tiger Lake); it is not known
whether the i12 (Alder Lake) version is affected.
The UP2 (Apollo Lake) does not even have a TPM module (no TPM drivers
probe, and dmidecode confirms this).

Make the quirk applicable to the UPX-i11 (UPX-TGL01) only.

Fixes: 95a9359 ("tpm: tpm_tis: Disable interrupts for AEON UPX-i11")
Suggested-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
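
The difference between the vendor-wide quirk and the narrowed one can be sketched in plain userspace C. This is only an illustration of the matching idea, not the kernel's DMI code: `board_id`, `polling_quirk`, `force_polling`, and the `product` field are hypothetical names, the substring comparison is an assumption, and which DMI field actually carries "UPX-TGL01" is not stated in the commit message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the identity strings a board reports. */
struct board_id {
    const char *sys_vendor;
    const char *product;
};

/* One quirk entry; a NULL field means "do not match on this field". */
struct polling_quirk {
    const char *sys_vendor;
    const char *product;
};

/* Broad form: every AAEON board is forced to polling. */
static const struct polling_quirk broad_quirks[] = {
    { "AAEON", NULL },
};

/* Narrow form: only the UPX-TGL01 (UPX-i11) is forced to polling. */
static const struct polling_quirk narrow_quirks[] = {
    { "AAEON", "UPX-TGL01" },
};

/* A wanted string matches when it is unset or is a substring of the
 * reported string. */
static bool field_matches(const char *want, const char *have)
{
    if (want == NULL)
        return true;
    return have != NULL && strstr(have, want) != NULL;
}

/* Return true when any entry in the table matches the board, meaning
 * interrupts should be skipped and polling used instead. */
static bool force_polling(const struct polling_quirk *table, size_t count,
                          const struct board_id *board)
{
    for (size_t i = 0; i < count; i++) {
        if (field_matches(table[i].sys_vendor, board->sys_vendor) &&
            field_matches(table[i].product, board->product))
            return true;
    }
    return false;
}
```

With the broad table, any board reporting "AAEON" as vendor is forced to polling; with the narrow table, an AAEON board with a different product string keeps interrupts enabled.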
@marc-hb
Collaborator

marc-hb commented May 25, 2023

SOFCI TEST

@marc-hb
Collaborator

marc-hb commented May 25, 2023

https://sof-ci.01.org/linuxpr/PR4377/build5112/devicetest/index.html and https://sof-ci.01.org/linuxpr/PR4377/build5113/devicetest/index.html were affected by the "ERROR: Module regmap_sdw is in use by: snd_soc_rt712_sdca_dmic snd_soc_rt712_sdca" issue (thesofproject/sof-test#1039). Re-running.

@marc-hb
Collaborator

marc-hb commented May 25, 2023

Looks better now except https://sof-ci.01.org/linuxpr/PR4377/build5159/devicetest/index.html?model=GLK_BOB_DA7219&testcase=check-kmod-load-unload has a NULL dereference

[ 1474.804752] kernel: snd_sof:ipc3_log_header: sof-audio-pci-intel-apl 0000:00:0e.0: ipc tx: 0x30130000: GLB_TPLG_MSG: PIPE_COMPLETE
[ 1474.804921] kernel: snd_sof:ipc3_log_header: sof-audio-pci-intel-apl 0000:00:0e.0: ipc tx: 0x40020000: GLB_PM_MSG: CTX_RESTORE
[ 1474.940366] kernel: general protection fault, probably for non-canonical address 0xdead0000000000f0: 0000 [#1] PREEMPT SMP PTI
[ 1474.940392] kernel: CPU: 2 PID: 4445 Comm: irq/96-da7219-a Not tainted 6.4.0-rc1-pr4377-5153-default-g40a7258206b4 #40a72582
[ 1474.940407] kernel: Hardware name: Google Bobba/Bobba, BIOS Google_Bobba.11825.0.2019_03_06_2015 03/06/2019
[ 1474.940415] kernel: RIP: 0010:dapm_find_widget+0x74/0xe0 [snd_soc_core]
[ 1474.940532] kernel: Code: 00 48 89 e7 e8 dd 34 26 e4 49 89 e7 48 8b 5d 18 45 31 ed 48 8b 83 c0 03 00 00 48 81 c3 c0 03 00 00 4c 8d 60 e8 48 39 c3 74 29 <49> 8b 7c 24 08 4c 89 fe e8 af ca 25 e4 85 c0 75 0a 49 39 6c 24 28
[ 1474.940543] kernel: RSP: 0018:ffffb3c2c5097d78 EFLAGS: 00010202
[ 1474.940555] kernel: RAX: dead000000000100 RBX: ffffffffc1135520 RCX: 0000000000000000
[ 1474.940564] kernel: RDX: 0000000000000069 RSI: ffffffffc083cbcd RDI: ffffa3e046530f90
[ 1474.940572] kernel: RBP: ffffa3e0020c6168 R08: 0000000000000001 R09: 0000000000000000
[ 1474.940580] kernel: R10: ffffb3c2c5097e30 R11: 0000000000000000 R12: dead0000000000e8
[ 1474.940587] kernel: R13: 0000000000000000 R14: 0000000000000001 R15: ffffffffc083cbcd
[ 1474.940595] kernel: FS:  0000000000000000(0000) GS:ffffa3e177a00000(0000) knlGS:0000000000000000
[ 1474.940604] kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1474.940612] kernel: CR2: 000055774f390e68 CR3: 000000011218a000 CR4: 0000000000350ee0
[ 1474.940621] kernel: Call Trace:
[ 1474.940633] kernel:  <TASK>
[ 1474.940642] kernel:  ? __mutex_unlock_slowpath+0x45/0x280
[ 1474.940665] kernel:  ? snd_soc_dapm_disable_pin+0x28/0x60 [snd_soc_core]
[ 1474.940776] kernel:  __snd_soc_dapm_set_pin+0x1d/0xe0 [snd_soc_core]
[ 1474.940881] kernel:  snd_soc_dapm_disable_pin+0x35/0x60 [snd_soc_core]
[ 1474.940984] kernel:  da7219_aad_irq_thread+0x2de/0x320 [snd_soc_da7219]
[ 1474.941017] kernel:  ? __pfx_irq_thread_fn+0x10/0x10
[ 1474.941031] kernel:  irq_thread_fn+0x21/0x60
[ 1474.941043] kernel:  ? irq_thread+0xb9/0x200
[ 1474.941054] kernel:  irq_thread+0x107/0x200
[ 1474.941064] kernel:  ? __kthread_parkme+0x1e/0xa0
[ 1474.941076] kernel:  ? __pfx_irq_thread+0x10/0x10
[ 1474.941087] kernel:  ? __pfx_irq_thread_dtor+0x10/0x10
[ 1474.941099] kernel:  ? __pfx_irq_thread+0x10/0x10
[ 1474.941109] kernel:  kthread+0xf1/0x130
[ 1474.941122] kernel:  ? __pfx_kthread+0x10/0x10
[ 1474.941135] kernel:  ret_from_fork+0x29/0x50
[ 1474.941158] kernel:  </TASK>
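
A hedged reading of the register dump above, not a root cause: the faulting address can be reproduced from RAX with two offsets. The offsets come from decoding the "Code:" bytes by hand (`4c 8d 60 e8` as `lea -0x18(%rax),%r12` and `49 8b 7c 24 08` as `mov 0x8(%r12),%rdi`), which is an assumption of this sketch, as is interpreting the `0xdead...0100` value in RAX as a poisoned list pointer. The function names are hypothetical.

```c
#include <stdint.h>

/* Values copied from the log above. */
#define RAX_VALUE   UINT64_C(0xdead000000000100)
#define FAULT_ADDR  UINT64_C(0xdead0000000000f0)

/* Offsets read from the "Code:" bytes (assumed decoding). */
#define ENTRY_OFFSET UINT64_C(0x18) /* lea -0x18(%rax),%r12 */
#define FIELD_OFFSET UINT64_C(0x08) /* mov 0x8(%r12),%rdi   */

/* Step back from the list pointer to the start of the containing entry. */
static uint64_t entry_from_list_pointer(uint64_t list_ptr)
{
    return list_ptr - ENTRY_OFFSET;
}

/* Address of the field that the faulting instruction tried to load. */
static uint64_t faulting_address(uint64_t list_ptr)
{
    return entry_from_list_pointer(list_ptr) + FIELD_OFFSET;
}
```

With RAX as input, the intermediate value equals the R12 shown in the dump (`dead0000000000e8`) and the result equals the reported non-canonical address (`0xdead0000000000f0`), which suggests the walk followed a list entry that had already been removed rather than a NULL pointer.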

@plbossart
Member

> Looks better now except https://sof-ci.01.org/linuxpr/PR4377/build5159/devicetest/index.html?model=GLK_BOB_DA7219&testcase=check-kmod-load-unload has a NULL dereference

Known issue @marc-hb, this is completely unrelated. It's been more than 2 years and we could never root-cause this one, see #2676

@ujfalusi
Collaborator Author

ujfalusi commented May 29, 2023

Picking this up to rule out the TPM as a possible source of suspend/resume failures.

@ujfalusi ujfalusi merged commit a714334 into thesofproject:topic/sof-dev May 29, 2023
@ujfalusi ujfalusi deleted the peter/sof/pr/upx-i11_tpm_fix_01 branch November 30, 2023 06:37