[BUG] SOF fails to load on MTL (with Linux 6.8 or older driver) #9243

@andyross

Description

SOF firmware built from HEAD has not loaded for us on MTL since the last Zephyr west.yml update. The load hangs, and early in the load the kernel complains that rom_status_reg never shows an update:

[   46.688586] sof-audio-pci-intel-mtl 0000:00:1f.3: hda_cl_copy_fw: timeout with rom_status_reg (0x180000) read
[   46.688606] sof-audio-pci-intel-mtl 0000:00:1f.3: ------------[ DSP dump start ]------------
[   46.688611] sof-audio-pci-intel-mtl 0000:00:1f.3: Firmware download failed
[   46.688615] sof-audio-pci-intel-mtl 0000:00:1f.3: fw_state: SOF_FW_BOOT_READY_OK (6)
[   46.688628] sof-audio-pci-intel-mtl 0000:00:1f.3: ROM status: 0x0, ROM error: 0x0
[   46.688632] sof-audio-pci-intel-mtl 0000:00:1f.3: ROM debug status: 0x50000005, ROM debug error: 0x0
[   46.688642] sof-audio-pci-intel-mtl 0000:00:1f.3: ROM feature bit not enabled
[   46.688646] sof-audio-pci-intel-mtl 0000:00:1f.3: ------------[ DSP dump end ]------------
[   46.688665] sof-audio-pci-intel-mtl 0000:00:1f.3: Failed to start DSP
[   46.688669] sof-audio-pci-intel-mtl 0000:00:1f.3: error: failed to boot DSP firmware -110
[   46.690414] sof-audio-pci-intel-mtl 0000:00:1f.3: error: sof_probe_work failed err: -110

The culprit is this Zephyr commit from a few weeks back, which deliberately disables the FW_STATUS output on the grounds that it's not needed on ACE 1.x:

zephyrproject-rtos/zephyr@fa798ce

commit fa798ce2d5be6deb8a3b0cde307a0a1dc5e88dde
Author: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Date:   Thu May 23 12:09:05 2024 +0300

    soc: intel_adsp: only implement FW_STATUS boot protocol for cavs

    The software protocol to write status value of 0x05 (FW_ENTERED)
    into memory window 0 at Zephyr boot, is not needed in the ace1.x
    boot flow and does not match the semantics host systems are expecting
    at this location in the memory window (e.g. write of 0x05 is not
    expected).

    Make this logic specific to intel_adsp_cavs platforms and move the code
    out from common intel_adsp code.

    This commit depends on update to cavstool.py to use correct
    ROM status register to observe boot state.

    Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>

Nonetheless, the kernel still seems to care. The timeout in question is in hda-loader.c: https://github.com/thesofproject/linux/blob/topic/sof-dev/sound/soc/sof/intel/hda-loader.c#L293 and that code is quite clearly still polling this address.

Reverting just that one patch from the Zephyr module fixes load for us.
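
To make the failure mode concrete, here is a minimal model of the handshake as I understand it. It is a sketch, not the kernel or Zephyr code: `struct fake_dsp`, `fake_read_status()`, `poll_fw_entered()` and `STATUS_CODE_MASK` are hypothetical names invented for illustration, and the mask width is an assumption. The only value taken from the commit message is the 0x05 (FW_ENTERED) status. The host polls a status location a bounded number of times; if the firmware never writes the expected value there, the poll runs out and returns -ETIMEDOUT, which on Linux is the -110 seen in the log above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Status value named in the Zephyr commit message. */
#define FW_ENTERED        0x05u
/* Assumption for this model only: the state code lives in the low bits. */
#define STATUS_CODE_MASK  0x00ffffffu

/* Hypothetical stand-in for the DSP as seen by the host. */
struct fake_dsp {
    uint32_t status_reg;    /* location the host polls */
    bool fw_writes_status;  /* false models firmware with the status write removed */
    int boot_after_polls;   /* number of polls before the firmware has "entered" */
    int polls_seen;         /* how many times the host has read the register */
};

/* Simulated register read: the firmware reports FW_ENTERED only if it
 * still implements the status write. */
static uint32_t fake_read_status(struct fake_dsp *dsp)
{
    dsp->polls_seen++;
    if (dsp->fw_writes_status && dsp->polls_seen >= dsp->boot_after_polls)
        dsp->status_reg = FW_ENTERED;
    return dsp->status_reg;
}

/* Host side: poll up to max_polls times, then give up with -ETIMEDOUT. */
int poll_fw_entered(struct fake_dsp *dsp, int max_polls)
{
    assert(max_polls > 0);
    for (int i = 0; i < max_polls; i++) {
        if ((fake_read_status(dsp) & STATUS_CODE_MASK) == FW_ENTERED)
            return 0;
    }
    return -ETIMEDOUT;
}
```

With `fw_writes_status` set, the poll returns 0 once the simulated firmware has entered. With it cleared, which models firmware built with the commit above against a host that still polls this location, every read returns 0 and the poll times out.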

Not sure why this isn't being hit upstream. Is there another MTL loader path that should be hit instead, maybe due to something skewed in our kernel? That link is to the topic/sof-dev branch, and the code there matches the ChromeOS kernels and upstream, so it's not as simple as a missing update to this code.

Labels

P1 (Blocker bugs or important features), bug (Something isn't working as expected)
