qcom-q6v5-mss crashing with "Invalid packet size 0" after several days of constant 4G use #25

@mlainez

Hi,

Citronics has been using Fairphone 2 motherboards on custom carrier boards for some time now. On several deployed devices, after several days or weeks of modem use, the modem crashes and the automatic recovery does not bring it back to a working state. The only way we have found to fully recover from the crash is to reboot the device.

Kernel version: 6.11.4
Linux flavour: postmarketOS

Here's a log of the observed crash:

[May 07 23:40:24] daemon [1603]: <wrn> Cannot read from istream: connection broken 
[May 07 23:40:24] kern kernel: qcom-q6v5-mss fc880000.remoteproc: fatal error received: smd_dsm_memcpy.c:297:Invalid packet size 0
[May 07 23:40:24] daemon [1603]: <msg> [modem0] port 'wwan0qmi0' no longer controllable, reprobing 
[May 07 23:40:24] kern kernel: remoteproc remoteproc1: crash detected in fc880000.remoteproc: type fatal error
[May 07 23:40:24] daemon [1603]: <wrn> [modem0/bearer1] reloading stats failed: QMI operation failed: endpoint hangup 
[May 07 23:40:24] kern kernel: remoteproc remoteproc1: handling crash #1 in fc880000.remoteproc
[May 07 23:40:24] daemon NetworkManager[2021]: <info>  [1746654024.6169] device (wwan0qmi0): state change: activated -> unmanaged (reason 'unmanaged-link-not-init', managed-type: 'removed') 
[May 07 23:40:24] kern kernel: remoteproc remoteproc1: recovering fc880000.remoteproc
[May 07 23:40:24] daemon [1603]: <msg> [base-manager] port wwan0at0 released by device 'qcom-soc' 
[May 07 23:40:24] kern kernel: wwan wwan0: port wwan0at0 disconnected
[May 07 23:40:24] daemon [1603]: <msg> [base-manager] port wwan0at1 released by device 'qcom-soc' 
[May 07 23:40:24] kern kernel: wwan wwan0: port wwan0at1 disconnected
[May 07 23:40:24] daemon [1603]: <msg> [base-manager] port wwan0qmi0 released by device 'qcom-soc' 
[May 07 23:40:24] kern kernel: wwan wwan0: port wwan0qmi0 disconnected
[May 07 23:40:24] kern kernel: remoteproc remoteproc1: stopped remote processor fc880000.remoteproc
[May 07 23:40:24] kern kernel: qcom-q6v5-mss fc880000.remoteproc: MBA booted without debug policy, loading mpss
[May 07 23:40:26] kern kernel: remoteproc remoteproc1: remote processor fc880000.remoteproc is now up
[May 07 23:40:26] daemon dbus-daemon[1414]: [system] Activating service name='org.freedesktop.nm_dispatcher' requested by ':1.6' (uid=0 pid=2021 comm="/usr/sbin/NetworkManager -n") (using servicehelper) 
[May 07 23:40:26] daemon NetworkManager[2021]: <warn>  [1746654026.5169] modem-broadband[wwan0qmi0]: failed to disconnect modem: GDBus.Error:org.freedesktop.DBus.Error.UnknownMethod: Object does not exist at path “/org/freedesktop/ModemManager1/Modem/0” 
[May 07 23:40:26] daemon [1603]: <msg> [device qcom-soc] creating modem with plugin 'qcom-soc' and '8' ports 
[May 07 23:40:26] daemon [1603]: <wrn> [device qcom-soc] could not recreate modem: Unsupported device: at least a QMI port is required 
[May 07 23:40:26] daemon dbus-daemon[1414]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher' 
[May 07 23:40:26] daemon nm-openvpn[2449]: SIGTERM[hard,] received, process exiting 
[May 07 23:40:26] daemon NetworkManager[2021]: <info>  [1746654026.5874] device (tun0): state change: activated -> unmanaged (reason 'unmanaged', managed-type: 'removed') 
[May 07 23:40:26] daemon NetworkManager[2021]: <info>  [1746654026.6158] manager: NetworkManager state is now CONNECTED_LOCAL 
[May 07 23:40:26] kern kernel: wwan wwan0: port wwan0at0 attached
[May 07 23:40:26] kern kernel: wwan wwan0: port wwan0at1 attached
[May 07 23:40:26] kern kernel: bam-dmux fc880000.remoteproc:bam-dmux: Channel already open: 0
[May 07 23:40:26] kern kernel: bam-dmux fc880000.remoteproc:bam-dmux: Channel already open: 1
[May 07 23:40:26] kern kernel: wwan wwan0: port wwan0qmi0 attached
[May 07 23:40:26] kern kernel: bam-dmux fc880000.remoteproc:bam-dmux: Channel already open: 2
[May 07 23:40:26] kern kernel: bam-dmux fc880000.remoteproc:bam-dmux: Channel already open: 3
[May 07 23:40:26] kern kernel: bam-dmux fc880000.remoteproc:bam-dmux: Channel already open: 4
[May 07 23:40:26] kern kernel: bam-dmux fc880000.remoteproc:bam-dmux: Channel already open: 5
[May 07 23:40:26] kern kernel: bam-dmux fc880000.remoteproc:bam-dmux: Channel already open: 6
[May 07 23:40:26] kern kernel: bam-dmux fc880000.remoteproc:bam-dmux: Channel already open: 7

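From what I understand, the modem firmware receives a packet of size 0 and crashes. The remote processor is then restarted, but ModemManager cannot recreate the modem ("at least a QMI port is required") and bam-dmux reports channels 0 to 7 as already open. We are considering a "reset" script that is triggered whenever the crash happens, but if there is an obvious fix that we could submit as a PR, we are more than happy to do that instead.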
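
To make the "reset" script idea concrete, here is a minimal sketch of the detection part. It only assumes the message wording seen in the log above; the names `CrashWatcher` and `watch` are hypothetical, and what to do on failure (for now, a reboot) is left to a callback.

```python
import re
from typing import Callable, Iterable

# Patterns copied from the log in this issue; other kernel or ModemManager
# versions may word these messages differently (assumption).
CRASH_RE = re.compile(r"remoteproc\d+: crash detected in \S+")
FAILED_RE = re.compile(r"could not recreate modem|bam-dmux .*: Channel already open")


class CrashWatcher:
    """Tracks whether a remoteproc crash was followed by a failed recovery."""

    def __init__(self) -> None:
        self.crashed = False

    def feed(self, line: str) -> bool:
        """Consume one log line; return True once per failed recovery."""
        if CRASH_RE.search(line):
            self.crashed = True
            return False
        if self.crashed and FAILED_RE.search(line):
            self.crashed = False  # report each crash only once
            return True
        return False


def watch(lines: Iterable[str], on_failure: Callable[[], None]) -> None:
    """Call on_failure() for every crash whose recovery looks broken."""
    watcher = CrashWatcher()
    for line in lines:
        if watcher.feed(line):
            on_failure()
```

Fed from whatever follows the system log on the device, something like `watch(sys.stdin, lambda: subprocess.run(["reboot"], check=False))` would reboot once the failure signature appears. Both the log source and the reboot command are assumptions about our deployment, and this works around the crash rather than fixing it.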

Any pointers on the best way to solve this are welcome.

Thanks
