linux-cachyos 7.0.1: NULL deref in select_task_rq_fair() from pollwake/__wake_up_sync_key → kernel zombie task → full system freeze

# Kernel NULL deref in `select_task_rq_fair` from `pollwake` path → unrecoverable system freeze

**Full kernel trace (gist):** https://gist.github.com/Evil-Overlord-666/cb5f046c88894ea16d0473f0b38d1834

---

## Summary

On 2026-04-30 at 19:51:31 a `BUG: kernel NULL pointer dereference, address: 0x0000000000000044`
fired on CPU 4 inside `select_task_rq_fair()`, called via the wake-up path
`unix_stream_sendmsg → sock_def_readable → __wake_up_sync_key → pollwake → try_to_wake_up → select_task_rq`.
Process context was Xwayland (PID 2504, UID 1000) running a `writev(2)` syscall on a unix-domain socket.

The Oops printed `note: Xwayland[2504] exited with irqs disabled` and
`exited with preempt_count 3`. Because the task died with IRQs disabled and
preempt_count > 0, the kernel could not finish reaping it. From 19:52:31 onward
the same task remained pinned on CPU 13 spinning in
`native_queued_spin_lock_slowpath` inside `do_exit → __fput → sock_close →
unix_release_sock → sock_def_wakeup → __wake_up`. RCU stall warnings
("rcu_preempt detected stalls on CPUs/tasks ... P2504") repeated every ~3 minutes
until I hard-rebooted at 23:24 — roughly **3.5 hours of escalating freeze**.

There appear to be **two distinct bugs** here:

1. **Primary:** the NULL deref in `select_task_rq_fair` reachable from the
   `pollwake` path on a normal `writev` to a unix socket. CR2 = 0x44 suggests a
   small offset into a NULL `task_struct` / `rq` / `cfs_rq` pointer.
2. **Secondary (recovery):** the Oops handler does not safely tear down a task that
   crashed with `irqs_disabled() && preempt_count > 0`. The spinlock the task
   held is never released, so the entire system grinds to a halt instead of
   just losing Xwayland.

## Reproducibility

Observed once, in the wild; not deliberately reproducible. The system had been
up for ~22 hours with a Plasma 6.6.4 Wayland session logged in; I was
away. No special workload was running at the moment of the crash.

## System

| Field | Value |
|---|---|
| Distro | CachyOS (rolling) |
| Kernel | `7.0.1-1-cachyos #1 SMP PREEMPT Thu, 23 Apr 2026 21:04:50 +0000 x86_64` |
| Build hash | `064ce857db72d62f7ca6e6781b81b6ace6735267` |
| Compiler | clang 22.1.3 (kernel built with LLVM/clang) |
| Cmdline | `quiet nowatchdog splash rw rootflags=subvol=/@ root=UUID=...` |
| Init | systemd v260 |
| DE | KDE Plasma 6.6.4, kwin_wayland, Xwayland 24.1.10 |
| CPU | Intel Core i9-13900KF (8P+16E, 32 threads) |
| Board | MSI MEG Z790 ACE (MS-7D86), BIOS 1.F0 (2025-08-07) |
| RAM | 64 GiB DDR5 |
| GPU | NVIDIA RTX 4090, proprietary driver 595.58.03 |
| Out-of-tree modules | `nvidia(O)`, `nvidia_drm(O)`, `nvidia_modeset(O)`, `nvidia_uvm(O)`, `razerkbd(OE)`, `razermouse(OE)` |
| Root FS | btrfs on /dev/sdd2 (CachyOS install) |

Kernel taint at time of Oops: `G OE` (out-of-tree + unsigned modules).
Note: the fault site itself is in in-tree scheduler code (`select_task_rq_fair`), not in any of the
out-of-tree modules. The NVIDIA and Razer modules are loaded but do not appear in the call stack.

## Primary Oops (verbatim, abbreviated — full trace in the linked gist above)

```
Apr 30 19:51:31 ArchEnemy kernel: BUG: kernel NULL pointer dereference, address: 0000000000000044
Apr 30 19:51:31 ArchEnemy kernel: #PF: supervisor read access in kernel mode
Apr 30 19:51:31 ArchEnemy kernel: #PF: error_code(0x0000) - not-present page
Apr 30 19:51:31 ArchEnemy kernel: PGD 1dd8dc067 P4D 1dd8dc067 PUD 0
Apr 30 19:51:31 ArchEnemy kernel: Oops: Oops: 0000 [#1] SMP NOPTI
Apr 30 19:51:31 ArchEnemy kernel: CPU: 4 UID: 1000 PID: 2504 Comm: Xwayland Tainted: G           OE       7.0.1-1-cachyos #1 PREEMPT  064ce857db72d62f7ca6e6781b81b6ace6735267
Apr 30 19:51:31 ArchEnemy kernel: Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Apr 30 19:51:31 ArchEnemy kernel: Hardware name: Micro-Star International Co., Ltd. MS-7D86/MEG Z790 ACE (MS-7D86), BIOS 1.F0 08/07/2025
Apr 30 19:51:31 ArchEnemy kernel: RIP: 0010:select_task_rq_fair.llvm.18029127130511906342+0x2cdd/0x2d30
Apr 30 19:51:31 ArchEnemy kernel: RSP: 0018:ffffcbb785787758 EFLAGS: 00010002
Apr 30 19:51:31 ArchEnemy kernel: RAX: 0000000000000001 RBX: ffff8afaca0efe00 RCX: 0000000000000001
Apr 30 19:51:31 ArchEnemy kernel: RDX: 0000000000000002 RSI: 0000000000000020 RDI: ffff8afad4155280
Apr 30 19:51:31 ArchEnemy kernel: RBP: 0000000000000018 R08: 0000000000000000 R09: ffff8afac0dd5280
Apr 30 19:51:31 ArchEnemy kernel: R10: 0000000000004000 R11: 0000000000000000 R12: 0000000000000004
Apr 30 19:51:31 ArchEnemy kernel: R13: 0000000000000004 R14: ffffcbb785787800 R15: 0000000000000000
Apr 30 19:51:31 ArchEnemy kernel: FS:  00007fd821c9ba00(0000) GS:ffff8b0a81f91000(0000) knlGS:0000000000000000
Apr 30 19:51:31 ArchEnemy kernel: CR2: 0000000000000044 CR3: 0000000120956002 CR4: 0000000000f72ef0
Apr 30 19:51:31 ArchEnemy kernel: Call Trace:
Apr 30 19:51:31 ArchEnemy kernel:  <TASK>
Apr 30 19:51:31 ArchEnemy kernel:  ? obj_cgroup_charge_account.llvm.8602441956471480031+0x131/0x150
Apr 30 19:51:31 ArchEnemy kernel:  ? __memcg_slab_post_alloc_hook+0x304/0x3a0
Apr 30 19:51:31 ArchEnemy kernel:  select_task_rq+0x81/0xe0
Apr 30 19:51:31 ArchEnemy kernel:  try_to_wake_up+0x258/0x6a0
Apr 30 19:51:31 ArchEnemy kernel:  pollwake+0xa1/0xd0
Apr 30 19:51:31 ArchEnemy kernel:  ? __pfx_default_wake_function+0x10/0x10
Apr 30 19:51:31 ArchEnemy kernel:  __wake_up_sync_key+0x65/0xa0
Apr 30 19:51:31 ArchEnemy kernel:  sock_def_readable+0x44/0xd0
Apr 30 19:51:31 ArchEnemy kernel:  unix_stream_sendmsg+0x1d1/0x7e0
Apr 30 19:51:31 ArchEnemy kernel:  __sock_sendmsg+0x6f/0x90
Apr 30 19:51:31 ArchEnemy kernel:  sock_write_iter+0xee/0x140
Apr 30 19:51:31 ArchEnemy kernel:  do_iter_readv_writev+0x18e/0x1f0
Apr 30 19:51:31 ArchEnemy kernel:  vfs_writev+0x1db/0x410
Apr 30 19:51:31 ArchEnemy kernel:  do_writev+0x76/0x110
Apr 30 19:51:31 ArchEnemy kernel:  do_syscall_64+0x111/0xa50
Apr 30 19:51:31 ArchEnemy kernel:  entry_SYSCALL_64_after_hwframe+0x76/0x7e
Apr 30 19:51:31 ArchEnemy kernel: ---[ end trace 0000000000000000 ]---
Apr 30 19:51:31 ArchEnemy kernel: note: Xwayland[2504] exited with irqs disabled
Apr 30 19:51:31 ArchEnemy kernel: note: Xwayland[2504] exited with preempt_count 3
```
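
The register dump above already carries a few facts that can be read off mechanically. The following is a minimal sketch, not an authoritative decoder: it parses the `RIP` symbol, decodes the low x86 page-fault error-code bits, and checks `EFLAGS.IF`. The helper names and the 4 KiB page size are my own assumptions; the constants are copied from the trace.

```python
import re

# Values copied from the Oops above.
RIP_SYMBOL = "select_task_rq_fair.llvm.18029127130511906342+0x2cdd/0x2d30"
ERROR_CODE = 0x0000
EFLAGS = 0x00010002
CR2 = 0x44

PAGE_SIZE = 4096  # assumption: 4 KiB base pages


def parse_rip(symbol: str) -> tuple[str, int, int]:
    """Split 'name+0xOFF/0xSIZE' into (name, offset, size)."""
    m = re.fullmatch(r"(.+)\+0x([0-9a-f]+)/0x([0-9a-f]+)", symbol)
    if m is None:
        raise ValueError(f"unrecognised RIP symbol: {symbol!r}")
    return m.group(1), int(m.group(2), 16), int(m.group(3), 16)


def decode_pf_error(code: int) -> dict[str, bool]:
    """Decode the low x86 page-fault error-code bits."""
    return {
        "present": bool(code & 0x1),       # False = page not present
        "write": bool(code & 0x2),         # False = read access
        "user": bool(code & 0x4),          # False = supervisor (kernel) mode
        "instr_fetch": bool(code & 0x10),  # False = data access
    }


def irqs_enabled(eflags: int) -> bool:
    """EFLAGS.IF is bit 9; clear means maskable interrupts are off."""
    return bool(eflags & (1 << 9))


name, offset, size = parse_rip(RIP_SYMBOL)
print(f"{name}: fault {size - offset} bytes before the end of the symbol")
print("page fault:", decode_pf_error(ERROR_CODE))
print("IRQs enabled at fault:", irqs_enabled(EFLAGS))
print("CR2 inside the NULL page:", CR2 < PAGE_SIZE)
```

With these inputs the decoded error code matches the two `#PF:` lines (supervisor read, not-present page), and `EFLAGS.IF` is clear, which is consistent with the later `exited with irqs disabled` note.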

## Secondary failure — RCU stall / queued-spinlock deadlock during reap

After the Oops, the task did not finish dying. The following stall report repeated every ~3 minutes:

```
rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu:         13-...0: (1 ticks this GP) idle=b2e4/1/0x4000000000000000 softirq=180214/180214 fqs=...
rcu:         Tasks blocked on level-1 rcu_node (CPUs 0-15): P2504
rcu:         (detected by 27, t=12300406 jiffies, g=6504897, q=119869 ncpus=32)
Sending NMI from CPU 27 to CPUs 13:
RIP: 0010:native_queued_spin_lock_slowpath+0x9d/0x2d0
Call Trace:
  _raw_spin_lock_irqsave+0x3e/0x50
  __wake_up+0x27/0xb0
  sock_def_wakeup+0x3f/0x50
  unix_release_sock+0x225/0x400
  unix_release+0x34/0x50
  sock_close+0x47/0xd0
  __fput+0xf9/0x280
  task_work_run+0x9d/0xc0
  do_exit+0x32a/0xaa0
  make_task_dead+0x80/0x150
  rewind_stack_and_make_dead+0x16/0x20
```

The same PID (2504) is still spinning on CPU 13 roughly 3.5 hours after the
Oops, on a queued spinlock that was acquired before the original Oops and never
released, because the task was killed with `irqs_disabled() && preempt_count == 3`.

Stall counter `t=12300406 jiffies` ≈ 12,300 seconds ≈ 3 h 25 min (assuming
`HZ=1000`), which is essentially the entire time I was away from the machine.
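
As a sanity check on the jiffies arithmetic, here is a small sketch that converts the stall counter to a duration and compares it with the window between the Oops and the hard reboot. It assumes `CONFIG_HZ=1000` (not stated in the logs, only implied by the numbers) and takes the reboot as 23:24:00, since only the minute is known.

```python
from datetime import datetime, timedelta

HZ = 1000  # assumption: CONFIG_HZ=1000; the report does not state the tick rate


def jiffies_to_timedelta(jiffies: int, hz: int = HZ) -> timedelta:
    """Convert a jiffies count to a wall-clock duration at the given tick rate."""
    return timedelta(seconds=jiffies / hz)


stall = jiffies_to_timedelta(12_300_406)       # t= value from the stall report
oops_time = datetime(2026, 4, 30, 19, 51, 31)  # timestamp of the Oops
reboot_time = datetime(2026, 4, 30, 23, 24)    # hard reboot; seconds unknown, taken as :00
window = reboot_time - oops_time

print("stall duration:   ", stall)
print("oops-to-reboot:   ", window)
print("stall fits window:", stall <= window)
```

Under that assumption the stall covers about 3 h 25 min of a window of about 3 h 32 min, so the last stall report was logged only a few minutes before the reboot.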

## What I think is going on (best-effort, not authoritative)

- `pollwake()` calls `try_to_wake_up()` for the task on the other end of the
  poll wait queue. `try_to_wake_up` calls `select_task_rq()` which dispatches
  to `select_task_rq_fair()` for `SCHED_NORMAL` tasks.
- The fault is at `select_task_rq_fair+0x2cdd/0x2d30` (very near the function
  end), reading address 0x44. CR2 = 0x44 is consistent with dereferencing a
  small structure offset off a NULL base — most likely a `task_struct` /
  `sched_entity` / `cfs_rq` pointer that became NULL (use-after-free on the
  poll wait entry's task pointer? task exited concurrently?).
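- The Oops occurs while holding the wait queue's spinlock (taken in
  `__wake_up_sync_key`); `try_to_wake_up` also holds the target task's
  `pi_lock` at that point, which fits a `preempt_count` above 1. When
  `Xwayland[2504] exited with irqs disabled`, the wait-queue spinlock is
  leaked. Any later attempt by *any* CPU to take that lock, including the dying
  task's own exit path, would spin forever in the queued-spinlock slowpath,
  which is exactly what we see later.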
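
The lock-leak hypothesis in the last bullet can be illustrated with a userspace analogy. This is only a sketch: `threading.Lock` stands in for the wait-queue spinlock, all names are hypothetical, and the timeout exists purely so the demo terminates (a kernel spinlock has no timeout and would spin indefinitely).

```python
import threading

# Hypothetical stand-in for the wait-queue spinlock.
wait_queue_lock = threading.Lock()


def crashing_waker() -> None:
    """Take the lock, then end without releasing it (simulated fault)."""
    wait_queue_lock.acquire()
    # Simulated Oops: the thread ends here, so the matching release()
    # that normal code would run is never reached.
    return


def later_waker(timeout_s: float = 0.2) -> bool:
    """Try to take the same lock; return False if it never becomes free."""
    got = wait_queue_lock.acquire(timeout=timeout_s)
    if got:
        wait_queue_lock.release()
    return got


def run_demo() -> bool:
    """Run the crashing waker to completion, then attempt a second wake-up."""
    t = threading.Thread(target=crashing_waker)
    t.start()
    t.join()
    return later_waker()


print("second waker acquired the lock:", run_demo())
```

The second waker never gets the lock because its previous owner is gone and nothing else will release it, which is the userspace equivalent of the `native_queued_spin_lock_slowpath` spin seen in the stall reports.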

The CachyOS kernel carries BORE/EEVDF/sched-ext patches in addition to clang LTO; the
fault is in `select_task_rq_fair`, which is fair-class code, so this could be:

- a generic upstream bug also present in mainline,
- specific to a CachyOS scheduler patch,
- or an interaction with the LLVM-built kernel layout (the `.llvm.<hash>` symbol
  suffix is what LLVM's ThinLTO appends when it promotes a file-local symbol).

I have no way to disambiguate without testing on a `linux-cachyos-lts` (6.18.22)
or a vanilla mainline kernel. I'm staying on 7.0.1 for now, so reproduction
data may follow if it recurs.

## Asks

1. Has anyone else hit `select_task_rq_fair` NULL derefs on 7.0.1?
2. Does CachyOS apply patches to `select_task_rq_fair` / EEVDF / BORE that
   could put this fault offset (`+0x2cdd`) in a known region?
3. The "exited with irqs disabled / preempt_count 3" → RCU stall pattern is
   the real user-visible bug. Even if the primary deref is rare, a recovery
   path that turns a single-task crash into a system-wide freeze is a separate
   robustness issue worth flagging.

---

*Reporter:* @Evil-Overlord-666 (will update if/when the issue is opened)


