@hgarrereyn

This PR adds support for FrameShift (https://arxiv.org/pdf/2507.05421)

It can be enabled at runtime by setting the env var AFL_FRAMESHIFT_ENABLED=1.

Overview

FrameShift analysis runs once per newly discovered corpus entry to try to identify relation fields (i.e. size and offset fields) in the input format. If any are discovered, subsequent resizing mutations (inserts and deletes) are tracked during the splicing and havoc phases so that the corresponding relation fields can be fixed up.
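To make the idea of fixing up a relation field concrete, here is a minimal sketch of a single size field being adjusted after an insertion. It is an illustration only and not the code in this PR: `size_relation_t`, `insert_with_fixup`, the little-endian encoding, and the restriction that insertions happen after the field are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of one learned relation: a little-endian size
   field of `width` bytes at `field_off` that counts the bytes in the region
   starting at `target_off`. */
typedef struct {
  size_t field_off;  /* where the size field lives */
  size_t width;      /* field width in bytes (1, 2, or 4) */
  size_t target_off; /* start of the region the field measures */
} size_relation_t;

static uint32_t read_le(const uint8_t *buf, size_t width) {
  uint32_t v = 0;
  for (size_t i = 0; i < width; i++) v |= (uint32_t)buf[i] << (8 * i);
  return v;
}

static void write_le(uint8_t *buf, size_t width, uint32_t v) {
  for (size_t i = 0; i < width; i++) buf[i] = (uint8_t)(v >> (8 * i));
}

/* Insert `n` bytes at `pos`. If the insertion point lies inside the measured
   region (or exactly at its end), grow the size field by `n`.
   Simplification: only insertions after the size field are supported, so the
   field itself never moves. Returns the new length, or 0 on rejection. */
static size_t insert_with_fixup(uint8_t *buf, size_t len, size_t cap,
                                const size_relation_t *rel, size_t pos,
                                const uint8_t *data, size_t n) {
  if (len > cap || pos > len || n > cap - len) return 0;
  if (pos < rel->field_off + rel->width) return 0;

  uint32_t old = read_le(buf + rel->field_off, rel->width);

  /* The resizing mutation itself. */
  memmove(buf + pos + n, buf + pos, len - pos);
  memcpy(buf + pos, data, n);

  /* The fixup: re-serialize the size field if its region grew. */
  if (pos >= rel->target_off && pos <= rel->target_off + old) {
    write_le(buf + rel->field_off, rel->width, old + (uint32_t)n);
  }
  return len + n;
}
```

For a 1-byte length prefix followed by a 3-byte payload, inserting 2 bytes into the payload leaves the prefix at 5 instead of a stale 3, so a parser that checks the length would still accept the mutated input.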

The maximum analysis time per new input is controlled by the compile-time definition FRAMESHIFT_TIME_BUDGET_MS (2 seconds by default).

Evaluation

We evaluated AFL++ (v2.1c) with and without FrameShift on 16 targets and found that enabling FrameShift yielded statistically significant coverage increases of up to 60% (10 x 48-hour runs, p < 0.01) on a variety of targets, such as lcms, bloaty, ms-tpm-20-ref, woff2, libpcap, and openssl (see our paper for the full table). We also found that enabling FrameShift decreased time-to-trigger for bugs in the Magma benchmark by about an hour on average over a 24-hour run (unpublished). Our paper evaluation ran AFL++ in the FuzzBench configuration (with cmplog and dict2file).

On text-based formats, we don't expect FrameShift to find anything useful, and the extra analysis time will likely slow down fuzzing. On the libxml2 target, for example, enabling FrameShift decreased coverage by about 12% when starting from a seeded corpus.

Practical Usage Suggestions

Enabling FrameShift is likely most useful when your target expects binary-serialized input that contains size or offset fields, especially nested ones (which is the case for most binary formats).

Since FrameShift analysis runs once per corpus entry, total analysis time will be largest either when starting with a large seed corpus or for targets with a lot of coverage that grows quickly (e.g. harfbuzz/freetype2). Keep in mind that initial startup may be slower with FrameShift enabled, since it spends time running analysis instead of fuzzing directly. We expect that in very long fuzz runs (e.g. a week or longer), the analysis time will amortize out as coverage growth plateaus.
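As a rough way to reason about that startup cost, the per-input budget gives an upper bound on total analysis time for a given corpus size. The sketch below assumes the default budget of 2 seconds expressed as 2000 ms; `worst_case_analysis_ms` is a hypothetical helper for this estimate and not part of the PR.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed default: the description states 2 seconds per new input. */
#ifndef FRAMESHIFT_TIME_BUDGET_MS
#define FRAMESHIFT_TIME_BUDGET_MS 2000
#endif

/* Upper bound on total analysis time if every corpus entry uses its full
   per-input budget. Real runs can be well below this bound. */
static uint64_t worst_case_analysis_ms(uint64_t n_entries) {
  return n_entries * (uint64_t)FRAMESHIFT_TIME_BUDGET_MS;
}
```

For example, 5000 seed files give a bound of 10,000,000 ms, or about 2.8 hours, before the analysis backlog is cleared. The observed average can be much lower than the budget: the sample log line in this description reports avg=504 ms over 30 tested inputs, which is roughly 15 seconds in total.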

Code Changes

The core algorithm and implementation are in src/afl-fuzz-frameshift.c.

The main changes are:

  • A FrameShift mode which is enabled with AFL_FRAMESHIFT_ENABLED=1
  • The queue_entry structure is extended with a new fs_meta_t *fs_meta field pointing to the learned relation(s) for the input, and a u8 fs_status indicator.
  • When a new input is processed for the first time inside fuzz_one_original, it will run the frameshift_stage on the input to try to learn relations.
  • Havoc and splice mutations are tracked, and before the input is run, the new relation values (if any) are re-serialized into it.
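The once-per-entry flow described in the list above can be sketched roughly as follows. Only the field names fs_meta and fs_status come from this description; the contents of `fs_meta_t`, the status values, `queue_entry_sketch`, `frameshift_stage_stub`, and `maybe_analyze` are hypothetical stand-ins and do not reflect the actual code in src/afl-fuzz-frameshift.c.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint8_t  u8;
typedef uint32_t u32;

/* Hypothetical stand-in for the learned relations; the real fs_meta_t
   layout is not shown in this description. */
typedef struct fs_meta {
  u32 n_relations;
} fs_meta_t;

/* Hypothetical status values for fs_status. */
enum { FS_PENDING = 0, FS_DONE = 1 };

/* Reduced stand-in for AFL++'s queue entry, showing only the two new
   fields plus the data they describe. */
struct queue_entry_sketch {
  u8        *testcase_buf;
  u32        len;
  fs_meta_t *fs_meta;   /* learned relation(s), or NULL if none found */
  u8         fs_status; /* has this entry been analyzed yet? */
};

/* Placeholder for the analysis stage: pretends to find one relation in any
   input of at least 4 bytes. The real stage runs the target dynamically. */
static fs_meta_t *frameshift_stage_stub(const struct queue_entry_sketch *q) {
  if (q->len < 4) return NULL;
  fs_meta_t *m = calloc(1, sizeof(*m));
  if (m) m->n_relations = 1;
  return m;
}

/* Run analysis the first time an entry is processed, then never again.
   Returns 1 if analysis ran on this call, 0 if it was skipped. */
static int maybe_analyze(struct queue_entry_sketch *q, int frameshift_enabled) {
  if (!frameshift_enabled || q->fs_status != FS_PENDING) return 0;
  q->fs_meta = frameshift_stage_stub(q);
  q->fs_status = FS_DONE;
  return 1;
}
```

Keeping a separate status byte next to the pointer lets an entry with no relations (fs_meta is NULL) be distinguished from one that has not been analyzed yet, which is one plausible reason for having both fields.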

If you are running with AFL_NO_UI=1, the logs will contain statistics about frameshift at the end of each line:

[*] Fuzzing test case #122 <snip> FS (t=0:00:00:15, st=63950, avg=504 ms, found=20/30)...

Here:

  • t: total analysis runtime of the FrameShift stage
  • st (search tests): number of times FrameShift ran a test input dynamically
  • avg: average analysis time per input (upper bounded by FRAMESHIFT_TIME_BUDGET_MS)
  • found: number of inputs in which FrameShift found at least one relation, over the number of tested inputs

If the found ratio is very low, this likely means you are either starting from very unstructured data (e.g. an empty corpus) or fuzzing a format without size/offset fields.
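If you want to monitor the found ratio from a script or wrapper, the statistics can be pulled out of a log line with sscanf. This is a minimal sketch based only on the sample line above; `fs_stats_t`, `parse_fs_stats`, and `found_ratio` are hypothetical helpers, and reading the t field as days:hours:minutes:seconds is an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the FS statistics in one log line. */
typedef struct {
  unsigned      days, hours, mins, secs; /* t (assumed d:hh:mm:ss) */
  unsigned long search_tests;            /* st */
  unsigned      avg_ms;                  /* avg */
  unsigned      found, tested;           /* found=<found>/<tested> */
} fs_stats_t;

/* Returns 1 if the line contains a complete "FS (...)" block, else 0. */
static int parse_fs_stats(const char *line, fs_stats_t *out) {
  const char *p = strstr(line, "FS (t=");
  if (!p) return 0;
  int n = sscanf(p, "FS (t=%u:%u:%u:%u, st=%lu, avg=%u ms, found=%u/%u)",
                 &out->days, &out->hours, &out->mins, &out->secs,
                 &out->search_tests, &out->avg_ms, &out->found, &out->tested);
  return n == 8;
}

/* Fraction of tested inputs with at least one relation (0 if none tested). */
static double found_ratio(const fs_stats_t *s) {
  return s->tested ? (double)s->found / (double)s->tested : 0.0;
}
```

On the sample line, this yields found=20 and tested=30, a ratio of about 0.67.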

@vanhauser-thc
Member

I will need to do a fuzzbench run on this (dunno where, asking around currently).
what would be the most beneficial setup for frameshift? empty seeds or saturated corpus? with or without dictionary?

include/envs.h Outdated
"AFL_GCC_ONLY_FRSV", "AFL_SAN_RECOVER",
"AFL_PRELOAD_DISCRIMINATE_FORKSERVER_PARENT", "AFL_FORKSRV_UID",
"AFL_FORKSRV_GID", "AFL_COMPILER_LAUNCHER", NULL};
"AFL_FORKSRV_GID", "AFL_COMPILER_LAUNCHER", "AFL_FRAMESHIFT_ENABLED", NULL};
Member

Why not just AFL_FRAMESHIFT?

Member

Also, I think we should add info about the env var to the docs.

@hgarrereyn
Author

Most beneficial setup is likely empty corpus, running in fuzzbench configuration (cmplog + dictionaries).

We saw coverage improvements both from empty corpus and from a single seed corpus. The main adversarial setup would be if there are e.g. thousands of seed files which would incur a big initial analysis overhead.

I have access to a 128-core server, would be happy to run a full FuzzBench eval if you'd like.

@vanhauser-thc
Member

> Most beneficial setup is likely empty corpus, running in fuzzbench configuration (cmplog + dictionaries).
>
> We saw coverage improvements both from empty corpus and from a single seed corpus. The main adversarial setup would be if there are e.g. thousands of seed files which would incur a big initial analysis overhead.
>
> I have access to a 128-core server, would be happy to run a full FuzzBench eval if you'd like.

if you can, then please do!

this would be the setup:

experiment.tar.gz
