fuzz: Add structured graph fuzz harness and fix folding overflows it found by weltling · Pull Request #149 · dstogov/ir

weltling · 2026-06-15T22:08:29Z

This adds a second libFuzzer harness that fuzzes the IR builder and optimization pipeline directly, plus two folding fixes the harness found.

Harness

The existing harness fuzz_ir.c feeds raw bytes to the text IR parser, so most mutated inputs are rejected before reaching the optimizer. The new harness fuzz_graph.c decodes each input blob into a structurally valid IR function instead, so inputs reach the optimizer and the code emitter. This is the core of the harness. Later features like loops and conditions will be built on top of it.

Decoding maps the leading bytes to a working type and parameter count, then turns each following group of bytes into one type-preserving operation over earlier nodes, ending with a return of the last value. The build targets fuzz-graph-O0, fuzz-graph-O1 and fuzz-graph-O2 select the optimization level. The existing fuzz_ir.c is renamed to fuzz_text.c so the pair reads as text versus graph.
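
As a rough illustration of the decoding scheme described above, here is a minimal self-contained sketch. It does not use the real IR builder API. The toy_node model, the opcode set, the one-byte header layout and the three-byte group size are all assumptions made for this example and are not taken from fuzz_graph.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy node model; the real harness creates nodes through the IR builder. */
enum toy_op { TOY_PARAM, TOY_ADD, TOY_SUB, TOY_MUL, TOY_RETURN };

struct toy_node {
    enum toy_op op;
    uint8_t     type;     /* working type index, shared by every node */
    uint16_t    lhs, rhs; /* indices of earlier nodes used as inputs */
};

#define TOY_MAX_NODES 64

/* Decode `data` into `out`. Returns the number of nodes written, or 0 for empty input. */
size_t toy_decode(const uint8_t *data, size_t size, struct toy_node out[TOY_MAX_NODES])
{
    if (size == 0) {
        return 0;
    }

    /* Assumed header layout: low 2 bits pick the type, next 2 bits pick 1..4 parameters. */
    uint8_t type   = data[0] & 0x3;
    size_t  params = 1 + ((data[0] >> 2) & 0x3);
    size_t  n      = 0;

    for (size_t i = 0; i < params; i++) {
        out[n++] = (struct toy_node){ TOY_PARAM, type, 0, 0 };
    }

    /* Each 3-byte group is (opcode, lhs, rhs). Operands are reduced modulo the
     * current node count, so they always refer to an existing earlier node.
     * One slot is kept free for the final return node. */
    for (size_t pos = 1; pos + 3 <= size && n < TOY_MAX_NODES - 1; pos += 3) {
        enum toy_op op = (enum toy_op)(TOY_ADD + data[pos] % 3);
        out[n] = (struct toy_node){ op, type,
                                    (uint16_t)(data[pos + 1] % n),
                                    (uint16_t)(data[pos + 2] % n) };
        n++;
    }

    /* The last value produced becomes the return value. */
    out[n] = (struct toy_node){ TOY_RETURN, type, (uint16_t)(n - 1), 0 };
    return n + 1;
}
```

Because every operand index is taken modulo the number of nodes built so far, any byte string yields a well-formed graph, which is the property that lets mutated inputs get past validation and into the optimizer.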

Fixes found

MUL(C_U16, C_U16). Both uint16_t operands promote to int, so the product overflowed signed int. Multiply through uint32_t.
MUL/DIV(NEG, C_I). The constant was negated on a signed int64_t, which is undefined for INT64_MIN. Negate through uint64_t. The release result is unchanged since negating INT64_MIN wraps to itself.

Both were caught by the UBSan build of the harness, and the folded output is identical on a release build.

dstogov

The ir_fold.h fixes look right. Thanks!

As for the fuzzer, it would be great to separate it completely, moving everything related to it from Makefile to fuzz/Makefile. Is this difficult to do?

If the fuzzer ever executes generated code, it should do so inside a container. This is not necessary right now.

weltling · 2026-06-18T08:59:48Z

Thanks for the review.

Yep, I'm going to restructure all the fuzz code under the subfolder then, which will make it cleaner and not interfere with the production code.

Thanks

Keep the fuzzing tooling self-contained so the production build never pulls in clang or the sanitizer runtimes. The new fuzz/Makefile detects the target, generates ir_fold_hash.h and the dynasm encoder header on its own, and builds every harness under fuzz/build. Run the fuzz targets from the fuzz directory now.

Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>

The harness that drives the text parser was named fuzz_ir.c, which did not say what set it apart since every harness here fuzzes the IR. Name the source and its build targets after the input format. The source becomes fuzz_text.c and the targets become fuzz-text-load and fuzz-text-O0 through fuzz-text-O2, so the text harness reads apart from the graph harness added next.

Keep a single seed corpus for the text targets instead of a directory per optimization level. make_corpus.sh now writes the seeds to corpus/text, which every text target shares, and corpus-clean only removes those regenerable seeds. When no positional path is given on the command line, the harness falls back to corpus/text and creates it on demand; an explicit path, such as a single crash file when reproducing, is left untouched.

Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
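
The corpus fallback described in this commit could look roughly like the sketch below. It is not the code from fuzz_text.c: the function name pick_corpus and the rule "the first argument that does not start with '-' is the positional path" are assumptions for illustration (libFuzzer options take the form -name=value). Creating the directory on demand is left out.

```c
#include <stddef.h>

/* Shared seed corpus used when the command line names no path. */
static const char default_corpus[] = "corpus/text";

/* Hypothetical helper: return the first positional argument, or the shared
 * default corpus when only options (arguments starting with '-') are given.
 * An explicit path is returned as-is and never modified. */
const char *pick_corpus(int argc, char **argv)
{
    for (int i = 1; i < argc; i++) {
        if (argv[i][0] != '-') {
            return argv[i];
        }
    }
    return default_corpus;
}
```

Returning the explicit path unchanged keeps a crash reproduction run, where the path is a single file, from being treated as a corpus directory.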

This harness decodes the input blob directly into a structurally valid IR function and runs the full optimization and codegen pipeline on it, so inputs exercise the optimizer and backend. The input bytes are turned into a valid IR function step by step. The first byte picks the type and parameter count, then each small group of bytes adds one operation that uses earlier nodes as inputs. The last node becomes the return value. The optimization level is selected at compile time via FUZZ_GRAPH_O0, FUZZ_GRAPH_O1 or FUZZ_GRAPH_O2.

Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
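
Compile-time selection of the optimization level might be wired up along these lines. Only the macro names FUZZ_GRAPH_O0, FUZZ_GRAPH_O1 and FUZZ_GRAPH_O2 and the libFuzzer entry point LLVMFuzzerTestOneInput are real; FUZZ_OPT_LEVEL, toy_run_pipeline, last_level and the fallback to level 0 when no macro is defined are assumptions made for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Map the build-time macro to a numeric level. Falling back to 0 when no
 * macro is defined is an assumption of this sketch. */
#if defined(FUZZ_GRAPH_O2)
# define FUZZ_OPT_LEVEL 2
#elif defined(FUZZ_GRAPH_O1)
# define FUZZ_OPT_LEVEL 1
#else
# define FUZZ_OPT_LEVEL 0 /* FUZZ_GRAPH_O0 or nothing defined */
#endif

/* Records the level of the most recent run so the sketch can be checked. */
static int last_level = -1;

/* Hypothetical stand-in for "decode, build the function, optimize, emit code". */
static void toy_run_pipeline(const uint8_t *data, size_t size, int opt_level)
{
    (void)data;
    (void)size;
    last_level = opt_level;
}

/* libFuzzer calls this once per generated input. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    toy_run_pipeline(data, size, FUZZ_OPT_LEVEL);
    return 0;
}
```

Building the same source three times with a different -D flag gives one binary per level, so a crash found by one target reproduces at that level without any runtime option.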

Point every command at the fuzz directory now that the build lives in fuzz/Makefile, and add the fuzz-graph targets to the quick start, the target table and the crash reproduction notes.

Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>

The two u16 operands are promoted to signed int before the multiply, so a wide product such as 54300 * 54300 overflows int. The result is truncated back to u16 and is correct on any two's-complement target, but it is still undefined behavior. Cast both operands to uint32_t so the multiply is unsigned.

Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
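
The arithmetic behind this fix, as a standalone sketch. fold_mul_u16 is a hypothetical helper written for illustration, not the actual rule in ir_fold.h.

```c
#include <stdint.h>

/* Multiply two u16 constants without signed overflow.
 *
 * Written as `a * b`, both operands would be promoted to signed int first.
 * With a 32-bit int, 54300 * 54300 = 2948490000 exceeds INT_MAX (2147483647),
 * which is undefined behavior. Casting to uint32_t makes the multiply
 * unsigned, and the largest possible product, 65535 * 65535, still fits. */
uint16_t fold_mul_u16(uint16_t a, uint16_t b)
{
    uint32_t wide = (uint32_t)a * (uint32_t)b;
    return (uint16_t)wide; /* keep the low 16 bits, as the fold always did */
}
```

For the example from the commit message, the low 16 bits of 2948490000 are 25360, so fold_mul_u16(54300, 54300) returns 25360.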

The rules MUL(NEG(x), c) and DIV(NEG(x), c) negate the constant on a signed int64_t, which is undefined behavior for INT64_MIN. Negate through uint64_t instead. The result is unchanged on a release build since the negation of INT64_MIN wraps to itself.

Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
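
A standalone sketch of negating through uint64_t. fold_neg_i64 is a hypothetical helper for illustration, not the code in ir_fold.h.

```c
#include <stdint.h>

/* Negate a signed 64-bit constant without signed overflow.
 *
 * `-c` is undefined behavior when c == INT64_MIN, because +2^63 is not
 * representable in int64_t. Unsigned subtraction wraps modulo 2^64 instead.
 * Converting the result back to int64_t is implementation-defined before C23
 * when the value is out of range; GCC and Clang wrap it as two's complement. */
int64_t fold_neg_i64(int64_t c)
{
    return (int64_t)(0 - (uint64_t)c);
}
```

With this helper, INT64_MIN maps to itself and every other value maps to its ordinary negation, matching what the release build already produced.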

weltling · 2026-06-21T09:51:35Z

Implemented the suggested change. Everything fuzz-related is now encapsulated in the subfolder and builds on its own. All the other details from the PR description are still valid, too. A simple command sequence to check:

make -C fuzz
make -C fuzz corpus
./fuzz/build/fuzz-graph-O2

With this I think it's ready for a review now.

Thanks

dstogov reviewed Jun 17, 2026

weltling added 6 commits June 20, 2026 14:28

weltling force-pushed the fuzz-graph-prototype branch from d066d04 to 3fce4d5 June 21, 2026 09:44

weltling marked this pull request as ready for review June 21, 2026 09:51

weltling wants to merge 6 commits into dstogov:master from weltling:fuzz-graph-prototype
