mongo/docs/libfuzzer.md
Steve McClure 32e8f260de SERVER-124136 Format markdown via prettier: wrap lines and use width of 100 (#52231)
GitOrigin-RevId: 3305c1e2ee3a6a2c3a5b2b7883b0f491a59ed646
2026-04-21 19:20:11 +00:00


---
title: LibFuzzer
---
> **!!NOTE!!**: LibFuzzer is deprecated and should not be used for new fuzz tests. See
> [FuzzTest](fuzztest.md) for new fuzzing implementations.

LibFuzzer is a tool for performing coverage-guided fuzzing of C/C++ code. LibFuzzer will try to
trigger AUBSAN (AddressSanitizer and UndefinedBehaviorSanitizer) failures in a function you provide,
by repeatedly calling it with a carefully crafted byte array as input. Each input is assigned a
"score". Byte arrays which exercise new or more regions of code score better. LibFuzzer merges and
mutates high-scoring inputs in order to gradually cover more and more possible behavior.
# When to use LibFuzzer
> **!!NOTE!!**: LibFuzzer is deprecated and should not be used for new fuzz tests. See
> [FuzzTest](fuzztest.md) for new fuzzing implementations.

LibFuzzer is great for testing functions which accept an opaque blob of untrusted user-provided
data.
# How to use LibFuzzer
LibFuzzer implements `int main`, and expects to be linked with an object file which provides the
function under test. You will achieve this by writing a cpp file which implements the following
function:
```cpp
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    // Your code here
    return 0;  // Values other than 0 and -1 are reserved by LibFuzzer.
}
```
`LLVMFuzzerTestOneInput` will be called repeatedly, with fuzzer-generated bytes in `Data`. `Size`
will always truthfully tell your implementation how many bytes are in `Data`. If your function
crashes or induces an AUBSAN fault, LibFuzzer will consider that to be a finding worth reporting.

Keep in mind that your function will often "just" be adapting `Data` to whatever format our internal
C++ functions require. However, you have a lot of freedom in exactly what you choose to do. Just
make sure your function crashes or trips an invariant when something interesting happens! As just
a few ideas:
- You might choose to call multiple implementations of a single operation, and validate that they
  produce the same output when presented with the same input.
- You could tease out individual bytes from `Data` and provide them as different arguments to the
  function under test.
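
As a minimal sketch of both ideas combined, the harness below peels the first byte off `Data` to use as a separate argument, then feeds the remaining bytes to two implementations of the same operation and aborts if they disagree. `checksumSimple` and `checksumUnrolled` are hypothetical stand-ins invented for this example, not functions from our codebase, and `abort()` stands in for whatever invariant check your real test would use.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

// Hypothetical reference implementation: sums the bytes one at a time.
uint32_t checksumSimple(const uint8_t* data, size_t size, uint8_t seed) {
    uint32_t sum = seed;
    for (size_t i = 0; i < size; ++i) {
        sum += data[i];
    }
    return sum;
}

// Hypothetical "optimized" implementation: sums four bytes per iteration.
uint32_t checksumUnrolled(const uint8_t* data, size_t size, uint8_t seed) {
    uint32_t sum = seed;
    size_t i = 0;
    for (; i + 4 <= size; i += 4) {
        sum += data[i] + data[i + 1] + data[i + 2] + data[i + 3];
    }
    for (; i < size; ++i) {
        sum += data[i];
    }
    return sum;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* Data, size_t Size) {
    if (Size < 1) {
        return 0;  // Not enough bytes to split off an argument; nothing to test.
    }

    // Tease out the first byte as a separate argument; the rest is the payload.
    const uint8_t seed = Data[0];
    const uint8_t* payload = Data + 1;
    const size_t payloadSize = Size - 1;

    // Differential check: both implementations must agree on every input.
    if (checksumSimple(payload, payloadSize, seed) !=
        checksumUnrolled(payload, payloadSize, seed)) {
        abort();  // A crash is what LibFuzzer reports as a finding.
    }
    return 0;
}
```

The early return on very short inputs keeps the harness from reading past the end of `Data`, which would otherwise be reported as a bug in the harness rather than in the code under test.
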
Finally, your cpp file will need a Bazel target. The `mongo_cc_fuzzer_test` rule defines fuzzer
targets, much like how we define unit tests. For example:
```python
mongo_cc_fuzzer_test(
    name = 'op_msg_fuzzer',
    srcs = [
        'op_msg_fuzzer.cpp',
    ],
    hdrs = [
        'op_msg_fuzzer.h',
    ],
    deps = [
        '//src/mongo:base',
        ':op_msg_fuzzer_fixture',
    ],
)
```
# Running LibFuzzer
Your test's object file and **all** of its dependencies must be compiled with the "fuzzer"
sanitizer, plus a set of sanitizers which might produce interesting runtime errors, like AUBSAN.
Evergreen has a build variant, whose name includes the string "FUZZER", which compiles and runs all
of the fuzzer tests.

The fuzzers can be built locally, for development and debugging. Check our Evergreen configuration
for the current Bazel arguments.

LibFuzzer binaries will accept a path to a directory containing its "corpus". A corpus is a list of
examples known to produce interesting outputs. LibFuzzer will start producing interesting results
more quickly if it starts off with a set of inputs which it can begin mutating. When it's done, it
will write any new inputs it discovered into its corpus. Re-using a corpus across executions is a
good way to make LibFuzzer return more results in less time. Our Evergreen tasks will try to acquire
and re-use a corpus from an earlier commit, if they can.
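
To make the idea of a corpus concrete: on disk it is a directory of files, each holding the raw bytes of one input. The sketch below is a hypothetical helper, not part of LibFuzzer or our build, which replays every file in such a directory through a fuzz target once. Something like it can be handy when stepping through a saved input in a debugger. The name `replayCorpus` and the `FuzzTarget` alias are assumptions made for this example, and it requires C++17 for `std::filesystem`.

```cpp
#include <stddef.h>
#include <stdint.h>

#include <filesystem>
#include <fstream>
#include <iterator>
#include <vector>

// Same signature as LLVMFuzzerTestOneInput, passed in as a plain function pointer.
using FuzzTarget = int (*)(const uint8_t*, size_t);

// Feeds each regular file in `corpusDir` to `target` once.
// Returns the number of files replayed.
size_t replayCorpus(const std::filesystem::path& corpusDir, FuzzTarget target) {
    size_t replayed = 0;
    for (const auto& entry : std::filesystem::directory_iterator(corpusDir)) {
        if (!entry.is_regular_file()) {
            continue;  // Skip subdirectories and other non-file entries.
        }

        // Read the whole file as raw bytes.
        std::ifstream in(entry.path(), std::ios::binary);
        std::vector<char> bytes((std::istreambuf_iterator<char>(in)),
                                std::istreambuf_iterator<char>());

        target(reinterpret_cast<const uint8_t*>(bytes.data()), bytes.size());
        ++replayed;
    }
    return replayed;
}
```

A caller would pass its own `LLVMFuzzerTestOneInput` as `target`, for example `replayCorpus("corpus_dir", LLVMFuzzerTestOneInput)`, where `corpus_dir` is a placeholder path.
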
# References
- [LibFuzzer's official documentation](https://llvm.org/docs/LibFuzzer.html)