---
title: LibFuzzer
---

> **!!NOTE!!**: LibFuzzer is deprecated and should not be used for new fuzz tests. See
> [FuzzTest](fuzztest.md) for new fuzzing implementations.

LibFuzzer is a tool for performing coverage-guided fuzzing of C/C++ code. LibFuzzer will try to
trigger AUBSAN (AddressSanitizer and UndefinedBehaviorSanitizer) failures in a function you provide
by repeatedly calling it with a carefully crafted byte array as input. Each input is assigned a
"score". Byte arrays which exercise new or more regions of code score better. LibFuzzer merges and
mutates high-scoring inputs in order to gradually cover more and more possible behavior.

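The scoring loop described above can be illustrated with a toy sketch. This is not how LibFuzzer is implemented: the real tool relies on compiler instrumentation and many mutation strategies. Here `runTarget`, `toyFuzz`, and the hand-written coverage bits are all hypothetical names made up for illustration, and the "score" is reduced to "did this input reach a branch nobody reached before".

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

using Input = std::vector<uint8_t>;
using Coverage = std::bitset<4>;

// Hypothetical function under test, instrumented by hand: every branch
// that is reached sets one bit in the returned coverage map.
Coverage runTarget(const Input& in) {
    Coverage cov;
    cov.set(0);
    if (in.size() > 0 && in[0] == 'F') {
        cov.set(1);
        if (in.size() > 1 && in[1] == 'U') {
            cov.set(2);
            if (in.size() > 2 && in[2] == 'Z') {
                cov.set(3);
            }
        }
    }
    return cov;
}

// Toy coverage-guided loop: pick a corpus entry, mutate one byte, and keep
// the mutant only if it reaches a branch that no earlier input reached.
std::vector<Input> toyFuzz(size_t iterations, uint32_t seed) {
    std::mt19937 rng(seed);
    std::vector<Input> corpus{Input{0, 0, 0}};
    Coverage seen = runTarget(corpus.front());
    for (size_t i = 0; i < iterations; ++i) {
        Input candidate = corpus[rng() % corpus.size()];
        candidate[rng() % candidate.size()] = static_cast<uint8_t>(rng());
        const Coverage cov = runTarget(candidate);
        if ((cov & ~seen).any()) {
            seen |= cov;
            corpus.push_back(candidate);
        }
    }
    return corpus;
}
```

Because a mutant is kept only when it adds coverage, the corpus in this sketch can never grow beyond one entry per reachable branch, which is the same pressure that keeps a real corpus focused on inputs that matter.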
# When to use LibFuzzer

> **!!NOTE!!**: LibFuzzer is deprecated and should not be used for new fuzz tests. See
> [FuzzTest](fuzztest.md) for new fuzzing implementations.

LibFuzzer is great for testing functions which accept an opaque blob of untrusted user-provided data.

# How to use LibFuzzer

LibFuzzer implements `int main`, and expects to be linked with an object file which provides the
function under test. You achieve this by writing a cpp file which implements:

```cpp
#include <cstddef>
#include <cstdint>
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    // Your code here
    return 0;  // 0 tells LibFuzzer the input was processed normally
}
```

`LLVMFuzzerTestOneInput` will be called repeatedly, with fuzzer-generated bytes in `Data`. `Size`
always tells your implementation exactly how many bytes are in `Data`. If your function crashes or
induces an AUBSAN fault, LibFuzzer will consider that to be a finding worth reporting.

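As a minimal sketch of such a function, the example below wraps the raw bytes in a `std::string_view` and hands them to a function under test. `countPairs` is a hypothetical stand-in invented for this example, not something from our codebase, and the sanity check at the end is only one possible way to turn a wrong answer into a crash.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <string_view>

// Hypothetical function under test: counts the ';'-separated fields
// that contain at least one '='.
size_t countPairs(std::string_view text) {
    size_t pairs = 0;
    while (!text.empty()) {
        const size_t end = text.find(';');
        const std::string_view field = text.substr(0, end);
        if (field.find('=') != std::string_view::npos) {
            ++pairs;
        }
        if (end == std::string_view::npos) {
            break;
        }
        text.remove_prefix(end + 1);
    }
    return pairs;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* Data, size_t Size) {
    // Data is not NUL-terminated, so Size must always travel with it.
    const std::string_view text(reinterpret_cast<const char*>(Data), Size);
    const size_t pairs = countPairs(text);
    // Every pair needs at least one byte, so a larger count is a bug.
    if (pairs > Size) {
        std::abort();
    }
    return 0;
}
```

Memory errors inside `countPairs` would be reported by the sanitizers on their own; the explicit `std::abort()` is there for logic errors that would otherwise pass silently.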
Keep in mind that your function will often "just" be adapting `Data` to whatever format our internal
C++ functions require. However, you have a lot of freedom in exactly what you choose to do. Just
make sure your function crashes or fails an invariant when something interesting happens! A few
ideas:

- You might choose to call multiple implementations of a single operation, and validate that they
  produce the same output when presented with the same input.
- You could tease out individual bytes from `Data` and provide them as different arguments to the
  function under test.

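Both ideas can be combined in one fuzz target. The sketch below is only an illustration: `saturatingAddReference` and `saturatingAddFast` are hypothetical functions written for this example. It takes the first two bytes of `Data` as separate arguments and aborts if the two implementations disagree.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical reference implementation: add two bytes, clamping at 255.
uint8_t saturatingAddReference(uint8_t a, uint8_t b) {
    const unsigned sum = static_cast<unsigned>(a) + b;
    return static_cast<uint8_t>(std::min(sum, 255u));
}

// Hypothetical optimized implementation of the same operation.
uint8_t saturatingAddFast(uint8_t a, uint8_t b) {
    const uint8_t sum = static_cast<uint8_t>(a + b);  // wraps modulo 256
    return sum < a ? uint8_t{255} : sum;              // a wrap means overflow
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* Data, size_t Size) {
    // Two bytes are needed to build both arguments; shorter inputs are skipped.
    if (Size < 2) {
        return 0;
    }
    const uint8_t a = Data[0];
    const uint8_t b = Data[1];
    // Differential check: any disagreement becomes a crash LibFuzzer can report.
    if (saturatingAddReference(a, b) != saturatingAddFast(a, b)) {
        std::abort();
    }
    return 0;
}
```

Returning early on inputs that are too short is a common pattern: the fuzzer quickly learns that longer inputs reach more code.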
Finally, your cpp file will need a Bazel target. There is a function which defines fuzzer targets,
much like how we define unit tests. For example:

```python
mongo_cc_fuzzer_test(
    name = 'op_msg_fuzzer',
    srcs = [
        'op_msg_fuzzer.cpp',
    ],
    hdrs = [
        'op_msg_fuzzer.h',
    ],
    deps = [
        '//src/mongo:base',
        ':op_msg_fuzzer_fixture',
    ],
)
```

# Running LibFuzzer

Your test's object file and **all** of its dependencies must be compiled with the "fuzzer"
sanitizer, plus a set of sanitizers which might produce interesting runtime errors, like AUBSAN.
Evergreen has a build variant, whose name includes the string "FUZZER", which compiles and runs
all of the fuzzer tests.

The fuzzers can be built locally, for development and debugging. Check our Evergreen configuration
for the current Bazel arguments.

LibFuzzer binaries accept a path to a directory containing their "corpus". A corpus is a collection
of example inputs known to produce interesting outputs. LibFuzzer will start producing interesting
results more quickly if it starts off with a set of inputs which it can begin mutating. When it's
done, it writes any new inputs it discovered into its corpus. Re-using a corpus across executions is
a good way to make LibFuzzer return more results in less time. Our Evergreen tasks will try to
acquire and re-use a corpus from an earlier commit, if they can.

# References

- [LibFuzzer's official documentation](https://llvm.org/docs/LibFuzzer.html)