========================================================
LibFuzzer -- a library for coverage-guided fuzz testing.
========================================================
.. contents::
   :local:
   :depth: 1

Introduction
============

libFuzzer is a library for in-process, evolutionary fuzzing of other libraries.

The typical workflow looks like the following.
First, implement a fuzzing target function, like this::

  // fuzz_target.cc
  #include <stddef.h>
  #include <stdint.h>
  extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    DoSomethingInterestingWithMyAPI(Data, Size);
    return 0;
  }

Next, build the Fuzzer library as a static archive. Note that libFuzzer contains the ``main()`` function::

  svn co http://llvm.org/svn/llvm-project/llvm/trunk/lib/Fuzzer
  clang++ -c -g -O2 -std=c++11 Fuzzer/*.cpp -IFuzzer
  ar ruv libFuzzer.a Fuzzer*.o

Then build the target function and the library you are going to test.
You should use SanitizerCoverage_ and one of ASan, MSan, or UBSan.
Link it with ``libFuzzer.a``::

  clang++ -fsanitize-coverage=edge -fsanitize=address your_lib.cc fuzz_target.cc libFuzzer.a -o my_fuzzer

Create a directory with the initial "seed" samples.
For some input types libFuzzer will work just fine without any seeds,
but for complex inputs this step is very important::

  mkdir CORPUS_DIR
  cp /some/input/samples/* CORPUS_DIR

Finally, run the fuzzer on the ``CORPUS_DIR``::

  ./my_fuzzer CORPUS_DIR  # -max_len=1000 -jobs=20 -more_flags=...


As new interesting test cases are discovered they will be added to the corpus.
If a bug is discovered by the sanitizer (ASan, etc.) it will be reported as usual and the reproducer
will be written to disk.
Each Fuzzer process is single-threaded (unless the library starts its own
threads). You can run libFuzzer on the same corpus in multiple processes
in parallel (use the flags ``-jobs=N`` and ``-workers=N``).

libFuzzer is similar in concept to AFL_,
but uses in-process fuzzing, which is more fragile and restrictive, but
potentially much faster as it has no overhead for process start-up.
It uses LLVM's SanitizerCoverage_ instrumentation to get in-process
coverage feedback.

The code resides in the LLVM repository, requires a fresh Clang compiler to build
and is used to fuzz various parts of LLVM,
but the Fuzzer itself does not (and should not) depend on any
part of LLVM and can be used for other projects without requiring the rest of LLVM.

Usage
=====
To run fuzzing, pass zero or more directories. New samples will be written into ``dir1``; the other directories will be read once during startup::

  ./fuzzer [-flag1=val1 [-flag2=val2 ...] ] [dir1 [dir2 ...] ]

To run individual tests without fuzzing, pass one or more files::

  ./fuzzer [-flag1=val1 [-flag2=val2 ...] ] file1 [file2 ...]
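
Conceptually, running an individual test means reading one file and handing its bytes to the target exactly once. The following is a minimal sketch of that idea, not libFuzzer's actual driver code: ``RunOneFile`` and the toy target with its byte counter are hypothetical names introduced only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Toy fuzz target so that the sketch is self-contained; a real target would
// call into the library under test instead of counting bytes.
static size_t g_bytes_seen = 0;
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  (void)Data;
  g_bytes_seen += Size;
  return 0;
}

// Hypothetical helper (not a libFuzzer API): feed one file to the target.
// Returns false if the file cannot be opened.
bool RunOneFile(const std::string &Path) {
  std::ifstream In(Path.c_str(), std::ios::binary);
  if (!In) return false;
  std::vector<uint8_t> Bytes((std::istreambuf_iterator<char>(In)),
                             std::istreambuf_iterator<char>());
  LLVMFuzzerTestOneInput(Bytes.data(), Bytes.size());
  return true;
}
```

A helper of this shape is also handy for replaying a reproducer under a debugger without linking the fuzzing engine.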

The most important flags are::

  seed                 0     Random seed. If 0, seed is generated.
  runs                 -1    Number of individual test runs (-1 for infinite runs).
  max_len              0     Maximum length of the test input. If 0, libFuzzer tries to guess a good value based on the corpus and reports it.
  timeout              1200  Timeout in seconds (if positive). If one unit runs more than this number of seconds the process will abort.
  timeout_exitcode     77    Unless abort_on_timeout is set, use this exitcode on timeout.
  max_total_time       0     If positive, indicates the maximal total time in seconds to run the fuzzer.
  help                 0     Print help.
  merge                0     If 1, the 2-nd, 3-rd, etc corpora will be merged into the 1-st corpus. Only interesting units will be taken.
  jobs                 0     Number of jobs to run. If jobs >= 1 we spawn this number of jobs in separate worker processes with stdout/stderr redirected to fuzz-JOB.log.
  workers              0     Number of simultaneous worker processes to run the jobs. If zero, "min(jobs,NumberOfCpuCores()/2)" is used.
  use_traces           0     Experimental: use instruction traces
  only_ascii           0     If 1, generate only ASCII (isprint+isspace) inputs.
  artifact_prefix      ""    Write fuzzing artifacts (crash, timeout, or slow inputs) as $(artifact_prefix)file
  exact_artifact_path  ""    Write the single artifact on failure (crash, timeout) as $(exact_artifact_path). This overrides -artifact_prefix and will not use checksum in the file name. Do not use the same path for several parallel processes.
  print_final_stats    0     If 1, print statistics at exit.

For the full list of flags run the fuzzer binary with ``-help=1``.

Usage examples
==============
.. contents::
   :local:
   :depth: 1

Toy example
-----------

A simple function that does something interesting if it receives the input "HI!"::

  cat << EOF > test_fuzzer.cc
  #include <stdint.h>
  #include <stddef.h>
  extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (size > 0 && data[0] == 'H')
      if (size > 1 && data[1] == 'I')
        if (size > 2 && data[2] == '!')
          __builtin_trap();
    return 0;
  }
  EOF
  # Build test_fuzzer.cc with asan and link against libFuzzer.a
  clang++ -fsanitize=address -fsanitize-coverage=edge test_fuzzer.cc libFuzzer.a
  # Run the fuzzer with no corpus.
  ./a.out

You should get an error pretty quickly::

  #0      READ   units: 1 exec/s: 0
  #1      INITED cov: 3 units: 1 exec/s: 0
  #2      NEW    cov: 5 units: 2 exec/s: 0 L: 64 MS: 0
  #19237  NEW    cov: 9 units: 3 exec/s: 0 L: 64 MS: 0
  #20595  NEW    cov: 10 units: 4 exec/s: 0 L: 1 MS: 4 ChangeASCIIInt-ShuffleBytes-ChangeByte-CrossOver-
  #34574  NEW    cov: 13 units: 5 exec/s: 0 L: 2 MS: 3 ShuffleBytes-CrossOver-ChangeBit-
  #34807  NEW    cov: 15 units: 6 exec/s: 0 L: 3 MS: 1 CrossOver-
  ==31511== ERROR: libFuzzer: deadly signal
  ...
  artifact_prefix='./'; Test unit written to ./crash-b13e8756b13a00cf168300179061fb4b91fefbed


PCRE2
-----

Here we show how to use libFuzzer on something real, yet simple: pcre2_::

  COV_FLAGS="-fsanitize-coverage=edge,indirect-calls,8bit-counters"
  # Get PCRE2
  svn co svn://vcs.exim.org/pcre2/code/trunk pcre
  # Build PCRE2 with AddressSanitizer and coverage.
  (cd pcre; ./autogen.sh; CC="clang -fsanitize=address $COV_FLAGS" ./configure --prefix=`pwd`/../inst && make -j && make install)
  # Build the fuzzing target function that does something interesting with PCRE2.
  cat << EOF > pcre_fuzzer.cc
  #include <string.h>
  #include <stdint.h>
  #include "pcre2posix.h"
  extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (size < 1) return 0;
    char *str = new char[size+1];
    memcpy(str, data, size);
    str[size] = 0;
    regex_t preg;
    if (0 == regcomp(&preg, str, 0)) {
      regexec(&preg, str, 0, 0, 0);
      regfree(&preg);
    }
    delete [] str;
    return 0;
  }
  EOF
  clang++ -g -fsanitize=address $COV_FLAGS -c -std=c++11 -I inst/include/ pcre_fuzzer.cc
  # Link.
  clang++ -g -fsanitize=address -Wl,--whole-archive inst/lib/*.a -Wl,-no-whole-archive libFuzzer.a pcre_fuzzer.o -o pcre_fuzzer

This will give you a binary of the fuzzer, called ``pcre_fuzzer``.
Now, create a directory that will hold the test corpus::

  mkdir -p CORPUS

For simple input languages like regular expressions this is all you need.
For more complicated inputs populate the directory with some input samples.
Now run the fuzzer with the corpus dir as the only parameter::

  ./pcre_fuzzer ./CORPUS

You will see output like this::

  Seed: 1876794929
  #0      READ   cov 0 bits 0 units 1 exec/s 0
  #1      pulse  cov 3 bits 0 units 1 exec/s 0
  #1      INITED cov 3 bits 0 units 1 exec/s 0
  #2      pulse  cov 208 bits 0 units 1 exec/s 0
  #2      NEW    cov 208 bits 0 units 2 exec/s 0 L: 64
  #3      NEW    cov 217 bits 0 units 3 exec/s 0 L: 63
  #4      pulse  cov 217 bits 0 units 3 exec/s 0

* The ``Seed:`` line shows you the current random seed (you can change it with the ``-seed=N`` flag).
* The ``READ`` line shows you how many input files were read (since you passed an empty dir there were no inputs, but one dummy input was synthesised).
* The ``INITED`` line shows you how many inputs will be fuzzed.
* The ``NEW`` lines appear when the fuzzer finds a new interesting input, which is saved to the CORPUS dir. If multiple corpus dirs are given, the first one is used.
* The ``pulse`` lines appear periodically to show the current status.

Now, interrupt the fuzzer and run it again the same way. You will see::

  Seed: 1879995378
  #0      READ   cov 0 bits 0 units 564 exec/s 0
  #1      pulse  cov 502 bits 0 units 564 exec/s 0
  ...
  #512    pulse  cov 2933 bits 0 units 564 exec/s 512
  #564    INITED cov 2991 bits 0 units 344 exec/s 564
  #1024   pulse  cov 2991 bits 0 units 344 exec/s 1024
  #1455   NEW    cov 2995 bits 0 units 345 exec/s 1455 L: 49

This time you were running the fuzzer with a non-empty input corpus (564 items).
As the first step, the fuzzer minimized the set to produce 344 interesting items (the ``INITED`` line).
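
The text does not say how the initial minimization chooses which items to keep. One plausible way to picture it is a greedy pass that keeps a unit only if it adds coverage not seen so far. The sketch below illustrates that idea under this assumption; ``MinimizeCorpus`` and the ``Coverage`` typedef are hypothetical names, and real coverage data is richer than a set of integers.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Each corpus unit is modelled only by the set of coverage points it reaches.
typedef std::set<int> Coverage;

// Returns the indices of the units worth keeping: a unit is kept only if it
// covers at least one point not covered by the units kept before it.
std::vector<size_t> MinimizeCorpus(const std::vector<Coverage> &Units) {
  Coverage Seen;
  std::vector<size_t> Kept;
  for (size_t I = 0; I < Units.size(); ++I) {
    size_t Before = Seen.size();
    Seen.insert(Units[I].begin(), Units[I].end());
    if (Seen.size() > Before) Kept.push_back(I);
  }
  return Kept;
}
```

With this greedy scheme the result depends on the order in which the units are visited, which is one reason the number of retained items can differ between runs.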

You may run ``N`` independent fuzzer jobs in parallel on ``M`` CPUs::

  N=100; M=4; ./pcre_fuzzer ./CORPUS -jobs=$N -workers=$M

By default (``-reload=1``) the fuzzer processes will periodically scan the CORPUS directory
and reload any new tests. This way the test inputs found by one process will be picked up
by all others.

If ``-workers=$M`` is not supplied, ``min($N,NumberOfCpuCores()/2)`` will be used.
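
The default worker count is simple arithmetic, shown here as a small sketch. ``DefaultWorkers`` is a hypothetical name, and the core count is passed in explicitly so the formula is easy to check; in a real program it could come from something like ``std::thread::hardware_concurrency()``.

```cpp
#include <algorithm>

// Mirrors the documented default: min(jobs, NumberOfCpuCores() / 2).
// Cores is a parameter here so the result does not depend on the machine.
unsigned DefaultWorkers(unsigned Jobs, unsigned Cores) {
  return std::min(Jobs, Cores / 2);
}
```

For example, 100 jobs on an 8-core machine would be served by 4 workers at a time, while 2 jobs on the same machine would use only 2.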

Heartbleed
----------
Remember Heartbleed_?
As it was recently `shown <https://blog.hboeck.de/archives/868-How-Heartbleed-couldve-been-found.html>`_,
fuzzing with AddressSanitizer can find Heartbleed. Indeed, here are the step-by-step instructions
to find Heartbleed with LibFuzzer::

  wget https://www.openssl.org/source/openssl-1.0.1f.tar.gz
  tar xf openssl-1.0.1f.tar.gz
  COV_FLAGS="-fsanitize-coverage=edge,indirect-calls" # -fsanitize-coverage=8bit-counters
  (cd openssl-1.0.1f/ && ./config &&
    make -j 32 CC="clang -g -fsanitize=address $COV_FLAGS")
  # Get and build LibFuzzer
  svn co http://llvm.org/svn/llvm-project/llvm/trunk/lib/Fuzzer
  clang -c -g -O2 -std=c++11 Fuzzer/*.cpp -IFuzzer
  # Get examples of key/pem files.
  git clone https://github.com/hannob/selftls
  cp selftls/server* . -v
  cat << EOF > handshake-fuzz.cc
  #include <openssl/ssl.h>
  #include <openssl/err.h>
  #include <assert.h>
  #include <stdint.h>
  #include <stddef.h>

  SSL_CTX *sctx;
  int Init() {
    SSL_library_init();
    SSL_load_error_strings();
    ERR_load_BIO_strings();
    OpenSSL_add_all_algorithms();
    // The assert() calls below have side effects; do not build with -DNDEBUG.
    assert (sctx = SSL_CTX_new(TLSv1_method()));
    assert (SSL_CTX_use_certificate_file(sctx, "server.pem", SSL_FILETYPE_PEM));
    assert (SSL_CTX_use_PrivateKey_file(sctx, "server.key", SSL_FILETYPE_PEM));
    return 0;
  }
  extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    static int unused = Init();
    SSL *server = SSL_new(sctx);
    BIO *sinbio = BIO_new(BIO_s_mem());
    BIO *soutbio = BIO_new(BIO_s_mem());
    SSL_set_bio(server, sinbio, soutbio);
    SSL_set_accept_state(server);
    BIO_write(sinbio, Data, Size);
    SSL_do_handshake(server);
    SSL_free(server);
    return 0;
  }
  EOF
  # Build the fuzzer.
  clang++ -g handshake-fuzz.cc -fsanitize=address \
    openssl-1.0.1f/libssl.a openssl-1.0.1f/libcrypto.a Fuzzer*.o
  # Run 20 independent fuzzer jobs.
  ./a.out -jobs=20 -workers=20

Voila::

  #1048576        pulse  cov 3424 bits 0 units 9 exec/s 24385
  =================================================================
  ==17488==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x629000004748 at pc 0x00000048c979 bp 0x7fffe3e864f0 sp 0x7fffe3e85ca8
  READ of size 60731 at 0x629000004748 thread T0
      #0 0x48c978 in __asan_memcpy
      #1 0x4db504 in tls1_process_heartbeat openssl-1.0.1f/ssl/t1_lib.c:2586:3
      #2 0x580be3 in ssl3_read_bytes openssl-1.0.1f/ssl/s3_pkt.c:1092:4

Note: a `similar fuzzer <https://boringssl.googlesource.com/boringssl/+/HEAD/FUZZING.md>`_
is now a part of the boringssl source tree.

Advanced features
=================
.. contents::
   :local:
   :depth: 1

Dictionaries
------------
*EXPERIMENTAL*.
LibFuzzer supports user-supplied dictionaries with input language keywords
or other interesting byte sequences (e.g. multi-byte magic values).
Use ``-dict=DICTIONARY_FILE``. For some input languages using a dictionary
may significantly improve the search speed.
The dictionary syntax is similar to that used by AFL_ for its ``-x`` option::

  # Lines starting with '#' and empty lines are ignored.

  # Adds "blah" (w/o quotes) to the dictionary.
  kw1="blah"
  # Use \\ for backslash and \" for quotes.
  kw2="\"ac\\dc\""
  # Use \xAB for hex values
  kw3="\xF7\xF8"
  # the name of the keyword followed by '=' may be omitted:
  "foo\x0Abar"
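
To make the escaping rules concrete, here is a minimal sketch of how a single dictionary line could be decoded. It is an illustration written from the syntax described above, not libFuzzer's own parser; ``ParseDictionaryLine`` and ``HexValue`` are hypothetical names, and only the three escapes listed in the sample (``\\``, ``\"`` and ``\xAB``) are handled.

```cpp
#include <cstddef>
#include <string>

// Converts one hex digit to its value, or returns -1 if C is not a hex digit.
static int HexValue(char C) {
  if (C >= '0' && C <= '9') return C - '0';
  if (C >= 'a' && C <= 'f') return C - 'a' + 10;
  if (C >= 'A' && C <= 'F') return C - 'A' + 10;
  return -1;
}

// Hypothetical helper: extracts the unescaped value from one dictionary line.
// Returns false for comments, empty lines and malformed entries.
bool ParseDictionaryLine(const std::string &Line, std::string *Value) {
  size_t First = Line.find_first_not_of(" \t");
  if (First == std::string::npos || Line[First] == '#') return false;
  size_t Open = Line.find('"');
  size_t Close = Line.rfind('"');
  if (Open == std::string::npos || Close <= Open) return false;
  // Anything before the opening quote must be empty or end with "name=".
  if (Open > First && Line[Open - 1] != '=') return false;
  Value->clear();
  for (size_t I = Open + 1; I < Close; ++I) {
    char C = Line[I];
    if (C != '\\') { Value->push_back(C); continue; }
    if (++I >= Close) return false;  // dangling backslash
    if (Line[I] == '\\' || Line[I] == '"') {
      Value->push_back(Line[I]);
    } else if (Line[I] == 'x' && I + 2 < Close &&
               HexValue(Line[I + 1]) >= 0 && HexValue(Line[I + 2]) >= 0) {
      Value->push_back(static_cast<char>(HexValue(Line[I + 1]) * 16 +
                                         HexValue(Line[I + 2])));
      I += 2;
    } else {
      return false;  // unknown escape
    }
  }
  return true;
}
```

The keyword name is deliberately ignored here, since only the quoted byte sequence ends up in the dictionary.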

Data-flow-guided fuzzing
------------------------

*EXPERIMENTAL*.
With an additional compiler flag ``-fsanitize-coverage=trace-cmp`` (see SanitizerCoverageTraceDataFlow_)
and extra run-time flag ``-use_traces=1`` the fuzzer will try to apply *data-flow-guided fuzzing*.
That is, the fuzzer will record the inputs to comparison instructions, switch statements,
and several libc functions (``memcmp``, ``strcmp``, ``strncmp``, etc.).
It will later use those recorded inputs during mutations.

This mode can be combined with DataFlowSanitizer_ to achieve better sensitivity.
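
The record-then-reuse idea can be pictured with a small sketch. This is an assumption-laden illustration, not the fuzzer's implementation: ``RecordComparison`` stands in for whatever callbacks the instrumentation invokes and is called by hand here, and ``ApplyRecordedOperand`` is a hypothetical mutation step that copies a recorded operand into the input.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Operands observed at comparison sites during earlier runs.
static std::vector<std::string> g_recorded_operands;

// Hypothetical hook: stands in for the callbacks that trace-cmp
// instrumentation would invoke; in this sketch it is called manually.
void RecordComparison(const void *Operand, size_t Size) {
  g_recorded_operands.push_back(
      std::string(static_cast<const char *>(Operand), Size));
}

// Mutation step: overwrite part of the input with a recorded operand.
// Returns false if the index is unknown or the operand does not fit.
bool ApplyRecordedOperand(std::vector<uint8_t> *Input, size_t OperandIdx,
                          size_t Offset) {
  if (OperandIdx >= g_recorded_operands.size()) return false;
  const std::string &Op = g_recorded_operands[OperandIdx];
  if (Offset > Input->size() || Op.size() > Input->size() - Offset)
    return false;
  if (!Op.empty()) std::memcpy(Input->data() + Offset, Op.data(), Op.size());
  return true;
}
```

A target that compares its input against a magic value such as ``"HI!"`` becomes much easier to get past once that value is available to the mutator, instead of having to be guessed byte by byte.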

AFL compatibility
-----------------
LibFuzzer can be used in parallel with AFL_ on the same test corpus.
Both fuzzers expect the test corpus to reside in a directory, one file per input.
You can run both fuzzers on the same corpus in parallel::

  ./afl-fuzz -i testcase_dir -o findings_dir /path/to/program -r @@
  ./llvm-fuzz testcase_dir findings_dir  # Will write new tests to testcase_dir

Periodically restart both fuzzers so that they can use each other's findings.

How good is my fuzzer?
----------------------

Once you implement your target function ``LLVMFuzzerTestOneInput`` and fuzz it to death,
you will want to know whether the function or the corpus can be improved further.
One easy-to-use metric is, of course, code coverage.
You can get the coverage for your corpus like this::

  ASAN_OPTIONS=coverage_pcs=1 ./fuzzer CORPUS_DIR -runs=0

This will run all the tests in CORPUS_DIR without generating any new tests,
and will dump the covered PCs to disk before exiting.
Then you can subtract the set of covered PCs from the set of all instrumented PCs in the binary;
see SanitizerCoverage_ for details.
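
The subtraction step itself is an ordinary set difference. The sketch below assumes both PC sets have already been loaded into memory; how they are read from the dumped files is outside its scope, and ``UncoveredPCs`` is a hypothetical name.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <set>
#include <vector>

// Returns the instrumented PCs that never showed up in the covered set.
std::vector<uintptr_t> UncoveredPCs(const std::set<uintptr_t> &Instrumented,
                                    const std::set<uintptr_t> &Covered) {
  std::vector<uintptr_t> Missing;
  std::set_difference(Instrumented.begin(), Instrumented.end(),
                      Covered.begin(), Covered.end(),
                      std::back_inserter(Missing));
  return Missing;
}
```

The remaining PCs point at code that no corpus item reaches, which is where a better seed, a dictionary entry, or a change to the target function is most likely to pay off.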

User-supplied mutators
----------------------

LibFuzzer allows the use of custom (user-supplied) mutators;
see FuzzerInterface.h_.

Startup initialization
----------------------
If the library being tested needs to be initialized, there are several options.

The simplest way is to have a statically initialized global object::

  static bool Initialized = DoInitialization();

Alternatively, you may define an optional init function and it will receive
the program arguments that you can read and modify::

  extern "C" int LLVMFuzzerInitialize(int *argc, char ***argv) {
    ReadAndMaybeModify(argc, argv);
    return 0;
  }

Finally, you may use your own ``main()`` and call ``FuzzerDriver``
from there, see FuzzerInterface.h_.

Try to avoid initialization inside the target function itself as
it will skew the coverage data. Don't do this::

  extern "C" int LLVMFuzzerTestOneInput(...) {
    static bool initialized = false;
    if (!initialized) {
      ...
    }
  }
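
For contrast with the pattern to avoid, the static-initializer option from the start of this section can be sketched end to end. ``DoInitialization`` is a hypothetical stand-in for the library's real setup, and the call counter exists only to show that the setup runs once, before any input is processed.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the library's one-time setup.
static int g_init_calls = 0;
static bool DoInitialization() {
  ++g_init_calls;
  return true;
}

// Runs once during static initialization, before the first fuzzing iteration.
static bool Initialized = DoInitialization();

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  // No lazy-initialization branch here: the target only exercises the API.
  (void)Data;
  (void)Size;
  if (!Initialized) __builtin_trap();
  return 0;
}
```

Because the setup work happens outside the target function, none of it is attributed to the first input that happens to be executed.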

Fuzzing components of LLVM
==========================
.. contents::
   :local:
   :depth: 1

clang-format-fuzzer
-------------------
The inputs are random pieces of C++-like text.

Build (make sure to use a fresh clang as the host compiler)::

  cmake -GNinja -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DLLVM_USE_SANITIZER=Address -DLLVM_USE_SANITIZE_COVERAGE=YES -DCMAKE_BUILD_TYPE=Release /path/to/llvm
  ninja clang-format-fuzzer
  mkdir CORPUS_DIR
  ./bin/clang-format-fuzzer CORPUS_DIR

Optionally build other kinds of binaries (asan+Debug, msan, ubsan, etc.).

Tracking bug: https://llvm.org/bugs/show_bug.cgi?id=23052

clang-fuzzer
------------

The behavior is very similar to ``clang-format-fuzzer``.

Tracking bug: https://llvm.org/bugs/show_bug.cgi?id=23057

llvm-as-fuzzer
--------------

Tracking bug: https://llvm.org/bugs/show_bug.cgi?id=24639

llvm-mc-fuzzer
--------------

This tool fuzzes the MC layer. Currently it is only able to fuzz the
disassembler, but it is hoped that assembly and round-trip verification will be
added in the future.

When run in disassembly mode, the inputs are opcodes to be disassembled. The
fuzzer will consume as many instructions as possible and will stop when it
finds an invalid instruction or runs out of data.
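
The consume-until-invalid behaviour amounts to a short loop. The sketch below uses a deliberately trivial, made-up decoder (fixed 4-byte instructions, with a first byte of ``0xFF`` treated as invalid) so that it can stand alone; it is not an LLVM API, and ``DecodeOne`` and ``DisassembleAll`` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical toy decoder, NOT an LLVM API: every instruction is 4 bytes and
// an instruction whose first byte is 0xFF is invalid. Returns the number of
// bytes consumed, or 0 if the bytes at Data do not form a valid instruction.
static size_t DecodeOne(const uint8_t *Data, size_t Size) {
  if (Size < 4 || Data[0] == 0xFF) return 0;
  return 4;
}

// Consumes as many instructions as possible; stops at the first invalid
// instruction or when the data runs out. Returns how many were decoded.
size_t DisassembleAll(const uint8_t *Data, size_t Size) {
  size_t Count = 0;
  size_t Pos = 0;
  while (Pos < Size) {
    size_t Used = DecodeOne(Data + Pos, Size - Pos);
    if (Used == 0) break;
    Pos += Used;
    ++Count;
  }
  return Count;
}
```

Stopping at the first invalid instruction means that bytes after it never reach the decoder, so short inputs tend to be the most productive for this kind of target.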

Please note that the command line interface differs slightly from that of other
fuzzers. The fuzzer arguments should follow ``--fuzzer-args`` and should have
a single dash, while other arguments control the operation mode and target in a
similar manner to ``llvm-mc`` and should have two dashes. For example::

  llvm-mc-fuzzer --triple=aarch64-linux-gnu --disassemble --fuzzer-args -max_len=4 -jobs=10

Buildbot
--------

We have a buildbot that runs the above fuzzers for LLVM components
24/7/365 at http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fuzzer .

Pre-fuzzed test inputs in git
-----------------------------

The buildbot accumulates large test corpora over time.
The corpora are stored in git on GitHub and can be used like this::

  git clone https://github.com/kcc/fuzzing-with-sanitizers.git
  bin/clang-format-fuzzer fuzzing-with-sanitizers/llvm/clang-format/C1
  bin/clang-fuzzer        fuzzing-with-sanitizers/llvm/clang/C1/
  bin/llvm-as-fuzzer      fuzzing-with-sanitizers/llvm/llvm-as/C1  -only_ascii=1


FAQ
===

Q. Why does the Fuzzer not use any of the LLVM support?
-------------------------------------------------------

There are two reasons.

First, we want this library to be used outside of LLVM without users having to
build the rest of LLVM. This may sound unconvincing for many LLVM folks,
but in practice the need for building the whole LLVM frightens many potential
users -- and we want more users to use this code.

Second, there is a subtle technical reason not to rely on the rest of LLVM, or
any other large body of code (maybe not even STL). When coverage instrumentation
is enabled, it will also instrument the LLVM support code which will blow up the
coverage set of the process (since the fuzzer is in-process). In other words, by
using more external dependencies we will slow down the fuzzer while the main
reason for it to exist is extreme speed.

Q. What about Windows then? The Fuzzer contains code that does not build on Windows.
------------------------------------------------------------------------------------

The sanitizer coverage support does not work on Windows either as of 01/2015.
Once it's there, we'll need to re-implement OS-specific parts (I/O, signals).

Q. When is this Fuzzer not a good solution for a problem?
---------------------------------------------------------

* If the test inputs are validated by the target library and the validator
  asserts/crashes on invalid inputs, the in-process fuzzer is not applicable
  (we could use fork() without exec, but it comes with extra overhead).
* Bugs in the target library may accumulate without being detected. E.g. a memory
  corruption that goes undetected at first and then leads to a crash while
  testing another input. This is why it is highly recommended to run this
  in-process fuzzer with all sanitizers to detect most bugs on the spot.
* It is harder to protect the in-process fuzzer from excessive memory
  consumption and infinite loops in the target library (still possible).
* The target library should not have significant global state that is not
  reset between the runs.
* Many interesting target libs are not designed in a way that supports
  the in-process fuzzer interface (e.g. require a file path instead of a
  byte array).
* If a single test run takes a considerable fraction of a second (or
  more) the speed benefit from the in-process fuzzer is negligible.
* If the target library runs persistent threads (that outlive
  execution of one test) the fuzzing results will be unreliable.
| 503 | |
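One of the cases above, a library that only accepts a file path, can often be
worked around by writing each input to a scratch file first. The following is a
minimal sketch under stated assumptions, not part of libFuzzer:
``CountBytesAtPath`` is a hypothetical stand-in for the path-based API, and the
fixed scratch file name ``fuzz_input.tmp`` assumes that only one fuzzer process
runs in a given working directory.

```cpp
// path_fuzz_target.cc (illustrative sketch, not part of libFuzzer)
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

// Hypothetical stand-in for a library entry point that only takes a path.
// Returns the number of bytes in the file, or -1 if it cannot be opened.
static long CountBytesAtPath(const char *Path) {
  FILE *F = fopen(Path, "rb");
  if (!F) return -1;
  long Count = 0;
  while (fgetc(F) != EOF) ++Count;
  fclose(F);
  return Count;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  // Assumption: a fixed scratch file in the current directory is acceptable.
  static const char kPath[] = "fuzz_input.tmp";
  FILE *F = fopen(kPath, "wb");
  if (!F) return 0;  // Could not create the scratch file; skip this input.
  size_t Written = Size ? fwrite(Data, 1, Size, F) : 0;
  bool Ok = (Written == Size);
  if (fclose(F) != 0) Ok = false;
  if (Ok) {
    // Hand the path (not the bytes) to the API under test.
    long Seen = CountBytesAtPath(kPath);
    assert(Seen == static_cast<long>(Size));
    (void)Seen;  // Silences the unused-variable warning when NDEBUG is set.
  }
  remove(kPath);  // Do not leave state behind for the next run.
  return 0;
}
```

The extra file I/O on every run costs some of the speed that makes the
in-process approach attractive, so this is a workaround rather than a fix.
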
Q. So, what exactly is this Fuzzer good for?
--------------------------------------------

This Fuzzer might be a good choice for testing libraries whose inputs are
relatively small, where each input takes less than 1 ms to run, and whose code
is not expected to crash on invalid inputs.
Examples: regular expression matchers, text or binary format parsers.

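As an illustration of such a target, here is a minimal sketch of a fuzz target
for a tiny text parser. ``ParseKeyValue`` is a hypothetical function invented
for this example; it rejects malformed input by returning ``false`` instead of
crashing, which is the property that makes a library a good fit.

```cpp
// kv_fuzz_target.cc (illustrative sketch, not part of libFuzzer)
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string>

// Hypothetical library function: splits "key=value" at the first '='.
// Returns false on malformed input (no '=' or an empty key).
static bool ParseKeyValue(const std::string &In, std::string *Key,
                          std::string *Value) {
  size_t Eq = In.find('=');
  if (Eq == std::string::npos || Eq == 0) return false;
  *Key = In.substr(0, Eq);
  *Value = In.substr(Eq + 1);
  return true;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  std::string In(reinterpret_cast<const char *>(Data), Size);
  std::string Key, Value;
  if (ParseKeyValue(In, &Key, &Value)) {
    // Property check: a successful parse must reassemble to the input.
    assert(Key + "=" + Value == In);
  }
  return 0;
}
```

Each run is short, works on a small byte array, and keeps no state between
calls, so the fuzzer can execute it many times per second.
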
Trophies
========
* GLIBC: https://sourceware.org/glibc/wiki/FuzzingLibc

* MUSL LIBC:

  * http://git.musl-libc.org/cgit/musl/commit/?id=39dfd58417ef642307d90306e1c7e50aaec5a35c
  * http://www.openwall.com/lists/oss-security/2015/03/30/3

* `pugixml <https://github.com/zeux/pugixml/issues/39>`_

* PCRE: Search for "LLVM fuzzer" in http://vcs.pcre.org/pcre2/code/trunk/ChangeLog?view=markup;
  also in `bugzilla <https://bugs.exim.org/buglist.cgi?bug_status=__all__&content=libfuzzer&no_redirect=1&order=Importance&product=PCRE&query_format=specific>`_

* `ICU <http://bugs.icu-project.org/trac/ticket/11838>`_

* `Freetype <https://savannah.nongnu.org/search/?words=LibFuzzer&type_of_search=bugs&Search=Search&exact=1#options>`_

* `Harfbuzz <https://github.com/behdad/harfbuzz/issues/139>`_

* `SQLite <http://www3.sqlite.org/cgi/src/info/088009efdd56160b>`_

* `Python <http://bugs.python.org/issue25388>`_

* OpenSSL/BoringSSL: `[1] <https://boringssl.googlesource.com/boringssl/+/cb852981cd61733a7a1ae4fd8755b7ff950e857d>`_ `[2] <https://openssl.org/news/secadv/20160301.txt>`_ `[3] <https://boringssl.googlesource.com/boringssl/+/2b07fa4b22198ac02e0cee8f37f3337c3dba91bc>`_

* `Libxml2
  <https://bugzilla.gnome.org/buglist.cgi?bug_status=__all__&content=libFuzzer&list_id=68957&order=Importance&product=libxml2&query_format=specific>`_

* `Linux Kernel's BPF verifier <https://github.com/iovisor/bpf-fuzzer>`_

* LLVM: `Clang <https://llvm.org/bugs/show_bug.cgi?id=23057>`_, `Clang-format <https://llvm.org/bugs/show_bug.cgi?id=23052>`_, `libc++ <https://llvm.org/bugs/show_bug.cgi?id=24411>`_, `llvm-as <https://llvm.org/bugs/show_bug.cgi?id=24639>`_, Disassembler: http://reviews.llvm.org/rL247405, http://reviews.llvm.org/rL247414, http://reviews.llvm.org/rL247416, http://reviews.llvm.org/rL247417, http://reviews.llvm.org/rL247420, http://reviews.llvm.org/rL247422.

.. _pcre2: http://www.pcre.org/

.. _AFL: http://lcamtuf.coredump.cx/afl/

.. _SanitizerCoverage: http://clang.llvm.org/docs/SanitizerCoverage.html
.. _SanitizerCoverageTraceDataFlow: http://clang.llvm.org/docs/SanitizerCoverage.html#tracing-data-flow
.. _DataFlowSanitizer: http://clang.llvm.org/docs/DataFlowSanitizer.html

.. _Heartbleed: http://en.wikipedia.org/wiki/Heartbleed

.. _FuzzerInterface.h: https://github.com/llvm-mirror/llvm/blob/master/lib/Fuzzer/FuzzerInterface.h