Mutation-Based Fuzzing

Most randomly generated inputs are syntactically invalid and thus are quickly rejected by the processing program. To exercise functionality beyond input processing, we must increase our chances of obtaining valid inputs. One way to do so is mutational fuzzing: introducing small changes to existing inputs that may still keep the input valid, yet exercise new behavior. We show how to create such mutations and how to guide them towards yet uncovered code, applying central concepts from the popular AFL fuzzer.

In [1]:

from bookutils import YouTubeVideo
YouTubeVideo('5ROhc_42jQU')

Out[1]:

Prerequisites

You should know how basic fuzzing works; for instance, from the "Fuzzing" chapter.
You should understand the basics of obtaining coverage.

Synopsis

To use the code provided in this chapter, write

>>> from fuzzingbook.MutationFuzzer import <identifier>

and then make use of the following features.

This chapter introduces a MutationFuzzer class that takes a list of seed inputs which are then mutated:

>>> seed_input = "http://www.google.com/search?q=fuzzing"
>>> mutation_fuzzer = MutationFuzzer(seed=[seed_input])
>>> [mutation_fuzzer.fuzz() for i in range(10)]
['http://www.google.com/search?q=fuzzing',
 'http://wwBw.google.com/searh?q=fuzzing',
 'http8//wswgoRogle.am/secch?qU=fuzzing',
 'ittp://www.googLe.com/serch?q=fuzzingZ',
 'httP://wgw.google.com/seasch?Q=fuxzanmgY',
 'http://www.google.cxcom/search?q=fuzzing',
 'hFttp://ww.-g\x7fog+le.com/s%arch?q=f-uzz#ing',
 'http://www\x0egoogle.com/seaNrch?q=fuZzing',
 'http//www.Ygooge.comsarch?q=fuz~Ijg',
 'http8//ww.goog5le.com/sezarc?q=fuzzing']

The MutationCoverageFuzzer maintains a population of inputs, which are then evolved in order to maximize coverage.

>>> mutation_fuzzer = MutationCoverageFuzzer(seed=[seed_input])
>>> mutation_fuzzer.runs(http_runner, trials=10000)
>>> mutation_fuzzer.population[:5]
['http://www.google.com/search?q=fuzzing',
 'http://wwv.oogle>co/search7Eq=fuzing',
 'http://wwv\x0eOogleb>co/seakh7Eq\x1d;fuzing',
 'http://wwv\x0eoglebkooqeakh7Eq\x1d;fuzing',
 'http://wwv\x0eoglekol=oekh7Eq\x1d\x1bf~ing']

Fuzzing with Mutations

In November 2013, the first version of American Fuzzy Lop (AFL) was released. Since then, AFL has become one of the most successful fuzzing tools and comes in many flavors, e.g., AFLFast, AFLGo, and AFLSmart (which are discussed in this book). AFL has made fuzzing a popular choice for automated vulnerability detection. It was the first to demonstrate that vulnerabilities can be detected automatically at a large scale in many security-critical, real-world applications.

Figure 1. American Fuzzy Lop Command Line User Interface

In this chapter, we are going to introduce the basics of mutational fuzz testing; the next chapter will then further show how to direct fuzzing towards specific code goals.

Fuzzing a URL Parser

Many programs expect their inputs to come in a very specific format before they actually process them. As an example, think of a program that accepts a URL (a Web address). The URL has to be in a valid format (i.e., the URL format) so that the program can deal with it. When fuzzing with random inputs, what are our chances of actually producing a valid URL?

To get deeper into the problem, let us explore what URLs are made of. A URL consists of a number of elements:

scheme://netloc/path?query#fragment

where

scheme is the protocol to be used, including http, https, ftp, file...
netloc is the name of the host to connect to, such as www.google.com
path is the path on that very host, such as search
query is a list of key/value pairs, such as q=fuzzing
fragment is a marker for a location in the retrieved document, such as #result

In Python, we can use the urlparse() function to parse and decompose a URL into its parts.

In [2]:

import bookutils.setup

In [3]:

from typing import Tuple, List, Callable, Set, Any

In [4]:

from urllib.parse import urlparse

In [5]:

urlparse("http://www.google.com/search?q=fuzzing")

Out[5]:

ParseResult(scheme='http', netloc='www.google.com', path='/search', params='', query='q=fuzzing', fragment='')

We see how the result encodes the individual parts of the URL in different attributes.
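
As a small illustration of those attributes, the following sketch parses a URL that also carries a fragment. The `#result` fragment and the variable name `parts` are chosen here for illustration only.

```python
from urllib.parse import urlparse

parts = urlparse("http://www.google.com/search?q=fuzzing#result")

# Each component is available as a named attribute of the result
print(parts.scheme)    # http
print(parts.netloc)    # www.google.com
print(parts.path)      # /search
print(parts.query)     # q=fuzzing
print(parts.fragment)  # result
```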

Let us now assume we have a program that takes a URL as input. To simplify things, we won't let it do very much; we simply have it check the passed URL for validity. If the URL is valid, it returns True; otherwise, it raises an exception.

In [6]:

def http_program(url: str) -> bool:
    supported_schemes = ["http", "https"]
    result = urlparse(url)
    if result.scheme not in supported_schemes:
        raise ValueError("Scheme must be one of " + 
                         repr(supported_schemes))
    if result.netloc == '':
        raise ValueError("Host must be non-empty")

    # Do something with the URL
    return True

Let us now go and fuzz http_program(). To fuzz, we use the full range of printable ASCII characters, such that :, /, and lowercase letters are included.

In [7]:

from Fuzzer import fuzzer

In [8]:

fuzzer(char_start=32, char_range=96)

Out[8]:

'"N&+slk%h\x7fyp5o\'@[3(rW*M5W]tMFPU4\\P@tz%[X?uo\\1?b4T;1bDeYtHx #UJ5w}pMmPodJM,_'

Let's try to fuzz with 1000 random inputs and see whether we have some success.

In [9]:

for i in range(1000):
    try:
        url = fuzzer(char_start=32, char_range=96)
        result = http_program(url)
        print("Success!")
    except ValueError:
        pass

What are the chances of actually getting a valid URL? We need our string to start with "http://" or "https://". Let's take the "http://" case first. These are seven very specific characters we need to start with. The chance of producing these seven characters randomly (with a character range of 96 different characters) is $1 : 96^7$, or

In [10]:

96 ** 7

Out[10]:

75144747810816

The odds of producing a "https://" prefix are even worse, at $1 : 96^8$:

In [11]:

96 ** 8

Out[11]:

7213895789838336

which gives us a total chance of

In [12]:

likelihood = 1 / (96 ** 7) + 1 / (96 ** 8)
likelihood

Out[12]:

1.344627131107667e-14

And this is the number of runs (on average) we'd need to produce a valid URL scheme:

In [13]:

1 / likelihood

Out[13]:

74370059689055.02

Let's measure how long one run of http_program() takes:

In [14]:

from Timer import Timer

In [15]:

trials = 1000
with Timer() as t:
    for i in range(trials):
        try:
            url = fuzzer(char_start=32, char_range=96)
            result = http_program(url)
            print("Success!")
        except ValueError:
            pass

duration_per_run_in_seconds = t.elapsed_time() / trials
duration_per_run_in_seconds

Out[15]:

2.451037500577513e-05

That's pretty fast, isn't it? Unfortunately, we have a lot of runs to cover.

In [16]:

seconds_until_success = duration_per_run_in_seconds * (1 / likelihood)
seconds_until_success

Out[16]:

1822838052.1806188

which translates into

In [17]:

hours_until_success = seconds_until_success / 3600
days_until_success = hours_until_success / 24
years_until_success = days_until_success / 365.25
years_until_success

Out[17]:

57.76225226825294

Even if we parallelize things a lot, we're still in for months to years of waiting. And that's for getting one successful run that will get deeper into http_program().
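
To put the effect of parallelization into numbers, here is a back-of-the-envelope sketch. It assumes perfect linear speedup across workers (an idealization) and reuses the figure measured above, which will differ from machine to machine.

```python
# Figure taken from the measurement above; it will differ between machines
years_until_success = 57.76225226825294

# Assumption: the work divides evenly, with no coordination overhead
for workers in (1, 10, 100, 1000):
    years = years_until_success / workers
    print(f"{workers:5d} workers: {years:8.2f} years ({years * 12:7.1f} months)")
```

Under these assumptions, even 100 workers would still need roughly seven months for a single valid scheme.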

What basic fuzzing will do well is to test urlparse(), and if there is an error in this parsing function, it has good chances of uncovering it. But as long as we cannot produce a valid input, we are out of luck in reaching any deeper functionality.

Mutating Inputs

The alternative to generating random strings from scratch is to start with a given valid input, and then to subsequently mutate it. A mutation in this context is a simple string manipulation - say, inserting a (random) character, deleting a character, or flipping a bit in a character representation. This is called mutational fuzzing – in contrast to the generational fuzzing techniques discussed earlier.

Here are some mutations to get you started:

In [18]:

import random

In [19]:

def delete_random_character(s: str) -> str:
    """Returns s with a random character deleted"""
    if s == "":
        return s

    pos = random.randint(0, len(s) - 1)
    # print("Deleting", repr(s[pos]), "at", pos)
    return s[:pos] + s[pos + 1:]

In [20]:

seed_input = "A quick brown fox"
for i in range(10):
    x = delete_random_character(seed_input)
    print(repr(x))

'A uick brown fox'
'A quic brown fox'
'A quick brown fo'
'A quic brown fox'
'A quick bown fox'
'A quick bown fox'
'A quick brown fx'
'A quick brown ox'
'A quick brow fox'
'A quic brown fox'

In [21]:

def insert_random_character(s: str) -> str:
    """Returns s with a random character inserted"""
    pos = random.randint(0, len(s))
    random_character = chr(random.randrange(32, 127))
    # print("Inserting", repr(random_character), "at", pos)
    return s[:pos] + random_character + s[pos:]

In [22]:

for i in range(10):
    print(repr(insert_random_character(seed_input)))

'A quick brvown fox'
'A quwick brown fox'
'A qBuick brown fox'
'A quick broSwn fox'
'A quick brown fvox'
'A quick brown 3fox'
'A quick brNown fox'
'A quick brow4n fox'
'A quick brown fox8'
'A equick brown fox'

In [23]:

def flip_random_character(s: str) -> str:
    """Returns s with a random bit flipped in a random position"""
    if s == "":
        return s

    pos = random.randint(0, len(s) - 1)
    c = s[pos]
    bit = 1 << random.randint(0, 6)
    new_c = chr(ord(c) ^ bit)
    # print("Flipping", bit, "in", repr(c) + ", giving", repr(new_c))
    return s[:pos] + new_c + s[pos + 1:]

In [24]:

for i in range(10):
    print(repr(flip_random_character(seed_input)))

'A quick bRown fox'
'A quici brown fox'
'A"quick brown fox'
'A quick brown$fox'
'A quick bpown fox'
'A quick brown!fox'
'A 1uick brown fox'
'@ quick brown fox'
'A quic+ brown fox'
'A quick bsown fox'
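
To see what a single bit flip does to a character, consider the following sketch. It only uses the built-in `ord()` and `chr()` functions; the specific characters are picked for illustration.

```python
# Flipping bit 5 (value 32) toggles the case of an ASCII letter
assert chr(ord("A") ^ (1 << 5)) == "a"

# Flipping bit 1 (value 2) turns a space (code 32) into a double quote (code 34)
assert chr(ord(" ") ^ (1 << 1)) == '"'

# Flipping bit 6 (value 64) can turn a letter into a control character
print(repr(chr(ord("N") ^ (1 << 6))))  # '\x0e'

# Flipping any of the bits 0..6 keeps a 7-bit ASCII character within 7-bit ASCII
assert all(0 <= ord(c) ^ (1 << b) < 128
           for c in map(chr, range(128)) for b in range(7))
```

This is why restricting the flipped bit to the range 0 to 6 never produces characters outside the ASCII range, but may well produce unprintable ones.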

Let us now create a random mutator that randomly chooses which mutation to apply:

In [25]:

def mutate(s: str) -> str:
    """Return s with a random mutation applied"""
    mutators = [
        delete_random_character,
        insert_random_character,
        flip_random_character
    ]
    mutator = random.choice(mutators)
    # print(mutator)
    return mutator(s)

In [26]:

for i in range(10):
    print(repr(mutate("A quick brown fox")))

'A qzuick brown fox'
' quick brown fox'
'A quick Brown fox'
'A qMuick brown fox'
'A qu_ick brown fox'
'A quick bXrown fox'
'A quick brown fx'
'A quick!brown fox'
'A! quick brown fox'
'A quick brownfox'
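
Since every mutation draws from Python's global random number generator, a run can be made repeatable by seeding that generator first. The following sketch restates the three mutators in compact form so that it runs on its own; `mutations_with_seed()` is a hypothetical helper introduced here, not part of the chapter's code.

```python
import random
from typing import List


def delete_random_character(s: str) -> str:
    if s == "":
        return s
    pos = random.randint(0, len(s) - 1)
    return s[:pos] + s[pos + 1:]


def insert_random_character(s: str) -> str:
    pos = random.randint(0, len(s))
    return s[:pos] + chr(random.randrange(32, 127)) + s[pos:]


def flip_random_character(s: str) -> str:
    if s == "":
        return s
    pos = random.randint(0, len(s) - 1)
    return s[:pos] + chr(ord(s[pos]) ^ (1 << random.randint(0, 6))) + s[pos + 1:]


def mutate(s: str) -> str:
    mutator = random.choice([delete_random_character,
                             insert_random_character,
                             flip_random_character])
    return mutator(s)


def mutations_with_seed(seed_value: int, s: str, n: int = 5) -> List[str]:
    """Hypothetical helper: n single mutations of s under a fixed random seed"""
    random.seed(seed_value)
    return [mutate(s) for _ in range(n)]


first = mutations_with_seed(42, "A quick brown fox")
second = mutations_with_seed(42, "A quick brown fox")
assert first == second  # same seed, same sequence of mutations
```

Recording the seed alongside a failing input is one simple way to replay the mutation sequence that led to it.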

The idea is now that if we have some valid input(s) to begin with, we may create more input candidates by applying one of the above mutations. To see how this works, let's get back to URLs.

Mutating URLs

Returning to our URL parsing problem, let us create a function is_valid_url() that checks whether http_program() accepts the input.

In [27]:

def is_valid_url(url: str) -> bool:
    try:
        result = http_program(url)
        return True
    except ValueError:
        return False

In [28]:

assert is_valid_url("http://www.google.com/search?q=fuzzing")
assert not is_valid_url("xyzzy")

Let us now apply the mutate() function on a given URL and see how many valid inputs we obtain.

In [29]:

seed_input = "http://www.google.com/search?q=fuzzing"
valid_inputs = set()
trials = 20

for i in range(trials):
    inp = mutate(seed_input)
    if is_valid_url(inp):
        valid_inputs.add(inp)

We can now observe that by mutating the original input, we get a high proportion of valid inputs:

In [30]:

len(valid_inputs) / trials

Out[30]:

0.8

What are the odds of also producing an https: prefix by mutating an http: seed input? We have to pick the insertion mutator ($1 : 3$), insert the right character 's' ($1 : 96$), and do so at the correct position ($1 : l$), where $l$ is the length of our seed input. This means that on average, we need roughly this many runs:

In [31]:

trials = 3 * 96 * len(seed_input)
trials

Out[31]:

10944
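
The estimate of $3 \cdot 96 \cdot l$ runs is approximate. Going by the mutator code shown earlier, insert_random_character() draws from 95 characters (codes 32 to 126) and has $l + 1$ possible insert positions, and only inserting 's' directly after "http" yields the desired prefix. A slightly more exact expectation can be worked out as follows; the variable names are introduced here for illustration.

```python
seed_input = "http://www.google.com/search?q=fuzzing"

mutator_choices = 3                      # delete, insert, or flip
insertable_chars = len(range(32, 127))   # characters randrange(32, 127) can draw
insert_positions = len(seed_input) + 1   # randint(0, len(s)) includes both ends

expected_trials = mutator_choices * insertable_chars * insert_positions
print(expected_trials)  # 11115
```

Both figures are of the same order of magnitude, so the conclusion is unchanged.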

We can actually afford this. Let's try:

In [32]:

from Timer import Timer

In [33]:

trials = 0
with Timer() as t:
    while True:
        trials += 1
        inp = mutate(seed_input)
        if inp.startswith("https://"):
            print(
                "Success after",
                trials,
                "trials in",
                t.elapsed_time(),
                "seconds")
            break

Success after 3656 trials in 0.0055189170088851824 seconds

Of course, if we wanted to get, say, an "ftp://" prefix, we would need more mutations and more runs. Most importantly, though, we would need to apply multiple mutations.

Multiple Mutations

So far, we have only applied one single mutation on a sample string. However, we can also apply multiple mutations, further changing it. What happens, for instance, if we apply, say, 50 mutations on our sample string?

In [34]:

seed_input = "http://www.google.com/search?q=fuzzing"
mutations = 50

In [35]:

inp = seed_input
for i in range(mutations):
    if i % 5 == 0:
        print(i, "mutations:", repr(inp))
    inp = mutate(inp)

0 mutations: 'http://www.google.com/search?q=fuzzing'
5 mutations: 'http:/L/www.googlej.com/seaRchq=fuz:ing'
10 mutations: 'http:/L/www.ggoWglej.com/seaRchqfu:in'
15 mutations: 'http:/L/wwggoWglej.com/seaR3hqf,u:in'
20 mutations: 'htt://wwggoVgle"j.som/seaR3hqf,u:in'
25 mutations: 'htt://fwggoVgle"j.som/eaRd3hqf,u^:in'
30 mutations: 'htv://>fwggoVgle"j.qom/ea0Rd3hqf,u^:i'
35 mutations: 'htv://>fwggozVle"Bj.qom/eapRd[3hqf,u^:i'
40 mutations: 'htv://>fwgeo6zTle"Bj.\'qom/eapRd[3hqf,tu^:i'
45 mutations: 'htv://>fwgeo]6zTle"BjM.\'qom/eaR[3hqf,tu^:i'

As you see, the original seed input is hardly recognizable anymore. By mutating the input again and again, we get a higher variety in the input.

To implement such multiple mutations in a single package, let us introduce a MutationFuzzer class. It takes a seed (a list of strings) as well as a minimum and a maximum number of mutations.

In [36]:

from Fuzzer import Fuzzer

In [37]:

class MutationFuzzer(Fuzzer):
    """Base class for mutational fuzzing"""

    def __init__(self, seed: List[str],
                 min_mutations: int = 2,
                 max_mutations: int = 10) -> None:
        """Constructor.
        `seed` - a list of (input) strings to mutate.
        `min_mutations` - the minimum number of mutations to apply.
        `max_mutations` - the maximum number of mutations to apply.
        """
        self.seed = seed
        self.min_mutations = min_mutations
        self.max_mutations = max_mutations
        self.reset()

    def reset(self) -> None:
        """Set population to initial seed.
        To be overridden in subclasses."""
        self.population = self.seed
        self.seed_index = 0

In the following, let us develop MutationFuzzer further by adding more methods to it. The Python language requires us to define an entire class with all methods as a single, continuous unit; however, we would like to introduce one method after another. To avoid this problem, we use a special hack: Whenever we want to introduce a new method to some class C, we use the construct

class C(C):
    def new_method(self, args):
        pass

This seems to define C as a subclass of itself, which would make no sense. What it actually does is introduce a new C class as a subclass of the old C class, which then shadows the old C definition. What this gets us is a C class with new_method() as a method, which is just what we want. (C objects defined earlier will retain the earlier C definition, though, and thus must be rebuilt.)
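
The effect of this construct can be checked with a minimal sketch; the method and variable names (`old_method`, `early`, `late`) are made up for this illustration.

```python
class C:
    def old_method(self) -> str:
        return "old"


early = C()  # created before the class is extended


class C(C):  # a new class named C, inheriting from the previous C
    def new_method(self) -> str:
        return "new"


late = C()
assert late.old_method() == "old"        # inherited from the old C
assert late.new_method() == "new"        # defined in the new C
assert not hasattr(early, "new_method")  # earlier objects keep the old class
```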

Using this hack, we can now add a mutate() method that actually invokes the above mutate() function. Having mutate() as a method is useful when we want to extend a MutationFuzzer later.

In [38]:

class MutationFuzzer(MutationFuzzer):
    def mutate(self, inp: str) -> str:
        return mutate(inp)

Let's get back to our strategy, maximizing diversity in coverage in our population. First, let us create a method create_candidate(), which randomly picks some input from our current population (self.population), and then applies between min_mutations and max_mutations mutation steps, returning the final result:

In [39]:

class MutationFuzzer(MutationFuzzer):
    def create_candidate(self) -> str:
        """Create a new candidate by mutating a population member"""
        candidate = random.choice(self.population)
        trials = random.randint(self.min_mutations, self.max_mutations)
        for i in range(trials):
            candidate = self.mutate(candidate)
        return candidate

The fuzz() method is set to first pick the seeds; when these are gone, we mutate:

In [40]:

class MutationFuzzer(MutationFuzzer):
    def fuzz(self) -> str:
        if self.seed_index < len(self.seed):
            # Still seeding
            self.inp = self.seed[self.seed_index]
            self.seed_index += 1
        else:
            # Mutating
            self.inp = self.create_candidate()
        return self.inp

Here is the fuzz() method in action. With every new invocation of fuzz(), we get another variant with multiple mutations applied.

In [41]:

seed_input = "http://www.google.com/search?q=fuzzing"
mutation_fuzzer = MutationFuzzer(seed=[seed_input])
mutation_fuzzer.fuzz()

Out[41]:

'http://www.google.com/search?q=fuzzing'

In [42]:

mutation_fuzzer.fuzz()

Out[42]:

'http://www.gogl9ecom/earch?qfuzzing'

In [43]:

mutation_fuzzer.fuzz()

Out[43]:

'htotq:/www.googleom/yseach?q=fzzijg'

The higher variety in inputs, though, increases the risk of having an invalid input. The key to success lies in the idea of guiding these mutations – that is, keeping those that are especially valuable.
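
How quickly validity drops as mutations accumulate can be estimated with a small experiment. The sketch below is self-contained: `mutate()` is a compact stand-in for the function defined earlier, `is_valid_url()` applies the same two checks as http_program(), and `valid_fraction()` is a hypothetical helper introduced here. Exact numbers depend on the random seed and the Python version.

```python
import random
from urllib.parse import urlparse


def mutate(s: str) -> str:
    """Compact stand-in for mutate(): delete, insert, or flip at random"""
    choice = random.randrange(3)
    if choice == 0 and s:
        pos = random.randint(0, len(s) - 1)
        return s[:pos] + s[pos + 1:]
    if choice == 1:
        pos = random.randint(0, len(s))
        return s[:pos] + chr(random.randrange(32, 127)) + s[pos:]
    if choice == 2 and s:
        pos = random.randint(0, len(s) - 1)
        flipped = chr(ord(s[pos]) ^ (1 << random.randint(0, 6)))
        return s[:pos] + flipped + s[pos + 1:]
    return s  # nothing to delete or flip in an empty string


def is_valid_url(url: str) -> bool:
    """Same checks as http_program(), returning a bool instead of raising"""
    try:
        result = urlparse(url)
    except ValueError:
        return False
    return result.scheme in ("http", "https") and result.netloc != ""


def valid_fraction(seed: str, mutations: int, trials: int = 1000) -> float:
    """Fraction of inputs still valid after the given number of mutations"""
    valid = 0
    for _trial in range(trials):
        inp = seed
        for _step in range(mutations):
            inp = mutate(inp)
        valid += is_valid_url(inp)
    return valid / trials


seed_input = "http://www.google.com/search?q=fuzzing"
random.seed(0)  # fixed seed so that repeated runs print the same numbers
for mutations in (1, 5, 20):
    print(mutations, "mutations:", valid_fraction(seed_input, mutations))
```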

Guiding by Coverage

To cover as much functionality as possible, one can rely on either specified or implemented functionality, as discussed in the "Coverage" chapter. For now, we will not assume that there is a specification of program behavior (although it definitely would be good to have one!). We will assume, though, that the program to be tested exists – and that we can leverage its structure to guide test generation.

Since testing always executes the program at hand, one can always gather information about its execution – the least is the information needed to decide whether a test passes or fails. Since coverage is frequently measured as well to determine test quality, let us also assume we can retrieve coverage of a test run. The question is then: How can we leverage coverage to guide test generation?

One particularly successful idea is implemented in the popular fuzzer named American Fuzzy Lop, or AFL for short. Just like our examples above, AFL evolves test cases that have been successful; for AFL, however, "success" means finding a new path through the program execution. This way, AFL can keep on mutating inputs that so far have found new paths; and if an input finds another path, it will be retained as well.

Let us build such a strategy. We start with introducing a Runner class that captures the coverage for a given function. First, a FunctionRunner class:

In [44]:

from Fuzzer import Runner

In [45]:

class FunctionRunner(Runner):
    def __init__(self, function: Callable) -> None:
        """Initialize.  `function` is a function to be executed"""
        self.function = function

    def run_function(self, inp: str) -> Any:
        return self.function(inp)

    def run(self, inp: str) -> Tuple[Any, str]:
        try:
            result = self.run_function(inp)
            outcome = self.PASS
        except Exception:
            result = None
            outcome = self.FAIL

        return result, outcome

In [46]:

http_runner = FunctionRunner(http_program)
http_runner.run("https://foo.bar/")

Out[46]:

(True, 'PASS')

We can now extend the FunctionRunner class such that it also measures coverage. After invoking run(), the coverage() method returns the coverage achieved in the last run.

In [47]:

from Coverage import Coverage, population_coverage, Location

In [48]:

class FunctionCoverageRunner(FunctionRunner):
    def run_function(self, inp: str) -> Any:
        with Coverage() as cov:
            try:
                result = super().run_function(inp)
            except Exception as exc:
                self._coverage = cov.coverage()
                raise exc

        self._coverage = cov.coverage()
        return result

    def coverage(self) -> Set[Location]:
        return self._coverage

In [49]:

http_runner = FunctionCoverageRunner(http_program)
http_runner.run("https://foo.bar/")

Out[49]:

(True, 'PASS')

Here are the first five locations covered:

In [50]:

print(list(http_runner.coverage())[:5])

[('http_program', 3), ('urlparse', 394), ('urlparse', 400), ('_noop', 104), ('urlsplit', 460)]
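
The Coverage class used above comes from the "Coverage" chapter. As a rough, self-contained illustration of how such (function name, line number) pairs might be collected, the following sketch uses Python's `sys.settrace()`. The names `trace_coverage()` and `classify()` are hypothetical and introduced here; this is not necessarily how Coverage is implemented.

```python
import sys
from typing import Callable, Set, Tuple


def trace_coverage(function: Callable, inp: str) -> Set[Tuple[str, int]]:
    """Run function(inp) and return the (function name, line) pairs executed"""
    covered: Set[Tuple[str, int]] = set()

    def tracer(frame, event, arg):
        if event == "line":
            covered.add((frame.f_code.co_name, frame.f_lineno))
        return tracer  # keep tracing inside the current frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        function(inp)
    finally:
        sys.settrace(previous)  # always restore the earlier trace function
    return covered


def classify(url: str) -> str:
    if url.startswith("http"):
        return "web"
    return "other"


web = trace_coverage(classify, "http://example.com")
other = trace_coverage(classify, "ftp://example.com")
print(web != other)  # True: the two inputs take different branches
```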

Now for the main class. We maintain the population and a set of coverages already achieved (coverages_seen). The run() method obtains the next input via fuzz() and runs it through the given runner. If the run passes and its coverage is new (i.e. not in coverages_seen), the input is added to population and the coverage to coverages_seen.

In [51]:

class MutationCoverageFuzzer(MutationFuzzer):
    """Fuzz with mutated inputs based on coverage"""

    def reset(self) -> None:
        super().reset()
        self.coverages_seen: Set[frozenset] = set()
        # Now empty; we fill this with seed in the first fuzz runs
        self.population = []

    def run(self, runner: FunctionCoverageRunner) -> Any:  # type: ignore
        """Run function(inp) while tracking coverage.
           If we reach new coverage,
           add inp to population and its coverage to coverages_seen
        """
        result, outcome = super().run(runner)
        new_coverage = frozenset(runner.coverage())
        if outcome == Runner.PASS and new_coverage not in self.coverages_seen:
            # We have new coverage
            self.population.append(self.inp)
            self.coverages_seen.add(new_coverage)

        return result
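
The conversion to `frozenset` in run() matters because ordinary Python sets are not hashable and therefore cannot be members of another set. The following sketch isolates that bookkeeping; `is_new_coverage()` and the sample locations are hypothetical and only serve as illustration.

```python
from typing import FrozenSet, Set, Tuple

Location = Tuple[str, int]
coverages_seen: Set[FrozenSet[Location]] = set()


def is_new_coverage(coverage: Set[Location]) -> bool:
    """Hypothetical helper: record coverage, report whether it was unseen"""
    key = frozenset(coverage)  # hashable and independent of element order
    if key in coverages_seen:
        return False
    coverages_seen.add(key)
    return True


assert is_new_coverage({("f", 1), ("f", 2)})
assert not is_new_coverage({("f", 2), ("f", 1)})  # same set, different order
assert is_new_coverage({("f", 1)})  # a subset still counts as a different coverage

try:
    {set()}  # a plain set cannot be stored inside a set
except TypeError as exc:
    print("TypeError:", exc)
```

Note that this compares coverage sets as a whole: an input is retained whenever its set of covered locations differs from all sets seen so far, even if it covers no individual line for the first time.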

Let us now put this to use:

In [52]:

seed_input = "http://www.google.com/search?q=fuzzing"
mutation_fuzzer = MutationCoverageFuzzer(seed=[seed_input])
mutation_fuzzer.runs(http_runner, trials=10000)
mutation_fuzzer.population

Out[52]:

['http://www.google.com/search?q=fuzzing',
 'http://www.goog.com/search;q=fuzzilng',
 'http://ww.6goog\x0eoomosearch;/q=f}zzilng',
 'http://uv.Lboo.comoseakrch;q=fuzilng',
 'http://ww.6goog\x0eo/mosarch;/q=f}z{il~g',
 'http://www.googme.com/sear#h?q=fuzzing',
 'http://www.oogcom/sa3rchq=fuzlnv|',
 'http://ww.6goog*./mosarch;/q=f}Zz{ilel~g',
 'http://uv.Lboo.comoseakch;q=fuzilng',
 'http://www.goom^e.2com/s?ear#h?q=fuzzing',
 'http://hwww.coole.com+search?R=fuzzig',
 'http://ww.6g7oog*./mosarch; #/q;f}Zz{ilel~gL',
 "http://ww.6'oog*R./mosarcx;/q=}Zz{ilel;~g",
 'http://www.goofme.com/sear#h?q=fuzzi*yng',
 "http://sw.6'oog*R/msa'rcx;/qw?}Zz{ileRl;~g",
 "http://sw.6'oog*R/msa'rsx;/qw?}Zz{ileRUl;~g",
 "http://sw.6'oog*R/msa'rsx;qw?}Zz{ileRU;~g",
 'http://wgw.gooBm^e.2com/s?&eir#h?q=]fuzzing',
 "http://sw.6'ooM*R/mDa'rsx;w?}Zz{ileU+~g",
 "http://sw.6L'ooM*R/mKD'rwx;w?}Z~{ileU#zg",
 'http://ww6g7ooVg:./mosarc; #/q;f}ZzF{ielW~gL',
 "http://Jsw.6L'oM*R/mKD'r3w;w?~{ileU#zg",
 "http://sw.6'oog*R/msa'rsx;/qw?}Z#z{ileRYUl;~g",
 "http://sw6'oog*V/msa'rsx;/w\x7f}Z#zileRUl;~g",
 "http://sw6'oog*/msa'rsx;/g\x7fp}Z#zileRUl;~g"]

Success! In our population, each and every input now is valid and has a different coverage, coming from various combinations of schemes, paths, queries, and fragments.

In [53]:

all_coverage, cumulative_coverage = population_coverage(
    mutation_fuzzer.population, http_program)

In [54]:

import matplotlib.pyplot as plt  # type: ignore

In [55]:

plt.plot(cumulative_coverage)
plt.title('Coverage of urlparse() with random inputs')
plt.xlabel('# of inputs')
plt.ylabel('lines covered');

The nice thing about this strategy is that, applied to larger programs, it will happily explore one path after the other – covering functionality after functionality. All that is needed is a means to capture the coverage.

Synopsis

This chapter introduces a MutationFuzzer class that takes a list of seed inputs which are then mutated:

In [56]:

seed_input = "http://www.google.com/search?q=fuzzing"
mutation_fuzzer = MutationFuzzer(seed=[seed_input])
[mutation_fuzzer.fuzz() for i in range(10)]

Out[56]:

['http://www.google.com/search?q=fuzzing',
 'http://wwBw.google.com/searh?q=fuzzing',
 'http8//wswgoRogle.am/secch?qU=fuzzing',
 'ittp://www.googLe.com/serch?q=fuzzingZ',
 'httP://wgw.google.com/seasch?Q=fuxzanmgY',
 'http://www.google.cxcom/search?q=fuzzing',
 'hFttp://ww.-g\x7fog+le.com/s%arch?q=f-uzz#ing',
 'http://www\x0egoogle.com/seaNrch?q=fuZzing',
 'http//www.Ygooge.comsarch?q=fuz~Ijg',
 'http8//ww.goog5le.com/sezarc?q=fuzzing']

The MutationCoverageFuzzer maintains a population of inputs, which are then evolved in order to maximize coverage.

In [57]:

mutation_fuzzer = MutationCoverageFuzzer(seed=[seed_input])
mutation_fuzzer.runs(http_runner, trials=10000)
mutation_fuzzer.population[:5]

Out[57]:

['http://www.google.com/search?q=fuzzing',
 'http://wwv.oogle>co/search7Eq=fuzing',
 'http://wwv\x0eOogleb>co/seakh7Eq\x1d;fuzing',
 'http://wwv\x0eoglebkooqeakh7Eq\x1d;fuzing',
 'http://wwv\x0eoglekol=oekh7Eq\x1d\x1bf~ing']

In [58]:

# ignore
from ClassDiagram import display_class_hierarchy

In [59]:

# ignore
display_class_hierarchy(MutationCoverageFuzzer,
                        public_methods=[
                            Fuzzer.run,
                            Fuzzer.__init__,
                            Fuzzer.runs,
                            Fuzzer.fuzz,
                            MutationFuzzer.__init__,
                            MutationFuzzer.fuzz,
                            MutationCoverageFuzzer.run,
                        ],
                        types={'Location': Location},
                        project='fuzzingbook')

Out[59]:

Lessons Learned

Randomly generated inputs are frequently invalid – and thus exercise mostly input processing functionality.
Mutations from existing valid inputs have much higher chances to be valid, and thus to exercise functionality beyond input processing.

Next Steps

In the next chapter on greybox fuzzing, we further extend the concept of mutation-based testing with power schedules that allow spending more energy on seeds that exercise "unlikely" paths and seeds that are "closer" to a target location.

Exercises

Exercise 1: Fuzzing CGI decode with Mutations

Apply the above guided mutation-based fuzzing technique on cgi_decode() from the "Coverage" chapter. How many trials do you need until you cover all variations of +, % (valid and invalid), and regular characters?

In [60]:

from Coverage import cgi_decode

In [61]:

seed = ["Hello World"]
cgi_runner = FunctionCoverageRunner(cgi_decode)
m = MutationCoverageFuzzer(seed)
results = m.runs(cgi_runner, 10000)

In [62]:

m.population

Out[62]:

['Hello World', 'he_<+llo(or<D', 'L}eml &Wol%dD', 'L)q<}aml &cWol%d3D+']

In [63]:

cgi_runner.coverage()

Out[63]:

{('cgi_decode', 16),
 ('cgi_decode', 17),
 ('cgi_decode', 18),
 ('cgi_decode', 19),
 ('cgi_decode', 20),
 ('cgi_decode', 23),
 ('cgi_decode', 24),
 ('cgi_decode', 25),
 ('cgi_decode', 26),
 ('cgi_decode', 27),
 ('cgi_decode', 29),
 ('cgi_decode', 30),
 ('cgi_decode', 31),
 ('cgi_decode', 32),
 ('cgi_decode', 33),
 ('cgi_decode', 34),
 ('cgi_decode', 38),
 ('cgi_decode', 39),
 ('cgi_decode', 40),
 ('run_function', 7)}

In [64]:

all_coverage, cumulative_coverage = population_coverage(
    m.population, cgi_decode)

import matplotlib.pyplot as plt
plt.plot(cumulative_coverage)
plt.title('Coverage of cgi_decode() with random inputs')
plt.xlabel('# of inputs')
plt.ylabel('lines covered');

After 10,000 runs, we have managed to synthesize a + character and a valid %xx form. We can still do better.

Exercise 2: Fuzzing bc with Mutations

Apply the above mutation-based fuzzing technique on bc, as in the chapter "Introduction to Fuzzing".

Part 1: Non-Guided Mutations

Start with non-guided mutations. How many of the inputs are valid?

Solution. This is just a matter of tying a ProgramRunner to a MutationFuzzer:

In [65]:

from Fuzzer import ProgramRunner

In [66]:

seed = ["1 + 1"]
bc = ProgramRunner(program="bc")
m = MutationFuzzer(seed)
outcomes = m.runs(bc, trials=100)

In [67]:

outcomes[:3]

Out[67]:

[(CompletedProcess(args='bc', returncode=0, stdout='2\n', stderr=''), 'PASS'),
 (CompletedProcess(args='bc', returncode=0, stdout='5\n', stderr=''), 'PASS'),
 (CompletedProcess(args='bc', returncode=0, stdout='1000\n', stderr=''),
  'PASS')]

In [68]:

sum(1 for completed_process, outcome in outcomes if completed_process.stderr == "")

Out[68]:

Part 2: Guided Mutations

Continue with guided mutations. To this end, you will have to find a way to extract coverage from a C program such as bc. Proceed in these steps:

First, get GNU bc; download, say, bc-1.07.1.tar.gz and unpack it:

In [69]:

!curl -O mirrors.kernel.org/gnu/bc/bc-1.07.1.tar.gz

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  410k  100  410k    0     0   347k      0  0:00:01  0:00:01 --:--:--  348k

In [70]:

!tar xfz bc-1.07.1.tar.gz

Second, configure the package:

In [71]:

!cd bc-1.07.1; ./configure

checking for a BSD-compatible install... /opt/homebrew/bin/ginstall -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /opt/homebrew/bin/gmkdir -p
checking for gawk... no
checking for mawk... no
checking for nawk... no
checking for awk... awk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables... 
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking whether gcc understands -c and -o together... yes
checking for style of include used by make... GNU
checking dependency style of gcc... gcc3
checking how to run the C preprocessor... gcc -E
checking for grep that handles long lines and -e... /usr/bin/grep
checking for egrep... /usr/bin/grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking minix/config.h usability... no
checking minix/config.h presence... no
checking for minix/config.h... no
checking whether it is safe to define __EXTENSIONS__... yes
checking for flex... flex
checking lex output file root... lex.yy
checking lex library... -ll
checking whether yytext is a pointer... yes
checking for ar... ar
checking the archiver (ar) interface... ar
checking for bison... bison -y
checking for ranlib... ranlib
checking whether make sets $(MAKE)... (cached) yes
checking for stdarg.h... yes
checking for stddef.h... yes
checking for stdlib.h... (cached) yes
checking for string.h... (cached) yes
checking for errno.h... yes
checking for limits.h... yes
checking for unistd.h... (cached) yes
checking for lib.h... no
checking for an ANSI C-conforming const... yes
checking for size_t... yes
checking for ptrdiff_t... yes
checking for vprintf... yes
checking for _doprnt... no
checking for isgraph... yes
checking for setvbuf... yes
checking for fstat... yes
checking for strtol... yes
Adding GCC specific compile flags.
checking that generated files are newer than configure... done
configure: creating ./config.status
config.status: creating Makefile
config.status: creating bc/Makefile
config.status: creating dc/Makefile
config.status: creating lib/Makefile
config.status: creating doc/Makefile
config.status: creating doc/texi-ver.incl
config.status: creating config.h
config.status: executing depfiles commands

Third, compile the package with coverage instrumentation enabled (the --coverage flag):

In [72]:

!cd bc-1.07.1; make CFLAGS="--coverage"

/Applications/Xcode.app/Contents/Developer/usr/bin/make  all-recursive
Making all in lib
gcc -DHAVE_CONFIG_H  -I. -I..  -I. -I.. -I./../h  -g -O2 -Wall -funsigned-char --coverage -MT getopt.o -MD -MP -MF .deps/getopt.Tpo -c -o getopt.o getopt.c
getopt.c:348:28: warning: passing arguments to 'getenv'
      without a prototype is deprecated in all versions of C and is not
      supported in C2x [-Wdeprecated-non-prototype]
  posixly_correct = getenv ("POSIXLY_CORRECT");
                           ^
In file included from getopt.c:106:
./../h/getopt.h:144:12: warning: a function declaration
      without a prototype is deprecated in all versions of C and is treated as a
      zero-parameter prototype in C2x, conflicting with a subsequent definition
      [-Wdeprecated-non-prototype]
extern int getopt ();
           ^
getopt.c:1135:1: note: conflicting prototype is here
getopt (int argc, char *const *argv, const char *optstring)
^
2 warnings generated.
mv -f .deps/getopt.Tpo .deps/getopt.Po
gcc -DHAVE_CONFIG_H  -I. -I..  -I. -I.. -I./../h  -g -O2 -Wall -funsigned-char --coverage -MT getopt1.o -MD -MP -MF .deps/getopt1.Tpo -c -o getopt1.o getopt1.c
mv -f .deps/getopt1.Tpo .deps/getopt1.Po
gcc -DHAVE_CONFIG_H  -I. -I..  -I. -I.. -I./../h  -g -O2 -Wall -funsigned-char --coverage -MT vfprintf.o -MD -MP -MF .deps/vfprintf.Tpo -c -o vfprintf.o vfprintf.c
mv -f .deps/vfprintf.Tpo .deps/vfprintf.Po
gcc -DHAVE_CONFIG_H  -I. -I..  -I. -I.. -I./../h  -g -O2 -Wall -funsigned-char --coverage -MT number.o -MD -MP -MF .deps/number.Tpo -c -o number.o number.c
mv -f .deps/number.Tpo .deps/number.Po
rm -f libbc.a
ar cru libbc.a getopt.o getopt1.o vfprintf.o number.o 
ranlib libbc.a
Making all in bc
gcc -DHAVE_CONFIG_H -I. -I..  -I. -I./../h  -g -O2 -Wall -funsigned-char --coverage -MT main.o -MD -MP -MF .deps/main.Tpo -c -o main.o main.c
In file included from main.c:34:
./../h/getopt.h:144:12: warning: a function declaration
      without a prototype is deprecated in all versions of C and is treated as a
      zero-parameter prototype in C2x, conflicting with a previous declaration
      [-Wdeprecated-non-prototype]
extern int getopt ();
           ^
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/unistd.h:509:6: note: 
      conflicting prototype is here
int      getopt(int, char * const [], const char *) __DARWIN_ALIAS(getopt);
         ^
1 warning generated.
mv -f .deps/main.Tpo .deps/main.Po
gcc -DHAVE_CONFIG_H -I. -I..  -I. -I./../h  -g -O2 -Wall -funsigned-char --coverage -MT bc.o -MD -MP -MF .deps/bc.Tpo -c -o bc.o bc.c
mv -f .deps/bc.Tpo .deps/bc.Po
gcc -DHAVE_CONFIG_H -I. -I..  -I. -I./../h  -g -O2 -Wall -funsigned-char --coverage -MT scan.o -MD -MP -MF .deps/scan.Tpo -c -o scan.o scan.c
mv -f .deps/scan.Tpo .deps/scan.Po
gcc -DHAVE_CONFIG_H -I. -I..  -I. -I./../h  -g -O2 -Wall -funsigned-char --coverage -MT execute.o -MD -MP -MF .deps/execute.Tpo -c -o execute.o execute.c
mv -f .deps/execute.Tpo .deps/execute.Po
gcc -DHAVE_CONFIG_H -I. -I..  -I. -I./../h  -g -O2 -Wall -funsigned-char --coverage -MT load.o -MD -MP -MF .deps/load.Tpo -c -o load.o load.c
mv -f .deps/load.Tpo .deps/load.Po
gcc -DHAVE_CONFIG_H -I. -I..  -I. -I./../h  -g -O2 -Wall -funsigned-char --coverage -MT storage.o -MD -MP -MF .deps/storage.Tpo -c -o storage.o storage.c
mv -f .deps/storage.Tpo .deps/storage.Po
gcc -DHAVE_CONFIG_H -I. -I..  -I. -I./../h  -g -O2 -Wall -funsigned-char --coverage -MT util.o -MD -MP -MF .deps/util.Tpo -c -o util.o util.c
mv -f .deps/util.Tpo .deps/util.Po
gcc -DHAVE_CONFIG_H -I. -I..  -I. -I./../h  -g -O2 -Wall -funsigned-char --coverage -MT warranty.o -MD -MP -MF .deps/warranty.Tpo -c -o warranty.o warranty.c
warranty.c:56:1: warning: a function definition without a
      prototype is deprecated in all versions of C and is not supported in C2x
      [-Wdeprecated-non-prototype]
warranty(prefix)
^
1 warning generated.
mv -f .deps/warranty.Tpo .deps/warranty.Po
echo '{0}' > libmath.h
/Applications/Xcode.app/Contents/Developer/usr/bin/make global.o
gcc -DHAVE_CONFIG_H -I. -I..  -I. -I./../h  -g -O2 -Wall -funsigned-char --coverage -MT global.o -MD -MP -MF .deps/global.Tpo -c -o global.o global.c
mv -f .deps/global.Tpo .deps/global.Po
gcc -g -O2 -Wall -funsigned-char --coverage   -o libmath.h -o fbc main.o bc.o scan.o execute.o load.o storage.o util.o warranty.o global.o ../lib/libbc.a -ll  
./fbc -c ./libmath.b </dev/null >libmath.h
./fix-libmath_h
2655
2793
rm -f ./fbc ./global.o
gcc -DHAVE_CONFIG_H -I. -I..  -I. -I./../h  -g -O2 -Wall -funsigned-char --coverage -MT global.o -MD -MP -MF .deps/global.Tpo -c -o global.o global.c
mv -f .deps/global.Tpo .deps/global.Po
gcc -g -O2 -Wall -funsigned-char --coverage   -o bc main.o bc.o scan.o execute.o load.o storage.o util.o global.o warranty.o ../lib/libbc.a -ll  
Making all in dc
gcc -DHAVE_CONFIG_H -I. -I..  -I./.. -I./../h  -g -O2 -Wall -funsigned-char --coverage -MT dc.o -MD -MP -MF .deps/dc.Tpo -c -o dc.o dc.c
mv -f .deps/dc.Tpo .deps/dc.Po
gcc -DHAVE_CONFIG_H -I. -I..  -I./.. -I./../h  -g -O2 -Wall -funsigned-char --coverage -MT misc.o -MD -MP -MF .deps/misc.Tpo -c -o misc.o misc.c
mv -f .deps/misc.Tpo .deps/misc.Po
gcc -DHAVE_CONFIG_H -I. -I..  -I./.. -I./../h  -g -O2 -Wall -funsigned-char --coverage -MT eval.o -MD -MP -MF .deps/eval.Tpo -c -o eval.o eval.c
mv -f .deps/eval.Tpo .deps/eval.Po
gcc -DHAVE_CONFIG_H -I. -I..  -I./.. -I./../h  -g -O2 -Wall -funsigned-char --coverage -MT stack.o -MD -MP -MF .deps/stack.Tpo -c -o stack.o stack.c
mv -f .deps/stack.Tpo .deps/stack.Po
gcc -DHAVE_CONFIG_H -I. -I..  -I./.. -I./../h  -g -O2 -Wall -funsigned-char --coverage -MT array.o -MD -MP -MF .deps/array.Tpo -c -o array.o array.c
mv -f .deps/array.Tpo .deps/array.Po
gcc -DHAVE_CONFIG_H -I. -I..  -I./.. -I./../h  -g -O2 -Wall -funsigned-char --coverage -MT numeric.o -MD -MP -MF .deps/numeric.Tpo -c -o numeric.o numeric.c
numeric.c:576:1: warning: a function definition without a
      prototype is deprecated in all versions of C and is not supported in C2x
      [-Wdeprecated-non-prototype]
out_char (ch)
^
1 warning generated.
mv -f .deps/numeric.Tpo .deps/numeric.Po
gcc -DHAVE_CONFIG_H -I. -I..  -I./.. -I./../h  -g -O2 -Wall -funsigned-char --coverage -MT string.o -MD -MP -MF .deps/string.Tpo -c -o string.o string.c
mv -f .deps/string.Tpo .deps/string.Po
gcc -g -O2 -Wall -funsigned-char --coverage   -o dc dc.o misc.o eval.o stack.o array.o numeric.o string.o ../lib/libbc.a 
Making all in doc
restore=: && backupdir=".am$$" && \
	am__cwd=`pwd` && CDPATH="${ZSH_VERSION+.}:" && cd . && \
	rm -rf $backupdir && mkdir $backupdir && \
	if (makeinfo --no-split --version) >/dev/null 2>&1; then \
	  for f in bc.info bc.info-[0-9] bc.info-[0-9][0-9] bc.i[0-9] bc.i[0-9][0-9]; do \
	    if test -f $f; then mv $f $backupdir; restore=mv; else :; fi; \
	  done; \
	else :; fi && \
	cd "$am__cwd"; \
	if makeinfo --no-split   -I . \
	 -o bc.info bc.texi; \
	then \
	  rc=0; \
	  CDPATH="${ZSH_VERSION+.}:" && cd .; \
	else \
	  rc=$?; \
	  CDPATH="${ZSH_VERSION+.}:" && cd . && \
	  $restore $backupdir/* `echo "./bc.info" | sed 's|[^/]*$||'`; \
	fi; \
	rm -rf $backupdir; exit $rc
/bin/sh: makeinfo: command not found
make[3]: *** [bc.info] Error 127
make[2]: *** [all-recursive] Error 1
make[1]: *** [all] Error 2

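The build stops in the doc directory because makeinfo is not installed, but by then the bc and dc binaries have already been built. The file bc/bc should now be executable...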

In [73]:

!cd bc-1.07.1/bc; echo 2 + 2 | ./bc

...and you should be able to run the gcov program to retrieve coverage information.

In [74]:

!cd bc-1.07.1/bc; gcov main.c

File 'main.c'
Lines executed:52.55% of 137
Creating 'main.c.gcov'

As sketched in the "Coverage" chapter, the file bc-1.07.1/bc/main.c.gcov now holds the coverage information for main.c. Each line is prefixed with the number of times it was executed. ##### means zero times; - means non-executable line.

Parse the gcov output files for bc and create a coverage set, as in FunctionCoverageRunner. Wrap this in a ProgramCoverageRunner class that is constructed with a list of source files (bc.c, main.c, load.c) to run gcov on.
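As a starting point, the parsing step could look like the following minimal sketch. It assumes the plain gcov text format described above (execution count, line number, and source text separated by colons); the function name parse_gcov and the sample text are made up for illustration and are not part of the chapter's code.

```python
def parse_gcov(gcov_text, source_name):
    """Return the set of (source_name, line_number) pairs executed at least once.

    Assumes each relevant line has the form 'count:lineno:source text'.
    """
    coverage = set()
    for line in gcov_text.splitlines():
        fields = line.split(":", 2)
        if len(fields) < 3:
            continue  # not a 'count:lineno:source' line
        count = fields[0].strip()
        lineno = fields[1].strip()
        if count in ("-", "#####", "====="):
            continue  # non-executable line, or executable but never run
        count = count.rstrip("*")  # some gcov versions append '*' to counts
        if count.isdigit() and int(count) > 0 and lineno.isdigit():
            coverage.add((source_name, int(lineno)))
    return coverage


# Hand-written sample in gcov style (illustrative, not real bc output)
sample = """\
        -:    0:Source:main.c
        -:    1:/* main.c */
        2:   10:  int x = 0;
    #####:   11:  x = 1;
        1:   12:  return x;
"""

print(sorted(parse_gcov(sample, "main.c")))
```

A ProgramCoverageRunner could run gcov on each of its source files after every execution, read the resulting .gcov files, and take the union of the sets returned by such a function.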

When you're done, don't forget to clean up:

In [75]:

!rm -fr bc-1.07.1 bc-1.07.1.tar.gz

Exercise 3¶

In this blog post, the author of American Fuzzy Lop (AFL), a very popular mutation-based fuzzer, discusses the efficiency of various mutation operators. Implement four of them and evaluate their efficiency as in the examples above.
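To give an idea of what such operators can look like, here is a hedged sketch of two deterministic byte-level mutations in the spirit of AFL's bit flips and simple arithmetic. The function names and the choice to operate on bytes are assumptions for illustration; they are not taken from AFL or from this chapter's code.

```python
def flip_bit(data: bytes, pos: int, bit: int) -> bytes:
    """Return a copy of data with one bit (0..7) of the byte at pos flipped."""
    out = bytearray(data)
    out[pos] ^= 1 << bit
    return bytes(out)


def add_to_byte(data: bytes, pos: int, delta: int) -> bytes:
    """Return a copy of data with delta added to the byte at pos (wrapping at 256)."""
    out = bytearray(data)
    out[pos] = (out[pos] + delta) % 256
    return bytes(out)


def walking_bit_flips(data: bytes):
    """Yield every input that differs from data in exactly one bit."""
    for pos in range(len(data)):
        for bit in range(8):
            yield flip_bit(data, pos, bit)


seed = b"http://www.google.com/search?q=fuzzing"
mutants = list(walking_bit_flips(seed))
print(len(mutants))   # one mutant per bit of the seed
print(mutants[0])     # the first byte has its lowest bit flipped
```

To evaluate such operators against the string-based fuzzers in this chapter, the resulting bytes would have to be decoded back into strings, for instance with data.decode("latin-1").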

Exercise 4¶

When adding a new element to the list of candidates, AFL does not actually compare the coverage as a whole; instead, it adds an element if it exercises a new branch. Using branch coverage from the exercises of the "Coverage" chapter, implement this "branch" strategy and compare it against the "coverage" strategy above.
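The core of the "branch" strategy can be sketched as follows. This is a simplified illustration, not AFL's actual bookkeeping: it assumes that a run yields a trace, that is, a list of executed locations in order, and it treats every pair of consecutively executed locations as a branch. The names branches and BranchPopulation are hypothetical.

```python
def branches(trace):
    """Return the set of (from, to) pairs of consecutively executed locations."""
    return set(zip(trace, trace[1:]))


class BranchPopulation:
    """Keep an input only if its run exercises a branch not seen before."""

    def __init__(self):
        self.population = []
        self.branches_seen = set()

    def consider(self, inp, trace):
        """Add inp to the population if trace contains a new branch."""
        new_branches = branches(trace) - self.branches_seen
        if new_branches:
            self.population.append(inp)
            self.branches_seen |= new_branches
            return True
        return False


pop = BranchPopulation()
print(pop.consider("a", [1, 2, 3]))  # first trace: all branches are new
print(pop.consider("b", [1, 2, 3]))  # same branches again: rejected
print(pop.consider("c", [1, 3, 2]))  # same lines, different order: new branches
print(pop.population)
```

The third input visits the same lines as the first, so a strategy that only looks at the set of covered lines would discard it, whereas the branch strategy keeps it.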

Exercise 5¶

Design and implement a system that will gather a population of URLs from the Web. Can you achieve a higher coverage with these samples? What if you use them as the initial population for further mutation?
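One building block for such a system is extracting URLs from a page that has already been downloaded. The following sketch uses only the Python standard library; the names LinkCollector and collect_urls, as well as the sample page, are assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect absolute http(s) URLs from the href attributes of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                url = urljoin(self.base_url, value)  # resolve relative links
                if urlparse(url).scheme in ("http", "https"):
                    self.urls.append(url)


def collect_urls(html, base_url):
    """Return all http(s) links found in html, resolved against base_url."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.urls


# Hand-written sample page (illustrative)
page = ('<a href="/search?q=fuzzing">search</a> '
        '<a href="mailto:a@b.c">mail</a> '
        '<a href="https://example.org/">other</a>')

print(collect_urls(page, "http://www.example.com/index.html"))
```

The pages themselves could be fetched with urllib.request.urlopen(); the collected URLs can then be passed as the seed list of a MutationFuzzer or MutationCoverageFuzzer.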