#!/usr/bin/env python
# coding: utf-8

# # Tours through the Book
#
# This book is _massive_. With more than 20,000 lines of code and 150,000 words of text, a printed version would cover more than 1,200 pages. Obviously, we do not assume that everybody wants to read everything.

# While the chapters of this book can be read one after the other, there are many possible paths through the book. In this graph, an arrow $A \rightarrow B$ means that chapter $A$ is a prerequisite for chapter $B$. You can pick arbitrary paths in this graph to get to the topics that interest you most:

# In[1]:


# ignore
from bookutils import rich_output


# In[2]:


# ignore
sitemap = None
if rich_output():
    from IPython.display import SVG
    sitemap = SVG(filename='PICS/Sitemap.svg')
sitemap


# But since even this map can be overwhelming, here are a few _tours_ to get you started. Each of these tours allows you to focus on a particular view, depending on whether you are a programmer, student, or researcher.

# ## The Pragmatic Programmer Tour
#
# You have a program to test. You want to generate tests as quickly as possible and as thoroughly as possible. You don't care so much _how_ something is implemented, but it should get the job done. You want to learn how to _use_ things.
#
# 1. __Start with [Introduction to Testing](Intro_Testing.ipynb) to get the basic concepts.__ (You probably know most of these already, but it can't hurt to get quick reminders.)
#
# 2. __Use the simple fuzzers from [the chapter on Fuzzers](Fuzzer.ipynb)__ to test your program against the first random inputs.
#
# 3. __Get [coverage](Coverage.ipynb) from your program__ and use coverage information to [guide test generation towards code coverage](GreyboxFuzzer.ipynb).
#
# 4. __Define an [input grammar](Grammars.ipynb) for your program__ and use this grammar to thoroughly fuzz your program with syntactically correct inputs. As a fuzzer, we recommend a [grammar coverage fuzzer](GrammarCoverageFuzzer.ipynb), as this ensures coverage of input elements.
#
# 5. If you want __more control over the generated inputs,__ consider [probabilistic fuzzing](ProbabilisticGrammarFuzzer.ipynb) and [fuzzing with generator functions](GeneratorGrammarFuzzer.ipynb).
#
# 6. If you want to __deploy a large set of fuzzers__, learn how to [manage a large set of fuzzers](FuzzingInTheLarge.ipynb).
#
# In each of these chapters, start with the "Synopsis" parts; these will give you quick introductions on how to use things, as well as point you to relevant usage examples. With this, enough said. Get back to work and enjoy!

# ## The Page-by-Page Tours
#
# These tours are how the book is organized. Having gone through the [Introduction to Testing](Intro_Testing.ipynb) for the basic concepts, you can read your way through these parts:
#
# 1. __The [lexical tour](02_Lexical_Fuzzing.ipynb)__ focuses on _lexical_ test generation techniques, i.e., techniques that compose an input character by character and byte by byte. Very fast and robust techniques with a minimum of bias.
#
# 1. __The [syntactical tour](03_Syntactical_Fuzzing.ipynb)__ focuses on _grammars_ as a means to specify the syntax of inputs. The resulting test generators produce syntactically correct inputs, making tests much faster, and provide lots of control mechanisms for the tester.
#
# 1. __The [semantic tour](04_Semantical_Fuzzing.ipynb)__ makes use of _code semantics_ to shape and guide test generation. Advanced techniques include extracting input grammars, mining function specifications, and symbolic constraint solving to cover as many code paths as possible.
#
# 1. __The [application tour](05_Domain-Specific_Fuzzing.ipynb)__ applies the techniques defined in the earlier parts to domains such as Web servers, user interfaces, APIs, or configurations.
#
# 1. __The [management tour](06_Managing_Fuzzing.ipynb)__ finally focuses on how to handle and organize large sets of test generators, and when to stop fuzzing.

# Most of these chapters start with a "Synopsis" section that explains how to use the most important concepts. You can choose whether you want a "usage" perspective (then just read the synopsis) or an "understanding" perspective (then read on).

# ## The Undergraduate Tour
#
# You are a student of computer science and/or software engineering. You want to know the basics of testing and related fields. Beyond just _using_ techniques, you want to dig deeper into algorithms and implementations. We have the following recommendation for you:
#
# 1. Start with [Introduction to Testing](Intro_Testing.ipynb) and [Coverage](Coverage.ipynb) to get the __basic concepts.__ (You may know some of these already, but hey, you're a student, right?)
#
# 2. __Learn how simple fuzzers work__ from [the chapter on Fuzzers](Fuzzer.ipynb). This already gives you tools that took down 30% of UNIX utilities in the 90s. What happens if you test some tool that has never been fuzzed before?
#
# 3. __[Mutation-based fuzzing](MutationFuzzer.ipynb)__ is pretty much the standard in fuzzing today: Take a set of seeds, and mutate them until we find a bug.
#
# 4. __Learn how [grammars](Grammars.ipynb) can be used to generate syntactically correct inputs.__ This makes test generation much more efficient, but you have to write (or [mine](GrammarMiner.ipynb)) a grammar in the first place.
#
# 5. __Learn how to [fuzz APIs](APIFuzzer.ipynb) and [graphical user interfaces](GUIFuzzer.ipynb)__. Both of these are important domains for software test generation.
#
# 6. __Learn how to [reduce failure-inducing inputs](Reducer.ipynb) to a minimum automatically__. This is a great time saver for debugging, especially in conjunction with automated testing.
#
# For all these chapters, experiment with the implementations to understand their concepts. Feel free to experiment as you wish.

# If you are a teacher, the above chapters can be useful in programming and/or software engineering courses. Make use of slides and/or live programming, and have students work on exercises.

# ## The Graduate Tour
#
# On top of the "Undergraduate" tour, you want to get deeper into test generation techniques, including techniques that are more demanding.
#
# 1. __[Search-based testing](SearchBasedFuzzer.ipynb)__ allows you to guide test generation towards specific goals, such as code coverage. Robust and efficient.
#
# 1. Get an introduction to __[configuration testing](ConfigurationFuzzer.ipynb)__. How does one test and cover a system that comes with multiple configuration options?
#
# 1. __[Mutation analysis](MutationAnalysis.ipynb)__ seeds synthetic defects (mutations) into program code to check whether the tests find them. If the tests do not find mutations, they likely won't find real bugs either.
#
# 1. __Learn how to [parse](Parser.ipynb) inputs__ using grammars. If you want to analyze, decompose, or mutate existing inputs, you need a parser for that.
#
# 1. __[Concolic](ConcolicFuzzer.ipynb) and [symbolic](SymbolicFuzzer.ipynb) fuzzing__ solve constraints along program paths to reach code that is hard to test. Used wherever reliability is paramount; also a hot research topic.
#
# 1. __Learn how to [estimate when to stop fuzzing](WhenToStopFuzzing.ipynb)__. There has to be a stop at some point, right?
#
# For all these chapters, experiment with the code; feel free to create your own variations and extensions. This is how we get to research!

# If you are a teacher, the above chapters can be useful in advanced courses on software engineering and testing. Again, you can make use of slides and/or live programming, and have students work on exercises.

# ## The Black-Box Tour
#
# This tour focuses on _black-box fuzzing_ – that is, techniques that work without feedback from the program under test. Have a look at
#
# 1. __[Basic fuzzing](Fuzzer.ipynb)__. This already gives you tools that took down 30% of UNIX utilities in the 90s. What happens if you test some tool that has never been fuzzed before?
#
# 2. __[Syntactical fuzzing](03_Syntactical_Fuzzing.ipynb)__ focuses on _grammars_ as a means to specify the syntax of inputs. The resulting test generators produce syntactically correct inputs, making tests much faster, and provide lots of control mechanisms for the tester.
#
# 3. __[Semantic fuzzing](FuzzingWithConstraints.ipynb)__ attaches _constraints_ to grammars, making inputs not only syntactically valid, but also _semantically_ valid, and empowering you to shape test inputs just like you want them.
#
# 4. __[Domain-specific fuzzing](05_Domain-Specific_Fuzzing.ipynb)__ shows a number of applications of these techniques, from configurations to graphical user interfaces.
#
# 5. If you want to __deploy a large set of fuzzers__, learn how to [manage a large set of fuzzers](FuzzingInTheLarge.ipynb).

# ## The White-Box Tour
#
# This tour focuses on _white-box fuzzing_ – that is, techniques that leverage feedback from the program under test. Have a look at
#
# 1. __[Coverage](Coverage.ipynb)__ to get the basic concepts of coverage and how to measure it for Python.
#
# 2. __[Mutation-based fuzzing](MutationFuzzer.ipynb)__ is pretty much the standard in fuzzing today: Take a set of seeds, and mutate them until we find a bug.
#
# 3. __[Greybox fuzzing](GreyboxFuzzer.ipynb)__ with algorithms from the popular American Fuzzy Lop (AFL) fuzzer.
#
# 4. __[Information Flow](InformationFlow.ipynb)__ and __[Concolic Fuzzing](ConcolicFuzzer.ipynb)__ show how to capture information flow in Python programs and how to leverage it to produce more intelligent test cases.
#
# 5. __[Symbolic Fuzzing](SymbolicFuzzer.ipynb)__, reasoning about the behavior of a program without executing it.
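
# The mutation-based step in the tour above (take a set of seeds, and mutate them until we find a bug) can be sketched in a few lines. This is a minimal, self-contained illustration and not the implementation from the [Mutation-based fuzzing](MutationFuzzer.ipynb) chapter: `mutate()`, `program_under_test()`, and `mutation_fuzz()` are hypothetical names, the bug is invented for the example, and the sketch ignores the coverage feedback that greybox fuzzers add on top.

```python
import random


def mutate(s: str, rng: random.Random) -> str:
    """Return s with one random character deleted, inserted, or bit-flipped."""
    if not s:
        # Nothing to mutate: start with a single printable character
        return chr(rng.randrange(32, 127))
    pos = rng.randrange(len(s))
    op = rng.randrange(3)
    if op == 0:  # delete the character at pos
        return s[:pos] + s[pos + 1:]
    if op == 1:  # insert a random printable character before pos
        return s[:pos] + chr(rng.randrange(32, 127)) + s[pos:]
    # Otherwise, flip one of the lower 7 bits of the character at pos
    flipped = chr(ord(s[pos]) ^ (1 << rng.randrange(7)))
    return s[:pos] + flipped + s[pos + 1:]


def program_under_test(s: str) -> None:
    """Hypothetical program with an invented bug: fails on '<' without any '>'."""
    if "<" in s and ">" not in s:
        raise ValueError("unbalanced angle bracket")


def mutation_fuzz(seeds, trials, rng):
    """Mutate randomly chosen seeds; return the first failing input, or None."""
    for _ in range(trials):
        candidate = mutate(rng.choice(seeds), rng)
        try:
            program_under_test(candidate)
        except ValueError:
            return candidate
    return None


if __name__ == "__main__":
    failing = mutation_fuzz(["<a>", "<tag>"], trials=1000, rng=random.Random(0))
    print(repr(failing))
```

# Passing an explicit `random.Random` instance keeps runs reproducible, which helps when a failing input needs to be replayed or reduced later.
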
# ## The Researcher Tour
#
# On top of the "Graduate" tour, you are looking for techniques that are somewhere between lab stage and widespread usage – in particular, techniques where there is still room for lots of improvement. If you are looking for research ideas, go for these topics.
#
# 1. __[Mining function specifications](DynamicInvariants.ipynb)__ is a hot topic in research: Given a function, how can we infer an abstract model that describes its behavior? The conjunction with test generation offers several opportunities here, in particular for dynamic specification mining.
#
# 2. __[Mining input grammars](GrammarMiner.ipynb)__ promises to join the robustness and ease of use of lexical fuzzing with the efficiency and speed of syntactical fuzzing. The idea is to mine an input grammar from a program automatically, which then serves as a base for syntactical fuzzing. Still in an early stage, but lots of potential.
#
# 3. __[Probabilistic grammar fuzzing](ProbabilisticGrammarFuzzer.ipynb)__ gives programmers lots of control over which elements should be generated. Plenty of research possibilities at the intersection of probabilistic fuzzing and mining data from given tests, as sketched in this chapter.
#
# 4. __[Fuzzing with generators](GeneratorGrammarFuzzer.ipynb)__ and __[Fuzzing with constraints](FuzzingWithConstraints.ipynb)__ give programmers the ultimate control over input generation, namely by allowing them to define their own generator functions or to define their own input constraints. The big challenge is: How can one best exploit the power of syntactic descriptions with a minimum of contextual constraints?
#
# 5. __[Carving unit tests](Carver.ipynb)__ brings the promise of speeding up test execution (and generation) dramatically, by extracting unit tests from program executions that replay only individual function calls (possibly with new, generated arguments). In Python, carving is simple to realize; there's plenty of potential to toy with.
#
# 6. __Testing [web servers](WebFuzzer.ipynb) and [GUIs](GUIFuzzer.ipynb)__ is a hot research field, fueled by the need of practitioners to test and secure their interfaces (and the need of other practitioners to break through these interfaces). Again, there's still lots of unexplored potential here.
#
# 7. __[Greybox fuzzing](GreyboxFuzzer.ipynb) and [greybox fuzzing with grammars](GreyboxGrammarFuzzer.ipynb)__ bring in _statistical estimators_ to guide test generation towards inputs and input properties that are most likely to discover new bugs. The intersection of testing, program analysis, and statistics offers lots of possibilities for future research.

# For all these topics, having Python source available that implements and demonstrates the concepts is a major asset. You can easily extend the implementations with your own ideas and run evaluations right in a notebook. Once your approach is stable, consider porting it to a language with a wider range of available subjects (such as C, for example).

# ## The Author Tour
#
# This is the ultimate tour – you have learned everything there is and want to contribute to the book. Then, you should read two more chapters:
#
# 1. The __[guide for authors](Guide_for_Authors.ipynb)__ gives an introduction on how to contribute to this book (coding styles, writing styles, conventions, and more).
#
# 2. The __[template chapter](Template.ipynb)__ serves as a blueprint for your chapter.
#
# If you want to contribute, feel free to contact us – preferably before writing, but after writing is fine as well. We will be happy to incorporate your material.

# ## Lessons Learned
#
# * You can go through the book from beginning to end...
# * ...but it may be preferable to follow a specific tour, based on your needs and resources.
# * Now [go and explore generating software tests](index.ipynb)!