import fuzzingbook_utils
!pip install coverage gcovr
Requirement already satisfied: coverage in /opt/conda/lib/python3.6/site-packages (4.5.2) Requirement already satisfied: gcovr in /opt/conda/lib/python3.6/site-packages (4.1) Requirement already satisfied: jinja2 in /opt/conda/lib/python3.6/site-packages (from gcovr) (2.10) Requirement already satisfied: MarkupSafe>=0.23 in /opt/conda/lib/python3.6/site-packages (from jinja2->gcovr) (1.0)
Testing regular expression (regex) parsers is a challenging task. A simple regex, such as a?(b|c)+d*
, requires dozens of inputs to cover all possibilities.
To test regex, we use the python regex module (provided in the data/regex
directory of this project). We use the source code of the regex
module so that we can compute the coverage of the module.
from data.regex.regex_3 import _regex_core
from data.regex.regex_3 import regex
pattern = 'a?(b|c)+d*$'
print('a matches %s: %s' % (pattern, regex.match(pattern, 'a') != None))
print('b matches %s: %s' % (pattern, regex.match(pattern, 'ab') != None))
a matches a?(b|c)+d*$: False b matches a?(b|c)+d*$: True
An efficient approach to test regex is to use grammars. Regex are instances of regular languages - a subset of context-free grammar (CFG) - and, thus, can be tested with the techniques presented in this course.
Assume the following grammar, which represents the a?(b|c)+d*
regular expression.
example_grammar = {'<start>': ['<A><BC><D>'],
'<A>': ['a', '<empty>'],
'<BC>': ['b<BC2>', 'c<BC2>'],
'<BC2>': ['b<BC2>', 'c<BC2>', '<empty>'],
'<D>': ['d<D>', '<empty>'],
'<empty>': [''],
}
Reusing the techniques from the lecture we expand and visualize it.
from GrammarFuzzer import GrammarFuzzer, display_tree
def as_tree(grammar):
f = GrammarFuzzer(grammar)
derivation_tree = f.init_tree()
for i in range(6):
derivation_tree = f.expand_tree_once(derivation_tree)
display_tree(derivation_tree)
as_tree(example_grammar)
And we can use it to produce valid input values
def test_input(grammar):
f = GrammarFuzzer(grammar)
input_value = f.fuzz()
print('%s matches %s: %s' % (input_value, pattern, regex.match(pattern, input_value) != None))
for i in range(5):
test_input(example_grammar)
ccbbcbbb matches a?(b|c)+d*$: True abbbbd matches a?(b|c)+d*$: True c matches a?(b|c)+d*$: True bbc matches a?(b|c)+d*$: True bbcbbccbcbcc matches a?(b|c)+d*$: True
The goal of this project is to implement an algorithm that constructs a CFG from an arbitrary regular expression. Then, use the techniques learned in the lecture (i.e. efficient grammar-based fuzzing) to automatically expand your grammar and to automatically produce a set of inputs that match the given regular expression, covering as many grammar productions as possible.
In this project, we restrict the regex specification into its simpler constructions and grouped then into 3 categories: basic syntax, complex syntax and bonus as follows:
Category | Components |
---|---|
Basic Syntax | Anchors ^ and $ Groupings () Whitespace \s = [ \t\n\r\f\v] Quantifiers * , + , ? concatenation OR operator | |
Complex syntax | Character classes [0-9] , [a-z] , [A-Z] , [0-9A-Za-z] Special character classes \d = [0-9] , \w = [0-9a-zA-Z_] Bounded quantifier {} Set Negation [^] (requires look-ahead) \D = [^0-9] \W = [^0-9a-zA-Z_] \S = [^\s] |
Bonus | Boundaries \b and \B |
In your implementation, ignore any non-listed regex elements, such as flags, back references, other types of look-ahead, look-behind and greedy/lazy match.
I have also updated the project accordingly, you can git pull
to obtain the
latest version of the project.
We use the regex parser, version 2.4.151, instead of Python's native re
, because re
is just a wrapper for a C library. Confirm regex version using:
print(regex.__version__)
2.4.151
Ensure your implementation works on the following regex examples, and covers the provided minimum required coverage for parsing a represententative input for each regex:
Category | Examples | Min. Cov. LOC _regex_core.py | Min. Cov. LOC *.c |
---|---|---|---|
Basic Syntax | ^ab$ |
241 (8%) | 988 (7%) |
Basic Syntax | b$ |
170 (6%) | 949 (6%) |
Basic Syntax | ab*$ |
207 (7%) | 1142 (8%) |
Basic Syntax | a|b |
308 (11%) | 1024 (7%) |
Basic Syntax | a(b|c) |
321 (11%) | 1198 (8%) |
Basic Syntax | (a|b)\s |
334 (12%) | 1131 (8%) |
Basic Syntax | ^a*(b+|c)d? |
413 (15%) | 1442 (10%) |
Basic Syntax | ^a*(b+|c)d?\se |
455 (16%) | 1632 (11%) |
Basic Syntax | ^a+(b*|c)d?\se |
426 (15%) | 1564 (11%) |
Complex Syntax | ^[0-9]+\s[a-z]*\s?[A-Z]+$ |
303 (11%) | 1228 (9%) |
Complex Syntax | \w$ |
169 (6%) | 878 (6%) |
Complex Syntax | \d$ |
169 (6%) | 875 (6%) |
Complex Syntax | (a|b){2,5}$ |
401 (14%) | 1380 (10%) |
Complex Syntax | [0-9a-zA-Z]$ |
252 (9%) | 902 (6%) |
Complex Syntax | [0-9]$ |
222 (8%) | 852 (6%) |
Complex Syntax | [a-z]$ |
222 (8%) | 852 (6%) |
Complex Syntax | [^a-zA-Z]$ |
213 (7%) | 879 (6%) |
Complex Syntax | [^0-9]$ |
178 (6%) | 825 (6%) |
Complex Syntax | \D$ |
168 (6%) | 877 (6%) |
Complex Syntax | \W$ |
122 (4%) | 852 (6%) |
Complex Syntax | \S$ |
122 (4%) | 852 (6%) |
Complex Syntax | [a-zA-Z]\s\w+\s\d\s$ |
324 (11%) | 1295 (9%) |
Complex Syntax | a(bc){2,5}$ |
297 (10%) | 1472 (10%) |
Bonus | \babc\b$ |
258 (9%) | 865 (6%) |
Bonus | abc\B |
177 (6%) | 987 (7%) |
Bonus | \babc\B |
247 (9%) | 928 (6%) |
Note that the code coverage during the execution of the import statement (import regex
) is unacccounted for, since this module was executed before the collection of coverage data began. We also do not track or evaluate the coverage of the python code (regex.py
), since it is always the same forregex.match
method call.
Coverage data for Python (i.e. _regex_core.py
) were obtained using a python utility for measuring code coverage of Python programs called Coverage.py, version 4.5.2. Confirm your Coverage.py version:
!coverage --version
Coverage.py, version 4.5.2 with C extension Documentation at https://coverage.readthedocs.io
Coverage data for the C extensions of regex (i.e. _regex.c and _regex_unicode.c) were obtained using the python utility for GNU gcov utility Gcovr, version 4.1. You can confirm your Gcovr version using:
!gcovr --version
gcovr 4.1 Copyright 2013-2018 the gcovr authors Copyright 2013 Sandia Corporation Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation, the U.S. Government retains certain rights in this software.
Your generated test inputs will be executed on the fuzzingbook server, based on the amount of executed methods and lines of code (LOC).
Note: if you are developing locally, use the source code of version 2018.11.07 of the regex parser, however test on the fuzzingbook server before submission.
Your implementation will be tested with a secret set of regex, encompassing all previously specified syntax categories.
For grading:
The following functions are available to assist your implementation and tests:
Using the GrammarCoverageFuzzer, produce unique set of inputs from your grammar:
from GrammarCoverageFuzzer import GrammarCoverageFuzzer
from Grammars import is_valid_grammar, def_used_nonterminals
import random
def generate_inputs(grammar):
assert is_valid_grammar(grammar)
start_seed, end_seed = 2000, 2005
inputs = []
for seed in range(start_seed, end_seed):
random.seed(seed)
f = GrammarCoverageFuzzer(grammar)
while (len(f.max_expansion_coverage() - f.expansion_coverage()) > 0):
inputs.append(f.fuzz())
return set(inputs)
Your grammar should produce only valid inputs. To ensure that all inputs your grammar can produce are valid you can use:
def ensure_all_valid(pattern, inputs):
res = [[inp, regex.match(pattern, inp) != None] for inp in inputs]
failures = list(filter(lambda x: not x[1], res))
if len(failures) > 0:
raise ValueError("Invalid inputs: " + str([x[0] for x in failures]))
You can use the run_with_py_coverage
function to obtain the python code covered by each input generated by your grammar.
from coverage import Coverage
import os
def run_with_py_coverage(pattern, inp):
cwd = os.getcwd()
target_dir = os.path.join(cwd,"data", "regex")
os.chdir(target_dir)
regex._cache = {}
cov = Coverage(include=["*.py"])
cov.start()
regex.match(pattern, str(inp))
cov.stop()
cov.save()
results = cov.analysis2("regex_3/_regex_core.py")
covered = set(results[1]) - set(results[3])
os.chdir(cwd)
return covered, len(set(results[1]))
We implement helper function to execute bash commands:
import subprocess
def execute_bash_command(command):
process = subprocess.Popen(command, shell=True,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE)
out, error = process.communicate()
error = error.decode("utf-8").strip()
if error:
raise ValueError("bash command error: '{0}',\n during execution of command: \n '{1}': ".format(error, command ))
res = out.decode("utf-8").strip()
return res
We implement the function replace_all_special_chars()
to escape all special chars in bash command.
def replace_all_special_chars(inp):
special_char_repl = {"\\": "\\\\\\", "\n": "\\\n", "\r": "\\\r", \
"\'": "\\\'", '"':'\\"', '`':'\\`'}
for key in special_char_repl:
inp = inp.replace(key, special_char_repl[key])
return inp
We implement the function run_with_c_coverage()
to obtain coverage for the C classes in the regex module.
import os
def run_with_c_coverage(pattern, inp):
cwd = os.getcwd()
target_dir = os.path.join(cwd , "data", "regex")
if os.getcwd() != target_dir and os.path.isdir(target_dir):
os.chdir(target_dir)
#remove old build
build_dir = os.path.join(os.getcwd(), "build")
if os.path.isdir(build_dir):
rm_command = "rm -r " + build_dir
execute_bash_command(rm_command)
#re-build regex
c_flags = 'CFLAGS="-fprofile-arcs -ftest-coverage -fPIC -g -O0" '
build_instr = 'python setup.py build_ext --inplace > /dev/null'
build_command = c_flags + build_instr
execute_bash_command(build_command)
#run regex match test
run_test_command = """python -c "from regex_3 import _regex_core;\
from regex_3 import regex;pattern = '{0}'; inp = '{1}';\
regex._cache = {{}}; regex.match(pattern, str(inp))" """.\
format(pattern, replace_all_special_chars(inp))
execute_bash_command(run_test_command)
#obtain coverage of c extensions
gcovr_instr = 'gcovr -r . '
grep_TOTAL = 'grep -n "TOTAL" '
c_cov_command = gcovr_instr + "| " + grep_TOTAL
cov_out = execute_bash_command(c_cov_command)
coverage = cov_out.split()[2]
percent_coverage = cov_out.split()[3]
os.chdir(cwd)
return coverage, percent_coverage
For all your inputs, use population_coverage
to obtain the overall cummulative coverage of Python code and the maximum coverage for C Code.
def population_coverage(pattern, population):
py_all_coverage = set()
c_max_coverage = 0
c_max_percent = 0
py_max_percent = ""
for inp in population:
#obtain all covered LOC for Python code
py_covered, py_length = run_with_py_coverage(pattern, inp)
py_all_coverage |= py_covered
#obtain max coverage for C code
c_covered, percent_covered = run_with_c_coverage(pattern, inp)
if c_max_coverage < int(c_covered):
c_max_coverage = int(c_covered)
c_max_percent = percent_covered
py_max_percent = str(int(len(py_all_coverage)/py_length * 100)) + "%"
return len(py_all_coverage), c_max_coverage, py_max_percent, c_max_percent
Write your code here to create a grammar out of an arbitrary regular expression.
import string
from collections import defaultdict
grammar = defaultdict(list)
#write your code below
#regex to context free grammar conversion
def regex_to_CFG(regex_pattern):
return grammar
The following code will be used to run your evaluation.
Note that the set of regular expression which will be used for evaluation may be different from those provided as example.
secret_set_of_regex = [[r'^ab$', 241, 988], [r'b$', 170, 949],[r'ab*$', 207, 1142],\
[r'a|b$',308, 1024], [r'a(b|c)$', 321, 1198], [r'(a|b)\s$',334, 1131],\
[r'^a*(b+|c)d?$', 413, 1442],[r'^a*(b+|c)d?\se$', 455, 1632],\
[r'^a+(b*|c)d?\se$',426, 1564], [r'^[0-9]+\s[a-z]*\s?[A-Z]+$', 303, 1228],\
[r'\w$', 169, 878], [r'\d$', 169, 875], [r'(a|b){2,5}$', 401, 1380],\
[r'[0-9a-zA-Z]$', 252, 902], [r'[0-9]$', 222, 852], [r'[a-z]$', 222, 852],\
[r'[^a-zA-Z]', 213, 879], [r'[^0-9]', 178, 825],\
[r'\D$', 168, 877], [r'\W', 122, 852] , [r'\S', 122, 852],\
[r'[a-zA-Z]\s\w+\s\d\s$', 324, 1295], [r'a(bc){2,5}$', 297, 1472],\
[r'\babc\b$', 258, 865], [r'abc\B', 177, 987], [r'\babc\B', 247, 928]]
to_process = []
for pattern, py_min_coverage, c_min_coverage in secret_set_of_regex:
grammar = regex_to_CFG(pattern)
to_process.append([pattern, py_min_coverage, c_min_coverage, grammar])
Each regex will be evaluated individually, against a minimum precomputed coverage. Plot coverage result and indicate errors.
def evaluate(grammar, pattern, py_min_coverage, c_min_coverage):
inputs = generate_inputs(grammar)
ensure_all_valid(pattern, inputs)
py_all_coverage, c_max_coverage, py_percent, c_percent = population_coverage(pattern, inputs)
return py_all_coverage >= py_min_coverage and c_max_coverage >= c_min_coverage, py_all_coverage, c_max_coverage, py_percent, c_percent
Plot coverage result and indicate errors.
import numpy as np
import matplotlib.pyplot as plt
def plot_bar_chart(py_cov_tuple, c_cov_tuple, py_err, c_err, num_grammars):
fig, ax = plt.subplots(figsize=(20,10))
index = np.arange(num_grammars)
bar_width = 0.4
opacity = 0.4
error_config = {'ecolor': '0.3'}
rects1 = ax.bar(index, py_cov_tuple, bar_width,
alpha=opacity, color='b', yerr=py_err,
error_kw=error_config,
label='Python Code')
rects2 = ax.bar(index + bar_width, c_cov_tuple, bar_width,
alpha=opacity, color='r', yerr=c_err,
error_kw=error_config,
label='C Code')
ax.set_xlabel('Regex')
ax.set_ylabel('Coverage (LOC)')
ax.set_title('Coverage of Regex by Language (Python & C)')
ax.set_xticks(index + bar_width / 2)
ax.set_xticklabels(('1', '2', '3', '4', '5', '6', '7', '8', '9', '10',\
'11', '12', '13', '14', '15', '16', '17', '18', '19', '20',\
'21', '22', '23', '24', '25', '26'))
ax.legend()
fig.tight_layout()
plt.show()
Final result will be computed accordingly:
def compute_results():
overall_result, py_cov_list, c_cov_list, py_err, c_err, evaluation_list, failed_regexes = [], [], [], [], [], [], []
dash = '-' * 98
print(dash)
print('{:<28s}{:>12s}{:>12s}{:>12s}{:>12s}{:>18s}'.format("Pattern", "Py_cov (LOC)", "Py_cov (%)", "C_cov (LOC)", "C_cov (%)", "Pass(1)/Fail(0)"))
print(dash)
for pattern, py_min_coverage, c_min_coverage, grammar in to_process:
evaluation_result, py_all_coverage, c_max_coverage, py_percent, c_percent = evaluate(grammar, pattern, py_min_coverage, c_min_coverage)
py_cov_list.append(py_all_coverage)
c_cov_list.append(c_max_coverage)
evaluation_list.append(evaluation_result)
if py_all_coverage < py_min_coverage:
py_err.append((py_all_coverage - py_min_coverage ))
failed_regexes.append([pattern, "Py"])
else:
py_err.append(0)
if c_max_coverage < c_min_coverage:
c_err.append((c_max_coverage - c_min_coverage ))
failed_regexes.append([pattern, "C"])
else:
c_err.append(0)
print('{:<28s}{:>12d}{:>12s}{:>12d}{:>12s}{:>12b}'.format(pattern, py_all_coverage, py_percent, c_max_coverage, c_percent, c_max_coverage >= c_min_coverage and py_all_coverage >= py_min_coverage))
overall_result.append([pattern, evaluation_result])
plot_bar_chart(tuple(py_cov_list), tuple(c_cov_list), tuple(py_err), tuple(c_err), len(to_process))
score = sum(evaluation_list)
print("Score: {0}/{1}\nScore Percentage: {2}%".format(score,len(evaluation_list), ((score*100)/len(evaluation_list))))
if failed_regexes:
print("The following regexes failed the coverage baseline: ", failed_regexes)
compute_results()
-------------------------------------------------------------------------------------------------- Pattern Py_cov (LOC) Py_cov (%) C_cov (LOC) C_cov (%) Pass(1)/Fail(0) -------------------------------------------------------------------------------------------------- ^ab$ 241 8% 988 7% 1 b$ 170 6% 949 6% 1 ab*$ 207 7% 1142 8% 1 a|b$ 308 11% 1024 7% 1 a(b|c)$ 321 11% 1198 8% 1 (a|b)\s$ 334 12% 1131 8% 1 ^a*(b+|c)d?$ 413 15% 1442 10% 1 ^a*(b+|c)d?\se$ 455 16% 1632 11% 1 ^a+(b*|c)d?\se$ 426 15% 1564 11% 1 ^[0-9]+\s[a-z]*\s?[A-Z]+$ 303 11% 1228 9% 1 \w$ 169 6% 878 6% 1 \d$ 169 6% 875 6% 1 (a|b){2,5}$ 401 14% 1380 10% 1 [0-9a-zA-Z]$ 252 9% 902 6% 1 [0-9]$ 222 8% 852 6% 1 [a-z]$ 222 8% 852 6% 1 [^a-zA-Z] 213 7% 879 6% 1 [^0-9] 178 6% 825 6% 1 \D$ 168 6% 877 6% 1 \W 122 4% 852 6% 1 \S 122 4% 852 6% 1 [a-zA-Z]\s\w+\s\d\s$ 324 11% 1295 9% 1 a(bc){2,5}$ 297 10% 1472 10% 1 \babc\b$ 258 9% 865 6% 1 abc\B 177 6% 987 7% 1 \babc\B 247 9% 928 6% 1
Score: 26/26 Score Percentage: 100.0%