Testing Graphical User Interfaces

In this chapter, we explore how to generate tests for Graphical User Interfaces (GUIs), abstracting from our previous examples on Web testing. Building on general means to extract user interface elements and to activate them, our techniques generalize to arbitrary graphical user interfaces, from rich Web applications to mobile apps, and systematically explore user interfaces through forms and navigation elements.

Prerequisites

Synopsis

To use the code provided in this chapter, write

>>> from fuzzingbook.GUIFuzzer import <identifier>

and then make use of the following features.

This chapter demonstrates how to programmatically interact with user interfaces, using Selenium on Web browsers. It provides an experimental GUICoverageFuzzer class that automatically explores a user interface by systematically interacting with all available user interface elements.

The function start_webdriver() starts a headless Web browser in the background and returns a GUI driver as handle for further communication.

>>> gui_driver = start_webdriver()

We let it the browser open the URL of the server we want to investigate (in this case, the vulnerable server from the chapter on Web fuzzing) and obtain a screen shot.

>>> gui_driver.get(httpd_url)
>>> Image(gui_driver.get_screenshot_as_png())

The GUICoverageFuzzer class explores the user interface and builds a grammar that encodes all states as well as the user interactions required to move from one state to the next. It is paired with a GUIRunner which interacts with the GUI driver.

>>> gui_fuzzer = GUICoverageFuzzer(gui_driver)
>>> gui_runner = GUIRunner(gui_driver)

The explore_all() method extracts all states and all transitions from a Web user interface.

>>> gui_fuzzer.explore_all(gui_runner)

The grammar embeds a finite state automation and is best visualized as such.

>>> fsm_diagram(gui_fuzzer.grammar)

The GUI Fuzzer fuzz() method produces sequences of interactions that follow paths through the finite state machine. Since GUICoverageFuzzer is derived from CoverageFuzzer (see the chapter on coverage-based grammar fuzzing), it automatically covers (a) as many transitions between states as well as (b) as many form elements as possible. In our case, the first set of actions explores the transition via the "order form" link; the second set then goes until the "" state.

>>> gui_driver.get(httpd_url)
>>> actions = gui_fuzzer.fuzz()
>>> print(actions)
fill('city', 'c')
fill('zip', '3')
fill('name', 'a')
check('terms', True)
fill('email', '[email protected]')
submit('submit')
click('order form')
fill('city', 'l')
fill('zip', '7')
fill('name', 'Z')
check('terms', False)
fill('email', '[email protected]')
submit('submit')

These actions can be fed into the GUI runner, which will execute them on the given GUI driver.

>>> gui_driver.get(httpd_url)
>>> result, outcome = gui_runner.run(actions)
>>> Image(gui_driver.get_screenshot_as_png())

Further invocations of fuzz() will further cover the model – for instance, exploring the terms and conditions.

A tool like GUICoverageFuzzer will provide "deep" exploration of user interfaces, even filling out forms to explore what is behind them. Keep in mind, though, that GUICoverageFuzzer is experimental: It only supports a subset of HTML form and link features, and does not take JavaScript into account.

Automated GUI Interaction

In the chapter on Web testing, we have shown how to test Web-based interfaces by directly interacting with a Web server using the HTTP protocol, and processing the retrieved HTML pages to identify user interface elements. While these techniques work well for user interfaces that are based on HTML only, they fail as soon as there are interactive elements that use JavaScript to execute code within the browser, and generate and change the user interface without having to interact with the browser.

In this chapter, we therefore take a different approach to user interface testing. Rather than using HTTP and HTML as the mechanisms for interaction, we leverage a dedicated UI testing framework, which allows us to

  • query the program under test for available user interface elements, and
  • query the UI elements for how they can be interacted with.

Although we will again illustrate our approach using a Web server, the approach easily generalizes to arbitrary user interfaces. In fact, the UI testing framework we use, Selenium, also comes in variants that run for Android apps.

Our Web Server, Again

As in the chapter on Web testing, we run a Web server that allows us to order products.

In [1]:
import fuzzingbook_utils
In [2]:
from WebFuzzer import init_db, start_httpd, webbrowser, print_httpd_messages, print_url, ORDERS_DB
In [3]:
import html
In [4]:
db = init_db()

This is the address of our web server:

In [5]:
httpd_process, httpd_url = start_httpd()
print_url(httpd_url)

Using webbrowser(), we can retrieve the HTML of the home page, and use HTML() to render it.

In [6]:
from IPython.display import display, Image
In [7]:
from fuzzingbook_utils import HTML, rich_output
In [8]:
HTML(webbrowser(httpd_url))
127.0.0.1 - - [09/Sep/2019 17:01:00] "GET / HTTP/1.1" 200 -
Out[8]:
Fuzzingbook Swag Order Form

Yes! Please send me at your earliest convenience


.

Remote Control with Selenium

Let us take a look at the GUI above. In contrast to the chapter on Web testing, we do not assume we can access the HTML source of the current page. All we assume is that there is a set of user interface elements we can interact with.

Selenium is a framework for testing Web applications by automating interaction in the browser. Selenium provides an API that allows one to launch a Web browser, query the state of the user interface, and interact with individual user interface elements. The Selenium API is available in a number of languages; we use the Selenium API for Python.

A Selenium web driver is the interface between a program and a browser controlled by the program.

In [9]:
from selenium import webdriver

The following code starts a Firefox browser in the background, which we then control through the web driver.

In [10]:
BROWSER = 'firefox'

Note: If you don't have Firefox installed, you can also set BROWSER to 'chrome' to use Google Chrome instead.

In [11]:
# BROWSER = 'chrome'

The browser is headless, meaning that it does not show on the screen.

In [12]:
HEADLESS = True

Note: If the notebook server runs locally (i.e. on the same machine on which you are seeing this), you can also set HEADLESS to False and see what happens right on the screen as you execute the notebook cells. This is very much recommended for interactive sessions.

In [13]:
def start_webdriver(browser=BROWSER, headless=HEADLESS, zoom=1.4):
    if browser == 'firefox':
        options = webdriver.FirefoxOptions()
    if browser == 'chrome':
        options = webdriver.ChromeOptions()

    if headless and browser == 'chrome':
        options.add_argument('headless')
    else:
        options.headless = headless
    
    # Start the browser, and obtain a _web driver_ object such that we can interact with it.
    if browser == 'firefox':
        # For firefox, set a higher resolution for our screenshots
        profile = webdriver.firefox.firefox_profile.FirefoxProfile()
        profile.set_preference("layout.css.devPixelsPerPx", repr(zoom))
        gui_driver = webdriver.Firefox(firefox_profile=profile, options=options)
        
        # We set the window size such that it fits our order form exactly;
        # this is useful for not wasting too much space when taking screen shots.
        gui_driver.set_window_size(700, 300)

    elif browser == 'chrome':
        gui_driver = webdriver.Chrome(options=options)
        gui_driver.set_window_size(700, 210 if headless else 340)
            
    return gui_driver
In [14]:
gui_driver = start_webdriver(browser=BROWSER, headless=HEADLESS)

We can now interact with the browser programmatically. First, we have it navigate to the URL of our Web server:

In [15]:
gui_driver.get(httpd_url)

We see that the home page is actually accessed, together with a (failing) request to get a page icon:

In [16]:
print_httpd_messages()
127.0.0.1 - - [09/Sep/2019 17:01:17] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [09/Sep/2019 17:01:17] "GET /favicon.ico HTTP/1.1" 404 -

To see what the "headless" browser displays, we can obtain a screenshot. We see that it actually displays the home page.

In [17]:
Image(gui_driver.get_screenshot_as_png())
Out[17]:

Filling out Forms

To interact with the Web page through Selenium and the browser, we can query Selenium for individual elements. For instance, we can access the UI element whose name attribute (as defined in HTML) is "name".

In [18]:
name = gui_driver.find_element_by_name("name")

Once we have an element, we can interact with it. Since name is a text field, we can send it a string using the send_keys() method; the string will be translated into appropriate key strokes.

In [19]:
name.send_keys("Jane Doe")

In the screenshot, we can see that the name field is now filled:

In [20]:
Image(gui_driver.get_screenshot_as_png())
Out[20]:

In a similar fashion, we can fill out the email, city, and ZIP fields:

In [21]:
email = gui_driver.find_element_by_name("email")
email.send_keys("[email protected]")
In [22]:
city = gui_driver.find_element_by_name('city')
city.send_keys("Seattle")
In [23]:
zip = gui_driver.find_element_by_name('zip')
zip.send_keys("98104")
In [24]:
Image(gui_driver.get_screenshot_as_png())
Out[24]:

The check box for terms and conditions is not filled out, but clicked instead using the click() method.

In [25]:
terms = gui_driver.find_element_by_name('terms')
terms.click()
In [26]:
Image(gui_driver.get_screenshot_as_png())
Out[26]:

The form is now fully filled out. By clicking on the submit button, we can place the order:

In [27]:
submit = gui_driver.find_element_by_name('submit')
submit.click()

We see that the order is being processed, and that the Web browser has switched to the confirmation page.

In [28]:
print_httpd_messages()
127.0.0.1 - - [09/Sep/2019 17:01:18] INSERT INTO orders VALUES ('tshirt', 'Jane Doe', '[email protected]', 'Seattle', '98104')
127.0.0.1 - - [09/Sep/2019 17:01:18] "GET /order?item=tshirt&name=Jane+Doe&email=j.doe%40example.com&city=Seattle&zip=98104&terms=on&submit=Place+order HTTP/1.1" 200 -
In [29]:
Image(gui_driver.get_screenshot_as_png())
Out[29]:

Just as we fill out forms, we can also navigate through a Web site by clicking on links. Let us go back to the home page:

In [30]:
gui_driver.back()
In [31]:
Image(gui_driver.get_screenshot_as_png())
Out[31]:

We can query the web driver for all elements of a particular type. Querying for HTML anchor elements (<a>) for instance, gives us all links on a page.

In [32]:
links = gui_driver.find_elements_by_tag_name("a")

We can query the attributes of UI elements – for instance, the URL the first anchor on the page links to:

In [33]:
links[0].get_attribute('href')
Out[33]:
'http://127.0.0.1:8801/terms'

What happens if we click on it? Very simple: We switch to the Web page being referenced.

In [34]:
links[0].click()
In [35]:
print_httpd_messages()
127.0.0.1 - - [09/Sep/2019 17:01:19] "GET /terms HTTP/1.1" 200 -
In [36]:
Image(gui_driver.get_screenshot_as_png())
Out[36]:

Okay. Let's get back to our home page again.

In [37]:
gui_driver.back()
In [38]:
print_httpd_messages()
In [39]:
Image(gui_driver.get_screenshot_as_png())
Out[39]:

Writing Test Cases

The above calls, interacting with a user interface automatically, are typically used in Selenium tests – that is, code snippets that interact with a Web site, occasionally checking whether everything works as expected. The following code, for instance, places an order just as above. It then retrieves the title element and checks whether the title contains a "Thank you" message, indicating success.

In [40]:
def test_successful_order(driver, url):
    name = "Walter White"
    email = "[email protected]"
    city = "Albuquerque"
    zip_code = "87101"
    
    driver.get(url)
    driver.find_element_by_name("name").send_keys(name)
    driver.find_element_by_name("email").send_keys(email)
    driver.find_element_by_name('city').send_keys(city)
    driver.find_element_by_name('zip').send_keys(zip_code)
    driver.find_element_by_name('terms').click()
    driver.find_element_by_name('submit').click()
    
    title = driver.find_element_by_id('title')
    assert title is not None
    assert title.text.find("Thank you") >= 0

    confirmation = driver.find_element_by_id("confirmation")
    assert confirmation is not None

    assert confirmation.text.find(name) >= 0
    assert confirmation.text.find(email) >= 0
    assert confirmation.text.find(city) >= 0
    assert confirmation.text.find(zip_code) >= 0
    
    return True
In [41]:
test_successful_order(gui_driver, httpd_url)
Out[41]:
True

In a similar vein, we can set up automated test cases for unsuccessful orders, canceling orders, changing orders, and many more. All these test cases would be automatically run after any change to the program code, ensuring the Web application still works.

Of course, writing such tests is quite some effort. Hence, in the remainder of this chapter, we will again explore how to automatically generate them.

Retrieving User Interface Actions

To automatically interact with a user interface, we first need to find out which elements there are, and which user interactions (or short actions) they support.

User Interface Elements

We start with finding available user elements. Let us get back to the order form.

In [42]:
gui_driver.get(httpd_url)
In [43]:
Image(gui_driver.get_screenshot_as_png())
Out[43]:

Using find_elements_by_tag_name() (and other similar find_elements_...() functions), we can retrieve all elements of a particular type, such as HTML input elements.

In [44]:
ui_elements = gui_driver.find_elements_by_tag_name("input")

For each element, we can retrieve its HTML attributes, using get_attribute(). We can thus retrieve the name and type of each input element (if defined).

In [45]:
for element in ui_elements:
    print("Name: %-10s | Type: %-10s | Text: %s" % (element.get_attribute('name'), element.get_attribute('type'), element.text))
Name: name       | Type: text       | Text: 
Name: email      | Type: email      | Text: 
Name: city       | Type: text       | Text: 
Name: zip        | Type: number     | Text: 
Name: terms      | Type: checkbox   | Text: 
Name: submit     | Type: submit     | Text: 
In [46]:
ui_elements = gui_driver.find_elements_by_tag_name("a")
In [47]:
for element in ui_elements:
    print("Name: %-10s | Type: %-10s | Text: %s" % (element.get_attribute('name'), element.get_attribute('type'), element.text))
Name:            | Type:            | Text: terms and conditions

User Interface Actions

Similarly to what we did in the chapter on Web fuzzing, our idea is now to mine a grammar for the user interface – first for an individual user interface page (i.e., a single Web page), later for all pages offered by the application. The idea is that a grammar defines legal sequences of actions – clicks and keystrokes – that can be applied on the application.

We assume the following actions:

  1. fill(<name>, <text>) – fill the UI input element named <name> with the text <text>.
  2. check(<name>, <value>) – set the UI checkbox <name> to the given value <value> (True or False)
  3. submit(<name>) – submit the form by clicking on the UI element <name>.
  4. click(<name>) – click on the UI element <name>, typically for following a link.

This sequence of actions, for instance would fill out the order form:

fill('name', "Walter White")
fill('email', "[email protected]")
fill('city', "Albuquerque")
fill('zip', "87101")
check('terms', True)
submit('submit')

Our set of actions is deliberately defined to be small – for real user interfaces, one would also have to define interactions such as swipes, double clicks, long clicks, right button clicks, modifier keys, and more. Selenium supports all of this; but in the interest of simplicity, we focus on the most important set of interactions.

Retrieving Actions

As a first step in mining an action grammar, we need to be able to retrieve possible interactions. We introduce a class GUIGrammarMiner, which is set to do precisely that.

In [48]:
class GUIGrammarMiner(object):
    def __init__(self, driver, stay_on_host=True):
        self.driver = driver
        self.stay_on_host = stay_on_host
        self.grammar = {}

Our first task is to obtain the set of possible interactions. Given a single UI page, the method mine_input_actions() of GUIGrammarMiner returns a set of actions as defined above. It first gets all input elements, followed by button elements, finally followed by links (a elements), and merges them into a set. (We use a frozenset here since we want to use the set as an index later.)

In [49]:
class GUIGrammarMiner(GUIGrammarMiner):
    def mine_state_actions(self):
        return frozenset(self.mine_input_element_actions()
            | self.mine_button_element_actions()
            | self.mine_a_element_actions())

Input Element Actions

Mining input actions goes through the set of input elements, and returns an action depending on the input type. If the input field is a text, for instance, the associated action is fill(); for checkboxes, the action is check().

The respective values are placeholders depending on the type; if the input field is a number, for instance, the value becomes <number>. As these actions later become part of the grammar, they will be expanded into actual values during grammar expansion.

In [50]:
from selenium.common.exceptions import StaleElementReferenceException
In [51]:
class GUIGrammarMiner(GUIGrammarMiner):
    def mine_input_element_actions(self):
        actions = set()
        
        for elem in self.driver.find_elements_by_tag_name("input"):
            try:
                input_type = elem.get_attribute("type")
                input_name = elem.get_attribute("name")
                if input_name is None:
                    input_name = elem.text

                if input_type in ["checkbox", "radio"]:
                    actions.add("check('%s', <boolean>)" % html.escape(input_name))
                elif input_type in ["text", "number", "email", "password"]:
                    actions.add("fill('%s', '<%s>')" % (html.escape(input_name), html.escape(input_type)))
                elif input_type in ["button", "submit"]:
                    actions.add("submit('%s')" % html.escape(input_name))
                elif input_type in ["hidden"]:
                    pass
                else:
                    # TODO: Handle more types here
                    actions.add("fill('%s', <%s>)" % (html.escape(input_name), html.escape(input_type)))
            except StaleElementReferenceException:
                pass

        return actions

Applied on our order form, we see that the method gets us all input actions:

In [52]:
gui_grammar_miner = GUIGrammarMiner(gui_driver)
gui_grammar_miner.mine_input_element_actions()
Out[52]:
{"check('terms', <boolean>)",
 "fill('city', '<text>')",
 "fill('email', '<email>')",
 "fill('name', '<text>')",
 "fill('zip', '<number>')",
 "submit('submit')"}

Button Element Actions

Mining buttons works in a similar way:

In [53]:
class GUIGrammarMiner(GUIGrammarMiner):
    def mine_button_element_actions(self):
        actions = set()
        
        for elem in self.driver.find_elements_by_tag_name("button"):
            try:
                button_type = elem.get_attribute("type")
                button_name = elem.get_attribute("name")
                if button_name is None:
                    button_name = elem.text
                if button_type == "submit":
                    actions.add("submit('%s')" % html.escape(button_name))
                elif button_type != "reset":
                    actions.add("click('%s')" % html.escape(button_name))
            except StaleElementReferenceException:
                pass

        return actions

Our order form has no button elements. (The submit button is an input element, and was handled above).

In [54]:
gui_grammar_miner = GUIGrammarMiner(gui_driver)
gui_grammar_miner.mine_button_element_actions()
Out[54]:
set()

When following links, we need to make sure that we stay on the current host – we want to explore a single Web site only, not all of the Internet. To this end, we check the href attribute of the link to check whether it still points to the same host. If it does not, we give it a special action ignore(), which, as the name suggests, will later be ignored as it comes to executing these actions. We still return an action, though, as we use the set of actions to characterize a state in the application.

In [55]:
class GUIGrammarMiner(GUIGrammarMiner):
    def mine_a_element_actions(self):
        actions = set()

        for elem in self.driver.find_elements_by_tag_name("a"):
            try:
                a_href = elem.get_attribute("href")
                if a_href is not None:
                    if self.follow_link(a_href):
                        actions.add("click('%s')" % html.escape(elem.text))
                    else:
                        actions.add("ignore('%s')" % html.escape(elem.text))
            except StaleElementReferenceException:
                pass

        return actions

To check whether we can follow a link, the method follow_link() checks the URL:

In [56]:
from urllib.parse import urljoin, urlsplit
In [57]:
class GUIGrammarMiner(GUIGrammarMiner):
    def follow_link(self, link):
        if not self.stay_on_host:
            return True
        
        current_url = self.driver.current_url
        target_url = urljoin(current_url, link)
        return urlsplit(current_url).hostname == urlsplit(target_url).hostname       

In our application, we would not be allowed to follow a link to foo.bar:

In [58]:
gui_grammar_miner = GUIGrammarMiner(gui_driver)
In [59]:
gui_grammar_miner.follow_link("ftp://foo.bar/")
Out[59]:
False

Following a link to localhost, though, works well:

In [60]:
gui_grammar_miner.follow_link("https://127.0.0.1/")
Out[60]:
True

When adapting this for other user interfaces, similar measures would be taken to ensure we stay in the same application.

Running this method on our page gets us the set of links:

In [61]:
gui_grammar_miner = GUIGrammarMiner(gui_driver)
gui_grammar_miner.mine_a_element_actions()
Out[61]:
{"click('terms and conditions')"}

All Together

Let us now apply mine_state_actions() on our current page to retrieve all elements. We see that we get the union of all three sets.

In [62]:
gui_grammar_miner = GUIGrammarMiner(gui_driver)
gui_grammar_miner.mine_state_actions()
Out[62]:
frozenset({"check('terms', <boolean>)",
           "click('terms and conditions')",
           "fill('city', '<text>')",
           "fill('email', '<email>')",
           "fill('name', '<text>')",
           "fill('zip', '<number>')",
           "submit('submit')"})

We assume that we can identify a user interface state from the set of interactive elements it contains – that is, the current Web page is identified by the set above. This is in contrast to Web fuzzing, where we assumed the URL to uniquely characterize a page – but with JavaScript, the URL can stay unchanged although the page contents change, and UIs other than the Web may have no concept of unique URLs. Therefore, we say that the way a UI can be interacted with uniquely defines its state.

Models for User Interfaces

User Interfaces as Finite State Machines

Now that we can retrieve UI elements from a page, let us go and systematically explore a user interface. The idea is to represent the user interface as a finite state machine – that is, a sequence of states that can be reached by interacting with the individual user interface elements.

Let us illustrate such a finite state machine by looking our Web server. The following diagram shows the states our server can be in:

In [63]:
from graphviz import Digraph
In [64]:
from IPython.display import display
In [65]:
from GrammarFuzzer import dot_escape
In [66]:
dot = Digraph(comment="Finite State Machine")
dot.node(dot_escape('<start>'))
dot.edge(dot_escape('<start>'), dot_escape('<Order Form>'))
dot.edge(dot_escape('<Order Form>'), dot_escape('<Terms and Conditions>'), "click('Terms and conditions')")
dot.edge(dot_escape('<Order Form>'), dot_escape('<Thank You>'), "fill(...)\lsubmit('submit')")
dot.edge(dot_escape('<Terms and Conditions>'), dot_escape('<Order Form>'), "click('order form')")
dot.edge(dot_escape('<Thank You>'), dot_escape('<Order Form>'), "click('order form')")
display(dot)
%3 \<start\> <start> \<Order Form\> <Order Form> \<start\>->\<Order Form\> \<Terms and Conditions\> <Terms and Conditions> \<Order Form\>->\<Terms and Conditions\> click('Terms and conditions') \<Thank You\> <Thank You> \<Order Form\>->\<Thank You\> fill(...) submit('submit') \<Terms and Conditions\>->\<Order Form\> click('order form') \<Thank You\>->\<Order Form\> click('order form')

Initially, we are in the <Order Form> state. From here, we can click on Terms and Conditions, and we'll be in the Terms and Conditions state, showing the page with the same title. We can also fill out the form and place the order, having us end in the Thank You state (again showing the page with the same title). From both <Terms and Conditions> and <Thank You>, we can return to the order form by clicking on the order form link.

State Machines as Grammars

To systematically explore a user interface, we must retrieve its finite state machine, and eventually cover all states and transitions. In the presence of forms, such an exploration is difficult, as we need a special mechanism to fill out forms and submit the values to get to the next state. There is a trick, though, which allows us to have a single representation for both states and (form) values. We can embed the finite state machine into a grammar, which is then used for both states and form values.

To embed a finite state machine into a grammar, we proceed as follows:

  1. Every state $\langle s \rangle$ in the finite state machine becomes a symbol $\langle s \rangle$ in the grammar.
  2. Every transition in the finite state machine from $\langle s \rangle$ to $\langle t \rangle$ and actions $a_1, a_2, \dots$ becomes an alternative of $\langle s \rangle$ in the form $a_1, a_2, dots$ $\langle t \rangle$ in the grammar.

The above finite state machine thus gets encoded into the grammar

<start> ::= <Order Form>
<Order Form> ::= click('Terms and Conditions') <Terms and Conditions> | 
                 fill(...) submit('submit') <Thank You>
<Terms and Conditions> ::= click('order form') <Order Form>
<Thank You> ::= click('order form') <Order Form>

Expanding this grammar gets us a stream of actions, navigating through the user interface:

fill(...) submit('submit') click('order form') click('Terms and Conditions') click('order form') ...

This stream is actually infinite (as one can interact with the UI forever); to have it end, one can introduce an alternative <end> that simply expands to the empty string, without having any expansion (state) follow.

Retrieving State Grammars

Let us build a method that retrieves a grammar from the current state of a user interface. We first define a constant GUI_GRAMMAR that servers as template for all sorts of input types. We will use this to fill out forms.

\todo{}: Have a generic interface BaseGrammarMiner with __init__() and mine_grammar()

In [67]:
from Grammars import new_symbol
In [68]:
from Grammars import nonterminals, START_SYMBOL
from Grammars import extend_grammar, unreachable_nonterminals, opts, crange, srange
from Grammars import syntax_diagram, is_valid_grammar
In [69]:
class GUIGrammarMiner(GUIGrammarMiner):
    START_STATE = "<state>"
    UNEXPLORED_STATE = "<unexplored>"
    FINAL_STATE = "<end>"

    GUI_GRAMMAR = ({
        START_SYMBOL: [START_STATE],
        UNEXPLORED_STATE: [""],
        FINAL_STATE: [""],

        "<text>": ["<string>"],
        "<string>": ["<character>", "<string><character>"],
        "<character>": ["<letter>", "<digit>", "<special>"],
        "<letter>": crange('a', 'z') + crange('A', 'Z'),
        
        "<number>": ["<digits>"],
        "<digits>": ["<digit>", "<digits><digit>"],
        "<digit>": crange('0', '9'),
        
        "<special>": srange(". !"),

        "<email>": ["<letters>@<letters>"],
        "<letters>": ["<letter>", "<letters><letter>"],
        
        "<boolean>": ["True", "False"],

        # Use a fixed password in case we need to repeat it
        "<password>": ["abcABC.123"],
        
        "<hidden>": "<string>",
    })
In [70]:
syntax_diagram(GUIGrammarMiner.GUI_GRAMMAR)
start
state
unexplored
end
text
string
string
character string character
character
letter digit special
letter
e d c b a f g h i j o n m l k p q r s t y x w v u z A B C D I H G F E J K L M N S R Q P O T U V W X Y Z
number
digits
digits
digit digits digit
digit
0 1 2 3 4 5 6 7 8 9
special
. !
email
letters @ letters
letters
letter letters letter
boolean
True False
password
abcABC.123
hidden
< s t r i n g >

The method mine_state_grammar() goes through the actions mined from the page (using mine_state_actions()) and creates a grammar for the current state. For each click() and submit() action, it assumes a new state follows, and introduces an appropriate state symbol into the grammar – a state symbol that now will be marked as <unexplored>, but will be expanded later as the appropriate state is seen.

In [71]:
class GUIGrammarMiner(GUIGrammarMiner):
    def new_state_symbol(self, grammar):
        return new_symbol(grammar, self.START_STATE)

    def mine_state_grammar(self, grammar={}, state_symbol=None):
        grammar = extend_grammar(self.GUI_GRAMMAR, grammar)

        if state_symbol is None:
            state_symbol = self.new_state_symbol(grammar)
            grammar[state_symbol] = []

        alternatives = []
        form = ""
        submit = None

        for action in self.mine_state_actions():
            if action.startswith("submit"):
                submit = action
                
            elif action.startswith("click"):
                link_target = self.new_state_symbol(grammar)
                grammar[link_target] = [self.UNEXPLORED_STATE]
                alternatives.append(action + '\n' + link_target)
                
            elif action.startswith("ignore"):
                pass

            else:  # fill(), check() actions
                if len(form) > 0:
                    form += '\n'
                form += action

        if submit is not None:
            if len(form) > 0:
                form += '\n'
            form += submit

        if len(form) > 0:
            form_target = self.new_state_symbol(grammar)
            grammar[form_target] = [self.UNEXPLORED_STATE]
            alternatives.append(form + '\n' + form_target)
            
        alternatives += [self.FINAL_STATE]

        grammar[state_symbol] = alternatives
        
        # Remove unused parts
        for nonterminal in unreachable_nonterminals(grammar):
            del grammar[nonterminal]

        assert is_valid_grammar(grammar)
        
        return grammar

Let us show mine_state_grammar() in action. Here's the grammar for the home page:

In [72]:
gui_grammar_miner = GUIGrammarMiner(gui_driver)
state_grammar = gui_grammar_miner.mine_state_grammar()

From the start state (<state>), we can go and either click on "terms and conditions", ending in <state-1>, or fill out the form, ending in <state-2>.

In [73]:
state_grammar[GUIGrammarMiner.START_STATE]
Out[73]:
["click('terms and conditions')\n<state-1>",
 "fill('city', '<text>')\nfill('zip', '<number>')\nfill('name', '<text>')\ncheck('terms', <boolean>)\nfill('email', '<email>')\nsubmit('submit')\n<state-2>",
 '<end>']

Both these states are yet unexplored:

In [74]:
state_grammar['<state-1>']
Out[74]:
['<unexplored>']
In [75]:
state_grammar['<state-2>']
Out[75]:
['<unexplored>']
In [76]:
state_grammar['<unexplored>']
Out[76]:
['']

To better see the state structure, the function fsm_diagram() shows the resulting state grammar as a finite state machine. (This assumes that the grammar actually encodes a state machine.)

In [77]:
from collections import deque
In [78]:
from fuzzingbook_utils import unicode_escape
In [79]:
def fsm_diagram(grammar, start_symbol=START_SYMBOL):
    def left_align(label):
        return dot_escape(label.replace('\n', r'\l')).replace(r'\\l', '\\l')

    dot = Digraph(comment="Grammar as Finite State Machine")

    symbols = deque([start_symbol])
    symbols_seen = set()
    
    while len(symbols) > 0:
        symbol = symbols.popleft()
        symbols_seen.add(symbol)
        dot.node(symbol, dot_escape(unicode_escape(symbol)))
        
        for expansion in grammar[symbol]:
            nts = nonterminals(expansion)
            if len(nts) > 0:
                target_symbol = nts[-1]
                if target_symbol not in symbols_seen:
                    symbols.append(target_symbol)

                label = expansion.replace(target_symbol, '')
                dot.edge(symbol, target_symbol, left_align(unicode_escape(label)))
                         
                
    return display(dot)

Here's our current grammar as a state machine. We see that it nicely reflects what we can see from our Web server's home page:

In [80]:
fsm_diagram(state_grammar)
%3 start <start> state <state> start->state state-1 <state-1> state->state-1 click('terms and conditions') state-2 <state-2> state->state-2 fill('city', '<text>') fill('zip', '<number>') fill('name', '<text>') check('terms', <boolean>) fill('email', '<email>') submit('submit') end <end> state->end unexplored <unexplored> state-1->unexplored state-2->unexplored

Given the grammar, we can use any of our grammar fuzzers to create valid input sequences:

In [81]:
from GrammarFuzzer import GrammarFuzzer
In [82]:
gui_fuzzer = GrammarFuzzer(state_grammar)
while True:
    action = gui_fuzzer.fuzz()
    if action.find('submit(') > 0:
        break
print(action)
fill('city', '.')
fill('zip', '66')
fill('name', '.')
check('terms', True)
fill('email', '[email protected]')
submit('submit')

These actions, however, must also be executed such that we can explore the user interface. This is what we do in the next section.

Executing User Interface Actions

To execute actions, we introduce a Runner class, conveniently named GUIRunner. The aim of its run() method is to execute the actions as given in an action string. The way we do this is fairly simple: We introduce four methods named fill(), check(), submit() and click(), and run exec() on the action string to have the Python interpreter invoke these methods.

Running exec() on third-party input is dangerous, as the names of UI elements may contain valid Python code. We restrict access to the four functions defined above, and also set __builtins__ to the empty dictionary such that built-in Python functions are not available during exec(). This will prevent accidents, but as we will see in the chapter on information flow, it is still possible to inject Python code. To prevent such injection attacks, we use html.escape() to quote angle and quote characters in all third-party strings.

In [83]:
from Fuzzer import Runner
In [84]:
class GUIRunner(Runner):
    def __init__(self, driver):
        self.driver = driver
        
    def run(self, inp):
        def fill(name, value):
            self.do_fill(html.unescape(name), html.unescape(value))
        def check(name, state):
            self.do_check(html.unescape(name), state)
        def submit(name):
            self.do_submit(html.unescape(name))
        def click(name):
            self.do_click(html.unescape(name))
        
        exec(inp, {'__builtins__': {}},
                  {'fill': fill, 'check': check, 'submit': submit, 'click': click})
        
        return inp, self.PASS

To identify elements in an action, we first search them by their name, and then by the displayed link text.

In [85]:
from selenium.common.exceptions import NoSuchElementException
from selenium.common.exceptions import ElementClickInterceptedException, ElementNotInteractableException
In [86]:
class GUIRunner(GUIRunner):
    def find_element(self, name):
        try:
            return self.driver.find_element_by_name(name)
        except NoSuchElementException:
            return self.driver.find_element_by_link_text(name)

The implementations of the actions simply defer to the appropriate Selenium methods, introducing explicit delays such that the page can reload and refresh.

In [87]:
from selenium.webdriver.support.ui import WebDriverWait
In [88]:
class GUIRunner(GUIRunner):
    # Delays (in seconds)
    DELAY_AFTER_FILL = 0.1
    DELAY_AFTER_CHECK = 0.1
    DELAY_AFTER_SUBMIT = 1
    DELAY_AFTER_CLICK = 1
In [89]:
class GUIRunner(GUIRunner):
    def do_fill(self, name, value):
        element = self.find_element(name)
        element.send_keys(value)
        WebDriverWait(self.driver, self.DELAY_AFTER_FILL)
In [90]:
class GUIRunner(GUIRunner):
    def do_check(self, name, state):
        element = self.find_element(name)
        if bool(state) != bool(element.is_selected()):
            element.click()
        WebDriverWait(self.driver, self.DELAY_AFTER_CHECK)
In [91]:
class GUIRunner(GUIRunner):
    def do_submit(self, name):
        element = self.find_element(name)
        element.click()
        WebDriverWait(self.driver, self.DELAY_AFTER_SUBMIT)
In [92]:
class GUIRunner(GUIRunner):
    def do_click(self, name):
        element = self.find_element(name)
        element.click()
        WebDriverWait(self.driver, self.DELAY_AFTER_CLICK)

Let us try out GUIRunner. We create a runner on our Web server, and let it execute a fill() action:

In [93]:
gui_driver.get(httpd_url)
In [94]:
gui_runner = GUIRunner(gui_driver)
In [95]:
gui_runner.run("fill('name', 'Walter White')")
Out[95]:
("fill('name', 'Walter White')", 'PASS')
In [96]:
Image(gui_driver.get_screenshot_as_png())
Out[96]:

A submit() action submits the order. (Note that our Web server does no effort whatsoever to validate the form.)

In [97]:
gui_runner.run("submit('submit')")
Out[97]:
("submit('submit')", 'PASS')
In [98]:
Image(gui_driver.get_screenshot_as_png())
Out[98]:

Of course, we can also execute action sequences generated from the grammar. This allows us to fill the form again and again, using values matching the type given in the form.

In [99]:
gui_driver.get(httpd_url)
In [100]:
gui_fuzzer = GrammarFuzzer(state_grammar)
In [101]:
while True:
    action = gui_fuzzer.fuzz()
    if action.find('submit(') > 0:
        break
In [102]:
print(action)
fill('city', '!')
fill('zip', '351')
fill('name', '.')
check('terms', True)
fill('email', '[email protected]')
submit('submit')

In [103]:
gui_runner.run(action)
Out[103]:
("fill('city', '!')\nfill('zip', '351')\nfill('name', '.')\ncheck('terms', True)\nfill('email', '[email protected]')\nsubmit('submit')\n",
 'PASS')
In [104]:
Image(gui_driver.get_screenshot_as_png())
Out[104]:

Exploring User Interfaces

So far, our grammar retrieval and execution of actions is limited to the current user interface state (i.e., the current page shown). To systematically explore a user interface, we must explore all states, notably those ending in <unexplored> – and whenever we reach a new state, again retrieve its grammar such that we may be able to reach other states. Since some states can only be reached by generating inputs, test generation and user interface exploration take place at the same time.

Consequently, we introduce a GUIFuzzer class, which generates inputs for all forms and follows all links, and which updates its grammar (i.e., its user interface model as a finite state machine) every time it encounters a new state.

Exploring states and updating the grammar at the same time is a fairly complex operation, so we need to introduce quite a number of methods before we can put this to use. The GUIFuzzer constructor sets three important attributes:

  1. state_symbol: This holds the symbol of the current state (e.g. <state-1>).
  2. state: This holds the set of actions for the current state, as returned by the GUIGrammarMiner method mine_state_actions().
  3. states_seen: This maps the states seen (as in state) to the respective symbols.

Let us show these three attributes after initialization.

In [105]:
from Grammars import is_nonterminal
In [106]:
from GrammarFuzzer import GrammarFuzzer
In [107]:
class GUIFuzzer(GrammarFuzzer):
    def __init__(self, driver, 
                 stay_on_host=True,
                 log_gui_exploration=False, 
                 disp_gui_exploration=False, 
                 **kwargs):
        self.driver = driver
        self.miner = GUIGrammarMiner(driver)
        self.stay_on_host = True
        self.log_gui_exploration = log_gui_exploration
        self.disp_gui_exploration = disp_gui_exploration
        self.initial_url = driver.current_url

        self.states_seen = {}  # Maps states to symbols
        self.state_symbol = GUIGrammarMiner.START_STATE
        self.state = self.miner.mine_state_actions()
        self.states_seen[self.state] = self.state_symbol
        
        grammar = self.miner.mine_state_grammar()
        super().__init__(grammar, **kwargs)
In [108]:
gui_driver.get(httpd_url)

The initial state symbol is always <state>:

In [109]:
gui_fuzzer = GUIFuzzer(gui_driver)
gui_fuzzer.state_symbol
Out[109]:
'<state>'

The current state is characterized by the available UI actions:

In [110]:
gui_fuzzer.state
Out[110]:
frozenset({"check('terms', <boolean>)",
           "click('terms and conditions')",
           "fill('city', '<text>')",
           "fill('email', '<email>')",
           "fill('name', '<text>')",
           "fill('zip', '<number>')",
           "submit('submit')"})

states_seen maps this state to its symbol:

In [111]:
gui_fuzzer.states_seen[gui_fuzzer.state]
Out[111]:
'<state>'

The restart() method gets us back to the initial URL and resets the state. This is what we use with every new exploration.

In [112]:
class GUIFuzzer(GUIFuzzer):
    def restart(self):
        self.driver.get(self.initial_url)
        self.state = GUIGrammarMiner.START_STATE

When producing a sequence of actions from the grammar, we want to know which final state we are to be in. We can retrieve this path from the derivation tree produced – it is the last symbol being expanded.

In [113]:
while True:
    action = gui_fuzzer.fuzz()
    if action.find('click(') >= 0:
        break
In [114]:
from GrammarFuzzer import display_tree
In [115]:
tree = gui_fuzzer.derivation_tree
display_tree(tree)
Out[115]:
%3 0 <start> 1 <state> 0->1 2 click('terms and conditions') 1->2 3 <state-1> 1->3 4 <unexplored> 3->4 5 4->5
In [116]:
class GUIFuzzer(GUIFuzzer):
    def fsm_path(self, tree):
        """Return sequence of state symbols"""
        (node, children) = tree
        if node == GUIGrammarMiner.UNEXPLORED_STATE:
            return []
        elif children is None or len(children) == 0:
            return [node]
        else:
            return [node] + self.fsm_path(children[-1])

This is the path in the finite state machine towards the "fuzzed" state:

In [117]:
gui_fuzzer = GUIFuzzer(gui_driver)
gui_fuzzer.fsm_path(tree)
Out[117]:
['<start>', '<state>', '<state-1>']

This is its last element:

In [118]:
class GUIFuzzer(GUIFuzzer):
    def fsm_last_state_symbol(self, tree):
        """Return current (expected) state symbol"""
        for state in reversed(self.fsm_path(tree)):
            if is_nonterminal(state):
                return state
        assert False
In [119]:
gui_fuzzer = GUIFuzzer(gui_driver)
gui_fuzzer.fsm_last_state_symbol(tree)
Out[119]:
'<state-1>'

As we run (run()) the fuzzer, we create an action (via fuzz()) and retrieve and update the state symbol (state_symbol) we are supposed to be in after running this action. After actually running the action in the given GUIRunner, we retrieve and update the current state, using update_state().

In [120]:
class GUIFuzzer(GUIFuzzer):
    def run(self, runner):
        assert isinstance(runner, GUIRunner)
        
        self.restart()
        action = self.fuzz()
        self.state_symbol = self.fsm_last_state_symbol(self.derivation_tree)

        if self.log_gui_exploration:
            print("Action", action.strip(), "->", self.state_symbol)

        result, outcome = runner.run(action)
        
        if self.state_symbol != GUIGrammarMiner.FINAL_STATE:
            self.update_state()

        return self.state_symbol, outcome

When updating the current state, we check whether we are in a new or in a previously seen state, and invoke update_new_state() or update_existing_state(), respectively.

In [121]:
class GUIFuzzer(GUIFuzzer):
    def update_state(self):
        if self.disp_gui_exploration:
            display(Image(self.driver.get_screenshot_as_png()))

        self.state = self.miner.mine_state_actions()
        if self.state not in self.states_seen:
            self.states_seen[self.state] = self.state_symbol
            self.update_new_state()
        else:
            self.update_existing_state()

Finding a new state means that we mine a new grammar for the newly found state, and update our existing grammar with it.

In [122]:
class GUIFuzzer(GUIFuzzer):
    def set_grammar(self, new_grammar):
        self.grammar = new_grammar
        
        if self.disp_gui_exploration and rich_output():
            display(fsm_diagram(self.grammar))
In [123]:
class GUIFuzzer(GUIFuzzer):
    def update_new_state(self):
        if self.log_gui_exploration:
            print("In new state", unicode_escape(self.state_symbol), unicode_escape(repr(self.state)))

        state_grammar = self.miner.mine_state_grammar(grammar=self.grammar, 
                                                      state_symbol=self.state_symbol)
        del state_grammar[START_SYMBOL]
        del state_grammar[GUIGrammarMiner.START_STATE]
        self.set_grammar(extend_grammar(self.grammar, state_grammar))

If we find an existing state, we need to merge both states. If, for instance, we find that we are in existing <state-1> rather than in the expected <state-3>, we replace all instances of <state-3> in the grammar by <state-1>. The method replace_symbol() takes care of the renaming; update_existing_state() sets the grammar accordingly.

In [124]:
from Grammars import exp_string, exp_opts
In [125]:
def replace_symbol(grammar, old_symbol, new_symbol):
    """Return a grammar in which all occurrences of `old_symbol` are replaced by `new_symbol`"""
    new_grammar = {}
    
    for symbol in grammar:
        new_expansions = []
        for expansion in grammar[symbol]:
            new_expansion_string = exp_string(expansion).replace(old_symbol, new_symbol)
            if len(exp_opts(expansion)) > 0:
                new_expansion = (new_expansion_string, exp_opts(expansion))
            else:
                new_expansion = new_expansion_string
            new_expansions.append(new_expansion)
                
        new_grammar[symbol] = new_expansions
        
    # Remove unused parts
    for nonterminal in unreachable_nonterminals(new_grammar):
        del new_grammar[nonterminal]

    return new_grammar
In [126]:
class GUIFuzzer(GUIFuzzer):            
    def update_existing_state(self):
        if self.log_gui_exploration:
            print("In existing state", self.states_seen[self.state])

        if self.state_symbol != self.states_seen[self.state]:
            if self.log_gui_exploration:
                print("Replacing expected state %s by %s" %
                      (self.state_symbol, self.states_seen[self.state]))
            
            new_grammar = replace_symbol(self.grammar, self.state_symbol, 
                                         self.states_seen[self.state])
            self.state_symbol = self.states_seen[self.state]
            self.set_grammar(new_grammar)

This concludes our definitions for GUIFuzzer. We can now put it to use, enabling its logging mechanisms to see what it is doing.

In [127]:
gui_driver.get(httpd_url)
In [128]:
gui_fuzzer = GUIFuzzer(gui_driver, log_gui_exploration=True, disp_gui_exploration=True)

Running it the first time yields a new state:

In [129]:
gui_fuzzer.run(gui_runner)
Action fill('city', '0')
fill('zip', '1')
fill('name', ' ')
check('terms', True)
fill('email', '[email protected]')
submit('submit') -> <state-2>
In new state <state-2> frozenset({"click('order form')"})
%3 start <start> state <state> start->state state-1 <state-1> state->state-1 click('terms and conditions') state-2 <state-2> state->state-2 fill('city', '<text>') fill('zip', '<number>') fill('name', '<text>') check('terms', <boolean>) fill('email', '<email>') submit('submit') end <end> state->end unexplored <unexplored> state-1->unexplored state-2->end state-3 <state-3> state-2->state-3 click('order form') state-3->unexplored
None
Out[129]:
('<state-2>', 'PASS')

The next actions fill out the order form.

In [130]:
gui_fuzzer.run(gui_runner)
Action  -> <end>
Out[130]:
('<end>', 'PASS')
In [131]:
gui_fuzzer.run(gui_runner)
Action  -> <end>
Out[131]:
('<end>', 'PASS')

At this point, our GUI model is fairly complete already. In order to systematically cover all states, random exploration is not efficient enough, though.

Covering States

During exploration as well as during testing, we want to cover all states and transitions between states. How can we achieve this?

It turns out that we already have this. Our GrammarCoverageFuzzer from the chapter on coverage-based grammar testing strives to systematically cover all expansion alternatives in a grammar. In the finite state model, these expansion alternatives translate into transitions between states. Hence, applying the coverage strategy from GrammarCoverageFuzzer to our state grammars would automatically cover one transition after another.

How do we get these features into GUIFuzzer? Using multiple inheritance, we can create a class GUICoverageFuzzer which combines the run() method from GUIFuzzer with the coverage choices from GrammarCoverageFuzzer.

In [132]:
from GrammarCoverageFuzzer import GrammarCoverageFuzzer
In [133]:
from fuzzingbook_utils import inheritance_conflicts

Since the __init__() constructor is defined in both superclasses, we need to define our own constructor that serves both:

In [134]:
inheritance_conflicts(GUIFuzzer, GrammarCoverageFuzzer)
Out[134]:
['__init__']
In [135]:
class GUICoverageFuzzer(GUIFuzzer, GrammarCoverageFuzzer):
    def __init__(self, *args, **kwargs):
        GUIFuzzer.__init__(self, *args, **kwargs)
        self.reset_coverage()

With GUICoverageFuzzer, we can set up a method explore_all() that keeps on running the fuzzer until there are no unexplored states anymore:

In [136]:
class GUICoverageFuzzer(GUICoverageFuzzer):            
    def explore_all(self, runner, max_actions=100):
        actions = 0
        while GUIGrammarMiner.UNEXPLORED_STATE in self.grammar and actions < max_actions:
            actions += 1
            if self.log_gui_exploration:
                print("Run #" + repr(actions))
            try:
                self.run(runner)
            except ElementClickInterceptedException:
                pass
            except ElementNotInteractableException:
                pass
            except NoSuchElementException:
                pass

Let us use this to fully explore our Web server:

In [137]:
gui_driver.get(httpd_url)
In [138]:
gui_fuzzer = GUICoverageFuzzer(gui_driver)
In [139]:
gui_fuzzer.explore_all(gui_runner)

Success! We have covered all states:

In [140]:
fsm_diagram(gui_fuzzer.grammar)
%3 start <start> state <state> start->state state-1 <state-1> state->state-1 click('terms and conditions') state-2 <state-2> state->state-2 fill('city', '<text>') fill('zip', '<number>') fill('name', '<text>') check('terms', <boolean>) fill('email', '<email>') submit('submit') end <end> state->end state-1->state click('order form') state-1->end state-2->state click('order form') state-2->end

We can retrieve the expansions covered so far, which of course cover all states.

In [141]:
gui_fuzzer.covered_expansions
Out[141]:
{'<boolean> -> False',
 '<boolean> -> True',
 '<character> -> <digit>',
 '<character> -> <letter>',
 '<character> -> <special>',
 '<digit> -> 0',
 '<digit> -> 1',
 '<digit> -> 2',
 '<digit> -> 4',
 '<digit> -> 5',
 '<digit> -> 6',
 '<digit> -> 9',
 '<digits> -> <digit>',
 '<digits> -> <digits><digit>',
 '<email> -> <letters>@<letters>',
 '<end> -> ',
 '<letter> -> A',
 '<letter> -> B',
 '<letter> -> C',
 '<letter> -> D',
 '<letter> -> E',
 '<letter> -> K',
 '<letter> -> M',
 '<letter> -> N',
 '<letter> -> R',
 '<letter> -> U',
 '<letter> -> Y',
 '<letter> -> a',
 '<letter> -> b',
 '<letter> -> c',
 '<letter> -> d',
 '<letter> -> l',
 '<letter> -> n',
 '<letter> -> t',
 '<letter> -> u',
 '<letter> -> y',
 '<letter> -> z',
 '<letters> -> <letter>',
 '<letters> -> <letters><letter>',
 '<number> -> <digits>',
 '<special> -> !',
 '<start> -> <state>',
 '<state-1> -> <end>',
 '<state-1> -> <unexplored>',
 "<state-1> -> click('order form')\n<state-4>",
 '<state-2> -> <end>',
 '<state-2> -> <unexplored>',
 "<state-2> -> click('order form')\n<state-3>",
 '<state-3> -> <unexplored>',
 '<state-4> -> <unexplored>',
 '<state> -> <end>',
 "<state> -> click('terms and conditions')\n<state-1>",
 "<state> -> fill('city', '<text>')\nfill('zip', '<number>')\nfill('name', '<text>')\ncheck('terms', <boolean>)\nfill('email', '<email>')\nsubmit('submit')\n<state-2>",
 '<string> -> <character>',
 '<string> -> <string><character>',
 '<text> -> <string>',
 '<unexplored> -> '}

Still, we haven't seen all expansions covered. A few digits and letters remain to be used.

In [142]:
gui_fuzzer.missing_expansion_coverage()
Out[142]:
{'<digit> -> 3',
 '<digit> -> 7',
 '<digit> -> 8',
 '<letter> -> F',
 '<letter> -> G',
 '<letter> -> H',
 '<letter> -> I',
 '<letter> -> J',
 '<letter> -> L',
 '<letter> -> O',
 '<letter> -> P',
 '<letter> -> Q',
 '<letter> -> S',
 '<letter> -> T',
 '<letter> -> V',
 '<letter> -> W',
 '<letter> -> X',
 '<letter> -> Z',
 '<letter> -> e',
 '<letter> -> f',
 '<letter> -> g',
 '<letter> -> h',
 '<letter> -> i',
 '<letter> -> j',
 '<letter> -> k',
 '<letter> -> m',
 '<letter> -> o',
 '<letter> -> p',
 '<letter> -> q',
 '<letter> -> r',
 '<letter> -> s',
 '<letter> -> v',
 '<letter> -> w',
 '<letter> -> x',
 '<special> ->  ',
 '<special> -> .',
 "<state-1> -> click('order form')\n<state>",
 "<state-2> -> click('order form')\n<state>"}

Running the fuzzer again and again will eventually cover these expansions too, leading to letter and digit coverage within the order form.

Exploring Large Sites

Our GUI fuzzer is robust enough to handle exploration even on nontrivial sites such as fuzzingbook.org. Let us demonstrate this:

In [143]:
gui_driver.get("https://www.fuzzingbook.org/html/Fuzzer.html")
In [144]:
Image(gui_driver.get_screenshot_as_png())
Out[144]:
In [145]:
book_runner = GUIRunner(gui_driver)
In [146]:
book_fuzzer = GUICoverageFuzzer(gui_driver, log_gui_exploration=True)  # , disp_gui_exploration=True)

We explore the first few states of the site, defined in ACTIONS:

In [147]:
ACTIONS = 5
In [148]:
book_fuzzer.explore_all(book_runner, max_actions=ACTIONS)
Run #1
Action click('A Fuzzing Architecture') -> <state-15>
In existing state <state>
Replacing expected state <state-15> by <state>
Run #2
Action click('Cite') -> <state-14>
Run #3
Action click('ExpectError') -> <state-9>
Run #4
Action click('use grammars to specify the input format and thus get many more valid inputs') -> <state-4>
In new state <state-4> frozenset({"ignore('string')", "click('RailroadDiagrams')", "ignore('Yang et al, 2011.')", "ignore('CSmith')", "ignore('Holler et al, 2012.')", "click('constraints')", "ignore('inspect')", "click('the GrammarFuzzer class')", "ignore('Last change: 2019-05-21 19:57:59+02:00')", "click('Fuzzer')", "ignore('Burkhardt et al, 1967.')", "ignore('random')", "click('Generating Software Tests')", "click('grammar toolbox')", "ignore('re')", "ignore('Domato')", "click('later in this book')", "click('fuzzingbook.Grammars')", "ignore('Wikipedia page on file formats')", "ignore('IPython.display')", "ignore('fuzzingbook_utils')", "click('&quot;Mutation-Based Fuzzing&quot;')", "ignore('sys')", "click('use the code provided in this chapter')", "ignore('Dak\xe1\xb9\xa3iputra P\xc4\x81\xe1\xb9\x87ini, 350 BCE.')", "ignore('LangFuzz')", "click('create an efficient grammar fuzzer')", "click('')", "click('chapter on coverage')", "click('basic fuzzing')", "click('Chapter introducing fuzzing')", "ignore('ast')", "click('probabilistic-based')", "click('Cite')", "ignore('MIT License')", "click('fuzzing functions and APIs')", "click('generator-based')", "ignore('Backus-Naur form')", "click('fuzzing graphical user interfaces')", "ignore('Hanford et al, 1970.')", "ignore('Grammarinator')", "ignore('itertools')", "click('fuzz configurations')", "ignore('Use the notebook')", "click('our chapter on coverage-based fuzzing')", "click('mutation-based fuzzing')", "click('probabilistic grammar fuzzing')", "ignore('JSON specification')", "click('MutationFuzzer')", "ignore('')", "ignore('Chomsky et al, 1956.')", "click('probabilities')", "ignore('Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License')", "click('coverage')", "ignore('Hodov\xc3\xa1n et al, 2018.')", "ignore('copy')", "ignore('EMI Project')", "ignore('Purdom et al, 1972.')", "click('coverage-based')", "ignore('Le et al, 2014.')", "ignore('Imprint')", "click('next chapter')"})
Run #5
Action  -> <end>

After the first ACTIONS actions already, we can see that the finite state model is quite complex, with dozens of transitions still left to explore. Most of the yet unexplored states will eventually merge with existing states, yielding one state per chapter. Still, following all links on all pages will take quite some time.

In [149]:
# Inspect this graph in the notebook to see it in full glory
fsm_diagram(book_fuzzer.grammar)
%3 start <start> state <state> start->state state->state click('A Fuzzing Architecture') state-1 <state-1> state->state-1 click('fuzzingbook.Fuzzer') state-2 <state-2> state->state-2 click('Generating Software Tests') state-3 <state-3> state->state-3 click('chapter on testing') state-4 <state-4> state->state-4 click('use grammars to specify the input format and thus get many more valid inputs') state-5 <state-5> state->state-5 click('runtime verification') state-6 <state-6> state->state-6 click('chapter on information flow') state-7 <state-7> state->state-7 click('use mutations on existing inputs to get more valid inputs') state-8 <state-8> state->state-8 click('use the code provided in this chapter') state-9 <state-9> state->state-9 click('ExpectError') state-10 <state-10> state->state-10 click('Introduction to Testing') state-11 <state-11> state->state-11 click('') state-12 <state-12> state->state-12 click('reduce failing inputs for efficient debugging') state-13 <state-13> state->state-13 click('&quot;Introduction to Software Testing&quot;') state-14 <state-14> state->state-14 click('Cite') state-16 <state-16> state->state-16 click('Intro_Testing') state-17 <state-17> state->state-17 click('chapter on mining function specifications') state-18 <state-18> state->state-18 click('discussed above') end <end> state->end unexplored <unexplored> state-1->unexplored state-2->unexplored state-3->unexplored state-4->end state-15 <state-15> state-4->state-15 click('RailroadDiagrams') state-19 <state-19> state-4->state-19 click('constraints') state-20 <state-20> state-4->state-20 click('the GrammarFuzzer class') state-21 <state-21> state-4->state-21 click('Fuzzer') state-22 <state-22> state-4->state-22 click('Generating Software Tests') state-23 <state-23> state-4->state-23 click('grammar toolbox') state-24 <state-24> state-4->state-24 click('later in this book') state-25 <state-25> state-4->state-25 click('fuzzingbook.Grammars') state-26 <state-26> state-4->state-26 click('&quot;Mutation-Based Fuzzing&quot;') state-27 <state-27> state-4->state-27 click('use the code provided in this chapter') state-28 <state-28> state-4->state-28 click('create an efficient grammar fuzzer') state-29 <state-29> state-4->state-29 click('') state-30 <state-30> state-4->state-30 click('chapter on coverage') state-31 <state-31> state-4->state-31 click('basic fuzzing') state-32 <state-32> state-4->state-32 click('Chapter introducing fuzzing') state-33 <state-33> state-4->state-33 click('probabilistic-based') state-34 <state-34> state-4->state-34 click('Cite') state-35 <state-35> state-4->state-35 click('fuzzing functions and APIs') state-36 <state-36> state-4->state-36 click('generator-based') state-37 <state-37> state-4->state-37 click('fuzzing graphical user interfaces') state-38 <state-38> state-4->state-38 click('fuzz configurations') state-39 <state-39> state-4->state-39 click('our chapter on coverage-based fuzzing') state-40 <state-40> state-4->state-40 click('mutation-based fuzzing') state-41 <state-41> state-4->state-41 click('probabilistic grammar fuzzing') state-42 <state-42> state-4->state-42 click('MutationFuzzer') state-43 <state-43> state-4->state-43 click('probabilities') state-44 <state-44> state-4->state-44 click('coverage') state-45 <state-45> state-4->state-45 click('coverage-based') state-46 <state-46> state-4->state-46 click('next chapter') state-5->unexplored state-6->unexplored state-7->unexplored state-8->unexplored state-9->unexplored state-10->unexplored state-11->unexplored state-12->unexplored state-13->unexplored state-14->unexplored state-16->unexplored state-17->unexplored state-18->unexplored state-15->unexplored state-19->unexplored state-20->unexplored state-21->unexplored state-22->unexplored state-23->unexplored state-24->unexplored state-25->unexplored state-26->unexplored state-27->unexplored state-28->unexplored state-29->unexplored state-30->unexplored state-31->unexplored state-32->unexplored state-33->unexplored state-34->unexplored state-35->unexplored state-36->unexplored state-37->unexplored state-38->unexplored state-39->unexplored state-40->unexplored state-41->unexplored state-42->unexplored state-43->unexplored state-44->unexplored state-45->unexplored state-46->unexplored

We now have all the basic capabilities we need: We can automatically explore large Web sites; we can explore "deep" functionality by filling out forms; and we can have our coverage-based fuzzer automatcially focus on yet unexplored states. Still, there is a lot more one can do; the exercises will give you some ideas.

In [150]:
gui_driver.quit()

Synopsis

This chapter demonstrates how to programmatically interact with user interfaces, using Selenium on Web browsers. It provides an experimental GUICoverageFuzzer class that automatically explores a user interface by systematically interacting with all available user interface elements.

The function start_webdriver() starts a headless Web browser in the background and returns a GUI driver as handle for further communication.

In [151]:
gui_driver = start_webdriver()

We let it the browser open the URL of the server we want to investigate (in this case, the vulnerable server from the chapter on Web fuzzing) and obtain a screen shot.

In [152]:
gui_driver.get(httpd_url)
Image(gui_driver.get_screenshot_as_png())
Out[152]:

The GUICoverageFuzzer class explores the user interface and builds a grammar that encodes all states as well as the user interactions required to move from one state to the next. It is paired with a GUIRunner which interacts with the GUI driver.

In [153]:
gui_fuzzer = GUICoverageFuzzer(gui_driver)
In [154]:
gui_runner = GUIRunner(gui_driver)

The explore_all() method extracts all states and all transitions from a Web user interface.

In [155]:
gui_fuzzer.explore_all(gui_runner)

The grammar embeds a finite state automation and is best visualized as such.