This module contains in-depth explanations of all the functions in UnitTesting. If you have not already, please read through the Jupyter notebook tutorial for unit testing. This page contains in-depth information on all functions used for unit testing, not a high-level user tutorial. In the examples, the default module used is UnitTesting/Test_UnitTesting/test_module.py.
This module is organized as follows:
failed_tests
failed_tests.txt
is a simple text file that keeps track of which tests
failed. Line 1 is by default 'Failures: '. The subsequent lines tell the
user which test functions in which test files failed in the following
format: [test file path]: [test function]
Example:
Say that the test function test_module_for_testing_no_gamma()
failed.
Then we'd expect failed_tests.txt
to be the following:
Failures:
UnitTesting/Test_UnitTesting/test_module.py: test_module_for_testing_no_gamma
standard_constants
standard_constants.py
stores test-wide information that the user can
modify to impact the numerical result for their globals. It currently
only has one field, precision
, which determines how precise the values
for the globals are. It is by default set to 30
, which we've
determined to be a reasonable amount. This file has the ability to be
expanded upon in the future, but it is currently minimal.
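Based on the description above, the entire contents of standard_constants.py amount to a sketch like the following (the real file may carry additional comments):

```python
# standard_constants.py -- test-wide settings for UnitTesting.
# `precision` controls how many significant digits are used when
# evaluating the globals' expressions; 30 is the default.
precision = 30
```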
run_NRPy_UnitTests
run_NRPy_UnitTests.sh
is a bash script that acts as the hub for
running tests -- it's where the user specifies the tests they'd like to
be run. It keeps track of which tests failed by interacting with
failed_tests, giving the user easily readable output
from the file. It also has the option to automatically rerun the tests
that failed in DEBUG
mode if the boolean rerun_if_fail
is true
.
The script is run with the following syntax:
./UnitTesting/run_NRPy_UnitTests.sh [python interpreter]
This of course assumes that the user is in the nrpy directory; in general, the user simply specifies the path from their current directory to the bash file.
Examples of python interpreter
are python
and python3
.
The script first lets the user know if they forgot to pass a python
interpreter. Then if they didn't, it prints some baseline information
about Python variables: PYTHONPATH, PYTHONEXEC, and the PYTHONEXEC version.
failed_tests.txt
is then overwritten with the default information.
This ensures that each run of the script produces a fresh list of the
tests that failed; it wouldn't make sense to keep failures from previous runs.
The user can then change the boolean rerun_if_fail
if need be. Next,
the user can add tests using the add_test
function. The syntax is as
follows: add_test [path to test file]
Example:
add_test UnitTesting/Test_UnitTesting/test_module.py
Finally, the bash script will read any failures from failed_tests.txt
and, if rerun_if_fail
is true
, rerun those tests. It lastly prints
which tests failed in the same format as failed_tests.txt
, and if no
tests failed, a success message.
create_test
create_test
is a function that takes the following user-supplied
information:
module, the module to be tested
module_name, the name of the module
function_and_global_dict, a dictionary whose keys are functions and whose values are lists of globals
It uses this information to generate a test file that is automatically run by the bash script; this test file does all the heavy lifting in calling the function, getting expressions for all the globals, evaluating the expressions to numerical values, and storing the values in the proper trusted_values_dict.
create_test
additionally takes optional arguments logging_level
and
initialization_string_dict
, which respectively determine the desired
level of output (think verbosity) and run some Python code prior to calling
the specified function. Usage is as follows:
module = 'BSSN.BrillLindquist'
module_name = 'BrillLindquist'
function_and_global_dict = {'BrillLindquist(ComputeADMGlobalsOnly = True)': ['alphaCart', 'betaCartU', 'BCartU', 'gammaCartDD', 'KCartDD']}
create_test(module, module_name, function_and_global_dict)
The way to think of this is that the module to be tested is BSSN.BrillLindquist. The module_name is how you refer to this module -- it's a bit arbitrary, so whether you prefer BrillLindquist or bl, it won't change the computation. The function_and_global_dict contains entry 'BrillLindquist(ComputeADMGlobalsOnly = True)', which is the function that gets called in the module. Its value in the dictionary is a list of globals that get created when this function gets called.
Now let's add the optional arguments into the same example:
module = 'BSSN.BrillLindquist'
module_name = 'BrillLindquist'
function_and_global_dict = {'BrillLindquist(ComputeADMGlobalsOnly = True)': ['alphaCart', 'betaCartU', 'BCartU', 'gammaCartDD', 'KCartDD']}
logging_level = 'DEBUG'
initialization_string_dict = {'BrillLindquist(ComputeADMGlobalsOnly = True)': 'print("example")\nprint("Hello world!")'}
create_test(module, module_name, function_and_global_dict, logging_level=logging_level, initialization_string_dict=initialization_string_dict)
Now when create_test runs, the user will be given much more output due to the logging_level; additionally, the user-specified print will occur due to initialization_string_dict.
You may now be wondering why we use dictionaries to store this data
instead of simply having separate variables function
, global_list
,
and initialization_string
. This is where some of the power of this
testing method lies: we can test multiple functions and their globals
with ease! In other words, function_and_global_dict can contain multiple
entries, each a specific function call with its own associated list of
globals. Since not every function being tested must have an associated
initialization_string, we make an entry for each function optional. An
example is as follows:
module = 'BSSN.BrillLindquist'
module_name = 'BrillLindquist'
function_and_global_dict = {'BrillLindquist(ComputeADMGlobalsOnly = True)': ['alphaCart', 'betaCartU', 'BCartU', 'gammaCartDD', 'KCartDD'],
'BrillLindquist(ComputeADMGlobalsOnly = False)': ['alphaCart', 'betaCartU', 'BCartU', 'gammaCartDD', 'KCartDD']}
logging_level = 'DEBUG'
initialization_string_dict = {'BrillLindquist(ComputeADMGlobalsOnly = True)': 'print("example")\nprint("Hello world!")'}
create_test(module, module_name, function_and_global_dict, logging_level=logging_level, initialization_string_dict=initialization_string_dict)
Both instances will be called separately, with their own globals. The print statements will only be called for the first function, since there is no associated initialization_string for the second function.
An important note when using create_test
is that all arguments are
strings. This includes the module, module_name, function, each
global in the list of globals, logging level, and initialization_string.
The reason for making these fields strings is that when setting
module_name, for example, there doesn't exist anything in Python with
the name BrillLindquist. So, we wrap it in a string. This is true of
every input. Be careful with the dicts and lists, however: their
elements are strings; the dicts and lists themselves are not.
So what does this function actually do at a lower level? First, create_test
makes sure that all user arguments are of the correct type, failing if any are incorrect.
It then loops through every function in function_and_global_dict
, and creates file_string
, a string that represents a unit test file that will be executed. Along with the current function comes its global list, and initialization string if it has one.
Next, the contents of run_test are copied into file_string
so that the test will be automatically run, and so the user can easily see their unique file contents in case of an error.
A file is then created to house file_string
; this file is what gets called by the bash script. The file exists in the directory of the test being run, and is deleted upon test success, but left for the user to inspect upon test failure. file_string
has code to call setup_trusted_values_dict, initialize RepeatedTimer, and run run_test in order to ensure that the test will run correctly.
Finally, the test file is called using cmdline_helper, which runs the test and does everything described in run_test.
Once run_test
finishes, either the test failed or the test succeeded. file_string
has additional code to create a success.txt
file if the test passes, and do nothing otherwise. So, create_test
looks for the existence of success.txt
. If it exists, we delete it; create_test has finished with the current function and moves on to the next. Otherwise, if success.txt
doesn't exist, an exception is thrown to be caught later.
setup_trusted_values_dict
setup_trusted_values_dict
takes in a path to a test directory, path,
and checks whether or not a trusted_values_dict.py
exists in the test
directory. If it does exist, the function does nothing. If it doesn't
exist, setup_trusted_values_dict
creates the file
trusted_values_dict.py
in the test directory. It then writes the
following default code into the file:
from mpmath import mpf, mp, mpc
from UnitTesting.standard_constants import precision
mp.dps = precision
trusted_values_dict = {}
The default code allows the unit test to properly interact with and write to the file.
RepeatedTimer
RepeatedTimer
is a class that allows the user to automatically run some function every n
seconds. We specifically use the class for timed output, where we print every five minutes in order to prevent time-outs in Travis CI; if Travis CI doesn't receive any output for too long, the build automatically fails.
Since some NRPy modules take a long time to run due to their sheer size, this periodic output ensures that the build doesn't time out simply because a module is slow.
run_test
run_test
acts as the hub for an individual unit test. It takes in all
the module-wide information, and goes through all the steps of
determining whether the test passed or failed by calling many
sub-functions. A fundamentally important part of run_test
is the
notion of self
; self
stores a test's information (i.e. module
,
module_name
, etc.) to be able to easily pass information to and make
assertions in sub-functions. When self
is referenced, simply think
"information storage".
run_test
begins by importing the trusted_values_dict
of the current
module being tested; since setup_trusted_values_dict
is called before
run_test
, we know it exists.
run_test
then determines if the current function/module is being tested
for the first time, based on the existence of the proper entry in
trusted_values_dict
, and stores this boolean in first_time
.
evaluate_globals
is then run in order to generate
the SymPy expressions for each global being tested.
Next,
cse_simplify_and_evaluate_sympy_expressions
is called to turn each SymPy expression for each global into a random, yet predictable/repeatable, number.
The next step depends on the value of first_time
: if first_time
is True
, then first_time_print
is run to print the result both to the console and to the trusted_values_dict.py
. Otherwise, if first_time
is False
, calc_error
is called in order to compare the calculated values and the trusted values for the current module/function. If an error was found, the difference is printed and the code exits. Otherwise, the module completes and returns.
On its own, run_test
doesn't do much -- it's the subfunctions called by run_test
that do the heavy lifting in terms of formatting, printing, calculating, etc.
evaluate_globals
evaluate_globals
takes in self
and uses the following attributes:
self.module
self.module_name
self.initialization_string
self.function
self.global_list
evaluate_globals
first imports self.module
as a module object, instead of a simple string. It next runs self.initialization_string
, then creates an execution string string_exec
to be called; this execution string calls self.function
on self.module
, then gets the SymPy expressions for all globals defined in self.global_list
and returns a dictionary containing all the globals and their expressions.
cse_simplify_and_evaluate_sympy_expressions
cse_simplify_and_evaluate_sympy_expressions
takes in self
and uses the following attributes:
self.variable_dict
cse_simplify_and_evaluate_sympy_expressions
uses SymPy's cse algorithm to efficiently calculate a value for each expression by assigning a random, yet predictable and consistent, number to each variable in the variable dictionary.
cse_simplify_and_evaluate_sympy_expressions
first calls expand_variable_dict
on self.variable_dict
to expand the variable dictionary. This makes it much easier for the user to figure out which variables differ, if any.
Basic example:
variable_dict = {'betaU': [0, 10, 400]}
expand_variable_dict(variable_dict) --> {'betaU[0]': 0, 'betaU[1]': 10, 'betaU[2]': 400}
This may seem trivial and unnecessary for a small example, but once the tensors get to a higher rank, such as GammahatUDDdD
, which is rank 5 by NRPy naming convention, it's easy to see why the expansion is necessary for user clarity.
Next, cse_simplify_and_evaluate_sympy_expressions
loops through each variable in the expanded variable dict and adds that variable's free symbols to a set containing all free symbols from all variables. 'Free symbols' are simply the SymPy symbols that make up a variable.
Once we get this set of free symbols, we know we have every symbol that will be referenced when substituting in values. So, we assign each symbol in the set a random value derived deterministically from the symbol itself; this ensures a symbol will always be assigned the same random value across Python instances, operating systems, and Python versions. If the symbol is M_PI
or M_SQRT1_2
, however, we want to assign it its exact value; these symbols are given their true values of pi
and 1/sqrt(2)
, respectively.
cse_simplify_and_evaluate_sympy_expressions
then loops through the expanded variable dictionary in order to calculate a value for each variable. In order to optimize the calculation for massive expressions, we use SymPy's cse (common subexpression elimination) algorithm to greatly improve the speed of calculation. What this algorithm does in essence is factor out common subexpressions (i.e. a**2
) by storing them in their own variables (i.e. x0 = a**2
) so that any time a**2
shows up, we don't have to recalculate the value of the subexpression; we instead just replace it with x0
. This is a seemingly small optimization, but as NRPy expressions can become massive, the small efficiencies add up and make the calculation orders of magnitude faster.
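SymPy's cse function demonstrates this directly. A small illustrative example (not NRPy code):

```python
import sympy as sp

a, b = sp.symbols('a b')
expr = a**2 + b*a**2 + sp.sin(a**2)

# cse returns a pair: `replaced` is a list of (new_symbol, subexpression)
# tuples, and `reduced` holds the rewritten expression(s) in terms of the
# new symbols. Here the shared subexpression a**2 is factored out once.
replaced, reduced = sp.cse(expr)
new_sym, sub = replaced[0]                 # sub is a**2
restored = reduced[0].subs(new_sym, sub)   # substituting back recovers expr
assert sp.simplify(restored - expr) == 0
```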
Once the cse algorithm optimizes the calculation for a given variable, we now calculate a numerical result for the variable using calculate_value
and store it in a calculated dictionary. For numerical results near zero, we double check the calculation at a higher precision to see if the value should truly be zero.
Finally, after repeating this process for each variable, we return the calculated dictionary which stores numerical values for each variable.
expand_variable_dict
expand_variable_dict
takes in a variable dictionary variable_dict
, and returns an expanded variable dictionary. 'expanded' refers to taking all tensors in variable_dict
and breaking them down into their subcomponents according to the indices of the tensor. This is best illustrated with an example:
variable_dict = {'alpha': 0, 'betaU': [3.14, 1, -42]}
expand_variable_dict(variable_dict) --> {'alpha': 0, 'betaU[0]': 3.14, 'betaU[1]': 1, 'betaU[2]': -42}
variable_dict = {'aDD': [[1, 2, 3], [4, 5, 6], [7, 8, 9]]}
expand_variable_dict(variable_dict) --> {'aDD[0][0]': 1, 'aDD[0][1]': 2, 'aDD[0][2]': 3, 'aDD[1][0]': 4, 'aDD[1][1]': 5, 'aDD[1][2]': 6, 'aDD[2][0]': 7, 'aDD[2][1]': 8, 'aDD[2][2]': 9}
As we can see, expand_variable_dict
breaks up tensors into their indices. While this may not be obviously useful for low-rank tensors, high-rank tensors become overwhelmingly difficult to debug, and as such their expansion greatly eases the debugging process.
expand_variable_dict
starts out by looping through each variable in variable_dict
. For each variable, it first gets the dimension of the variable's expression list using get_variable_dimension
-- this is equivalent to getting the rank of the tensor.
expand_variable_dict
then initializes a counter of the correct dimension; the counter represents the current index of the expression list.
Next, expand_variable_dict
flattens the expression list into a rank-1 tensor using flatten
; this makes it simple to iterate through.
The flattened expression list is then iterated through, and each expression in the expression list is stored in a result dictionary result_dict
. To get the key for the entry into result_dict
, we pass the current variable and the counter into form_string, which gives us a string representation of an index of a variable. The value for the respective key is simply the current expression.
The counter is then incremented using increment_counter
to ensure a consistent naming scheme.
This process is repeated for all variables, and finally result_dict
is returned.
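The same expansion can be sketched compactly with recursion; the real implementation uses the get_variable_dimension/flatten/form_string/increment_counter machinery described below, but produces equivalent results:

```python
def expand_variable_dict(variable_dict):
    """Break every tensor in variable_dict into scalar entries keyed by
    index, e.g. {'betaU': [0, 10, 400]} -> {'betaU[0]': 0, ...}."""
    result_dict = {}

    def walk(name, value):
        if isinstance(value, list):
            # Descend into each index, extending the name as we go.
            for i, sub in enumerate(value):
                walk('{}[{}]'.format(name, i), sub)
        else:
            result_dict[name] = value

    for var, expression_list in variable_dict.items():
        walk(var, expression_list)
    return result_dict
```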
get_variable_dimension
get_variable_dimension
takes in a tensor tensor
and returns a tuple containing the rank of tensor
and the dimension of tensor
. It does this by repeatedly descending into the first element of the tensor until it reaches a rank-0 element; the number of descents is the rank of the tensor. The dimension of the tensor is simply the length of the innermost (rank-1) list. Example:
tensor = 42
get_variable_dimension(tensor) --> 0, 1
tensor = [1, 2, 3, 4, 5]
get_variable_dimension(tensor) --> 1, 5
tensor = [[1, 2], [3, 4]]
get_variable_dimension(tensor) --> 2, 2
get_variable_dimension
assumes that tensor
has the same dimension at all ranks.
This means it's square; its dimension can be written N x N x ... x N.
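A sketch matching the description and examples above:

```python
def get_variable_dimension(tensor):
    """Return (rank, dimension) of a square nested-list tensor."""
    rank, dim = 0, 1
    while isinstance(tensor, list):
        rank += 1
        dim = len(tensor)   # square tensor: every level has this length
        tensor = tensor[0]  # descend into the first element
    return rank, dim
```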
flatten
flatten
takes in a list l
and an optional argument fl that represents the already-flattened list
. The user need not pay attention to fl
: it's simply used for the recursive call, so the user should use flatten
with only l
as an argument. It 'unwraps' each element of l
by recursively calling flatten
while the current element is of type list
. Once an element is not of type list, it is appended to fl
, and once every element has been flattened, fl is returned
. Example:
l = [1, 2, 3]
flatten(l) --> [1, 2, 3]
l = [[1, 2], [3, 4]]
flatten(l) --> [1, 2, 3, 4]
l = [[[[[2], 3], 4, [5, 6], 7]], 8]
flatten(l) --> [2, 3, 4, 5, 6, 7, 8]
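A sketch of the recursion described above; note the None sentinel for fl, which avoids Python's mutable-default-argument pitfall:

```python
def flatten(l, fl=None):
    """Recursively flatten the nested list l. fl accumulates results and
    is only meant to be supplied by the recursive calls."""
    if fl is None:
        fl = []
    for element in l:
        if isinstance(element, list):
            flatten(element, fl)  # unwrap nested lists
        else:
            fl.append(element)
    return fl
```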
form_string
form_string
takes in a variable var
and a counter counter
and returns the string representation of the counter
'th index of var
. Example:
var = 'alpha'
counter = [0, 0]
form_string(var, counter) --> 'alpha[0][0]'
var = 'gammauDDdD'
counter = [2, 3, 4, 1, 0]
form_string(var, counter) --> 'gammauDDdD[2][3][4][1][0]'
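This is a one-liner in spirit; a sketch consistent with the examples above:

```python
def form_string(var, counter):
    """Build 'var[i][j]...' from the digits of counter."""
    return var + ''.join('[{}]'.format(i) for i in counter)
```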
increment_counter
increment_counter
takes in a counter counter
and a length length
and returns a new counter which is equivalent to (counter
+ 1) in base length
.
increment_counter
loops through counter
in reverse, adds 1
to the last digit, and then checks if any digit is now equal to length
-- if it is, increment the next digit as well, and so on.
It then returns the un-reversed new counter which represents an iteration of counter
.
Example:
counter = [0, 0]
length = 2
increment_counter(counter, length) -> [0, 1]
counter = [0, 1]
length = 2
increment_counter(counter, length) -> [1, 0]
counter = [2, 3, 4, 4, 4]
length = 5
increment_counter(counter, length) -> [2, 4, 0, 0, 0]
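The carry logic above can be sketched as:

```python
def increment_counter(counter, length):
    """Return counter + 1, treating counter as a base-`length` number."""
    result = list(counter)
    for i in reversed(range(len(result))):
        result[i] += 1
        if result[i] == length:
            result[i] = 0  # this digit overflowed: carry into the next
        else:
            break          # no carry needed -- we're done
    return result
```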
calculate_value
calculate_value
takes in a dictionary of free symbols free_symbols_dict
, the outputs from SymPy's cse algorithm replaced
and reduced
, and an optional argument precision_factor
which is 1
by default.
calculate_value
first sets reduced
to reduced[0]
-- this is to remove extraneous output from the cse algorithm. It then sets the precision to standard_constants.precision * precision_factor
. This is why precision_factor
is optional and defaults to 1
-- most of the time we want the standard precision.
calculate_value
next loops through replaced
. replaced
is a list of (new symbol, old expression) pairs; in essence, we set each new symbol to the value we get by calculating its old expression. This is equivalent to calculating numerical values for all the cse optimizations instead of expressions.
Now that we have values for all the new expressions, we can substitute these into the overall expression given to us by reduced
. After doing this, we finally have a value for our expression that we arrived at as optimally as possible.
Finally, we store this value as an mpf
-- if it is actually complex, we store it as a mpc
. The precision is then set back to standard_constants.precision
before the final value is returned.
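The substitution loop can be sketched as follows. For brevity this sketch returns a plain float, whereas the real function evaluates with mpmath at standard_constants.precision * precision_factor digits and returns an mpf (or mpc):

```python
import sympy as sp


def calculate_value(free_symbols_dict, replaced, reduced):
    """Evaluate a cse-optimized expression: first compute a value for
    each common subexpression, then substitute everything into the
    reduced expression."""
    reduced = reduced[0]              # cse wraps the rewritten expr in a list
    values = dict(free_symbols_dict)  # symbol -> numeric value
    for new_symbol, old_expression in replaced:
        # Each subexpression refers only to free symbols and earlier
        # subexpressions, all of which already have values.
        values[new_symbol] = old_expression.subs(values)
    return float(reduced.subs(values))


a, b = sp.symbols('a b')
replaced, reduced = sp.cse(a**2 + b*a**2)
print(calculate_value({a: 2, b: 3}, replaced, reduced))  # a**2*(1 + b) = 16.0
```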
create_dict_string
create_dict_string
takes in a dictionary of values and returns a well-formatted string representation of the dictionary. It ensures that a consistent, repeatable output is returned from two identical dictionaries.
create_dict_string
sorts the dictionary of values, loops through each (key, value) pair of the dictionary, and creates a string based on the type of the value (mpf, mpc, or other). By doing this, it creates a string representation of the dictionary of values that can be printed to a trusted_values_dict
.
first_time_print
first_time_print
takes in self
and a boolean, write
, and uses the following attributes:
self.module_name
self.trusted_values_dict_name
self.calculated_dict
self.path
first_time_print
prints the string that should be copied into your trusted_values_dict
to the console. If write
is True
, it additionally automatically appends the string to the trusted_values_dict.py
.
calc_error
calc_error
takes in self
and uses the following attributes:
self.calculated_dict
self.trusted_values_dict_entry
self.module_name
calc_error
loops through each variable in self.calculated_dict
and self.trusted_values_dict_entry
, and compares the variables' values to ensure that no variable differs.
calc_error
first makes sure that self.calculated_dict
and self.trusted_values_dict_entry
have the same entries; that is, they contain values for the same variables. If any variables differ, those variables are printed and the function returns False
.
If the dictionaries have the same entries, we must compare the two values for a given variable. We compare the values by seeing by how many decimal places the values differ; if they differ by more than (standard_constants.precision
/ 2) decimal places, then we consider the values to be different. The reason for this method of comparison is the inconsistency of floating point calculations; if we simply checked for equality, we'd consistently run into errors when in reality the values differed by a minimal amount -- say 2 * 10 ** -30
-- which is close enough to have arisen from the same calculation.
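The decimal-place comparison can be sketched as below (assuming precision = 30, as in standard_constants; this sketch measures the absolute difference, and the real comparison may normalize differently):

```python
from mpmath import mp, mpf, fabs, log10

mp.dps = 30
precision = 30  # standard_constants.precision


def values_agree(trusted, calculated):
    """Return True if the two values are exactly equal or agree to more
    than precision/2 decimal places."""
    if trusted == calculated:
        return True
    # -log10 of the difference counts decimal places of agreement.
    places_of_agreement = -log10(fabs(trusted - calculated))
    return places_of_agreement > precision / 2
```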
We repeat this process for each variable, storing the variables who differed in a list. After each variable has been checked, we see if the list of differing variables is empty. If it's empty, no variables differed -- so the test passed -- and we return True
. Otherwise, if it's not empty, we print the contents of the list -- the variables that differed. We additionally print a new trusted_values_dict
entry (similar to first_time_print) that the user can look at and determine if it's correct or not. The function then returns False
, as the test failed.