#!/usr/bin/env python # coding: utf-8 # # Uncertainty propagation # # # This notebook is an element of the free [risk-engineering.org courseware](https://risk-engineering.org/). It can be distributed under the terms of the [Creative Commons Attribution-ShareAlike licence](https://creativecommons.org/licenses/by-sa/4.0/). # # Author: Eric Marsden # # --- # # This notebook contains an introduction to use of Python and NumPy for uncertainty propagation. It compares the use of the [Python uncertainties library](https://pythonhosted.org/uncertainties/) with a stochastic simulation. See the [associated course materials](https://risk-engineering.org/uncertainty-concepts/) for background information and to download this content as a Jupyter/Python notebook. # The *body mass index* (BMI) is the ratio $\frac{\text{body mass } (kg)}{\text{body height} (m)^2}$. It is often used as an (imperfect) indicator or obesity or malnutrition. # # **Task**: calculate your BMI and the associated uncertainty interval, assuming that: # # - your weight scale tells you that you weigh 84 kg (precision shown to the nearest kilogram) # # - a tape measure says you are between 181 and 182 cm tall (most likely value is 181.5 cm) # ## Solution using the uncertainties library # The `uncertainties` library implements [**linear error propagation theory**](https://en.wikipedia.org/wiki/Propagation_of_uncertainty) in Python. The uncertainties library can be installed (if you’re using a Linux machine) using a command such as # # > sudo pip3 install uncertainties # # If you’re using Google CoLaboratory, execute a code cell that contains # # > !pip install uncertainties # # On a Microsoft Windows installation with Anaconda, open the "Anaconda Powershell Prompt" from the start menu, and execute # # > pip install uncertainties # In[1]: import uncertainties x = uncertainties.ufloat(10, 0.5) x # This defines an uncertain number (a random variable) with a central value of 10 and a standard deviation of 0.5. The standard deviation represents the uncertainty (which in this context you can think of as the error), in our number. The uncertainties package is able to do various types of arithmetic and other mathematical operations on these uncertain numbers, and propagates the uncertainty to the result. For example, we can multiply this uncertain number by two: # In[2]: 2 * x # Note that the uncertainty range concerning our value has been multiplied by two, along with the value itself. # # We can add the uncertain number to itself: # In[3]: x + x + x # We can also add a very uncertain number to `x`: # In[4]: y = uncertainties.ufloat(1, 10) x + y # or we can add a certain number (a number with zero uncertainty) to `x`: # In[5]: x + 5 # Note that the uncertainties package is able to determine that the error range in the expression below is zero (this is because the subtracted quantities are **correlated** to the first value). # In[6]: 2 * x - x - x # The uncertainties package is also able to propagate uncertainty through more complicated mathematical expressions: # In[7]: import uncertainties.umath x**2 + 45*uncertainties.umath.sin(x + y) # You can use it as a convenient calculator for mathematical expressions including uncertainty. The library is also useful to add uncertainty propagation to existing Python code, which you can often reuse without any modifications. # # Coming back to our initial objective, recall that we want to calculate the **body mass index** (BMI) of a person, using the formula given above, from uncertain measurements of the person’s mass and height. The description of measurements given in the introduction indicates that: # # - weight is a uniform probability distribution between 83.5 and 84.5 kg # # - height is a triangular probability distribution between 1.81 and 1.82 m with a central value of 1.815. # # To create appropriate random variables using the uncertainties package, we need to know their central value and standard deviation. We can calculate the standard deviations using `scipy.stats`. # In[8]: import scipy.stats # this is for the weight random variable (a uniform distribution) weight_stdev = scipy.stats.uniform(83.5, 1).std() weight_stdev # In[9]: # this is for the height random variable (a triangular distribution) height_stdev = scipy.stats.triang(loc=1.81, scale=0.01, c=0.5).std() height_stdev # In[10]: weight = uncertainties.ufloat(84, weight_stdev) height = uncertainties.ufloat(1.815, height_stdev) BMI = weight/(height**2) print("{:P}".format(BMI)) # ## Solution using a Monte Carlo simulation # We can also estimate the body mass index using a Monte Carlo (stochastic simulation) method. Our [slides on Monte Carlo methods for risk analysis](https://risk-engineering.org/monte-carlo-methods/) present some background material on this method. # In[11]: import numpy from numpy.random import * import matplotlib.pyplot as plt plt.style.use("bmh") get_ipython().run_line_magic('config', 'InlineBackend.figure_formats=["svg"]') # When undertaking statistical analyses, it can be important for your results to be *reproducible*, meaning that you obtain the same results each time you run the analysis. The pseudorandom number generator used by NumPy generates long sequences of outputs that are fully determined by the *seed* of the generator (if you use the same seed, you will obtain the same results each time you run the notebook). In default setting, the seed is taken from some unpredictable feature of the computer environment (such as the `/dev/random` device on a Linux/Unix computer), so sucessive runs will generate different results. If you want a reproducible notebook, you can set the random seed explicitly. # In[12]: numpy.random.seed(42) # In[13]: # N is the number of runs in our stochastic simulation N = 10_000 def BMI() -> float: return uniform(83.5, 84.5) / triangular(1.81, 1.815, 1.82)**2 sim = numpy.zeros(N) for i in range(N): sim[i] = BMI() print("{} ± {}".format(sim.mean(), sim.std())) plt.hist(sim, alpha=0.5); # Indeed, the result from the stochastic simulation is very close to that obtained using the uncertainties package. # ## Limitations # # Note: the uncertainties package uses linear error propagation, and is only appropriate when the level of uncertainty is small (it ignores any non-linear terms in the derivatives of each operation in your calculation, which are used to propagate the uncertainties analytically). It also assumes that the uncertainty (or error) in your data follows a normal distribution, which is a good assumption when dealing with measurement errors for example, but is not appropriate in all cases. In particular, it won’t be suitable if your uncertainty is asymmetric, or if you need to truncate the uncertainty in some way (for example if you know that your measurements are definitely positive). The documentation provides more information on [the underlying assumptions](https://pythonhosted.org/uncertainties/tech_guide.html). # # Two related but more complex libraries are [soerp](https://pypi.python.org/pypi/soerp) which implements second-order error propagation and [mcerp](https://pypi.python.org/pypi/mcerp) which implements Monte Carlo error propagation. # In[ ]: