```python
# WARNING: advised to install a specific version, e.g. tensorwaves==0.1.2
%pip install -q tensorwaves[doc,jax,pwa,viz] IPython
```
```python
import os

STATIC_WEB_PAGE = {"EXECUTE_NB", "READTHEDOCS"}.intersection(os.environ)
```
```{autolink-concat}
```
One of the {class}`.Estimator`s in the {mod}`.estimator` module is {class}`.ChiSquared`. This estimator is useful if you have a set of $n$ measured values $\mathbf{y}=\left\{y_1, y_2, \dots, y_n\right\}$ for $m$-dimensional data points $\mathbf{x}=\left\{x_{j,1}, x_{j,2}, \dots, x_{j,n}\right\}$, with $j\in\left\{1, \dots, m\right\}$.
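Concretely, for a model $f_{\mathbf{p}}$ with parameters $\mathbf{p}$, a $\chi^2$ estimator returns the sum of squared deviations between the model predictions and the measured values (written here in its standard, unweighted form; a per-point weight $w_i$ can multiply each term):

$$
\chi^2(\mathbf{p}) = \sum_{i=1}^n \left(f_{\mathbf{p}}(\mathbf{x}_i) - y_i\right)^2.
$$

Minimizing this value over $\mathbf{p}$ yields the best-fit parameters.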
To illustrate how to fit an expression to some data sample $\mathbf{x},\mathbf{y}$, we'll generate 'observed' values $\mathbf{y}$ for the following 1-dimensional polynomial.
```python
import sympy as sp

a, b, c, x = sp.symbols("a b c x")
expression = a + b * x + c * x**2
expression
```
From this expression, we create a {class}`.ParametrizedFunction`, where the symbols $a,b,c$ are interpreted as its parameters. The values here are chosen arbitrarily to generate the $\mathbf{y}$-values in the next step.
```python
from tensorwaves.function.sympy import create_parametrized_function

function = create_parametrized_function(
    expression,
    parameters={a: 17, b: -2, c: -0.8},
    backend="jax",
)
```
Next, we uniformly generate data points $\mathbf{x}$ over a line segment. The corresponding values for $\mathbf{y}$ are computed with the polynomial function above and smeared with a small, normally distributed offset.
```python
import numpy as np


def smear_gaussian(array, sigma, rng):
    return array + rng.normal(scale=sigma, size=len(array))


rng = np.random.default_rng(seed=0)
sample_size = 500
x_min, x_max = -5, +5
x_values = rng.uniform(x_min, x_max, sample_size)
data = {"x": x_values}
observed_y = function(data)
observed_y = smear_gaussian(observed_y, sigma=5, rng=rng)
```
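Since the model is a plain polynomial in one variable, an ordinary least-squares fit with NumPy offers a quick, backend-free cross-check on such a sample. The sketch below regenerates the same data (the duplication keeps it self-contained) and should recover parameters close to $(a, b, c) = (17, -2, -0.8)$:

```python
import numpy as np


def smear_gaussian(array, sigma, rng):
    return array + rng.normal(scale=sigma, size=len(array))


# Regenerate the same sample as above
rng = np.random.default_rng(seed=0)
x_values = rng.uniform(-5, +5, 500)
y_true = 17 - 2 * x_values - 0.8 * x_values**2
observed_y = smear_gaussian(y_true, sigma=5, rng=rng)

# np.polyfit returns coefficients from highest degree to lowest
c_fit, b_fit, a_fit = np.polyfit(x_values, observed_y, deg=2)
```

In the notebook itself, calling `np.polyfit` directly on the existing `x_values` and `observed_y` arrays gives the same check.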
To make the fit a bit more interesting, we give the {attr}`~.ParametrizedFunction.parameters` different values than the ones used to generate the $\mathbf{y}$-values.
```python
original_parameters = function.parameters
initial_parameters = {"a": -25, "b": 1.5, "c": 2.6}
function.update_parameters(initial_parameters)
```
```python
%config InlineBackend.figure_formats = ['svg']
import matplotlib.pyplot as plt


def compare_model(function, x_values, observed_y):
    _, ax = plt.subplots(figsize=(8, 4))
    linear_domain = {"x": np.linspace(x_min, x_max, 100)}
    ax.plot(
        linear_domain["x"],
        function(linear_domain),
        c="red",
        linewidth=3,
        label="fit model",
    )
    ax.scatter(x_values, observed_y, s=2, label="generated data")
    ax.set_xlabel("$x$")
    ax.set_ylabel("$y$")
    ax.set_ylim([-30, 50])
    ax.legend(loc="upper left")
    plt.show()
```
```python
compare_model(function, x_values, observed_y)
```
Finally, we construct a {class}`.ChiSquared` estimator and use it to optimize the {class}`.ParametrizedFunction` with regard to the $\mathbf{x},\mathbf{y}$ data points.
```python
from tensorwaves.estimator import ChiSquared
from tensorwaves.optimizer import Minuit2

estimator = ChiSquared(function, data, observed_y, backend="jax")
optimizer = Minuit2()
fit_result = optimizer.optimize(estimator, initial_parameters)
fit_result
```
```python
assert fit_result.minimum_valid
```
The optimized parameters in the {class}`.FitResult` are indeed comparable to the original parameter values with which the data was generated:
```python
original_parameters
```
```python
compare_model(function, x_values, observed_y)
```
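As a final, backend-free sanity check (a standalone sketch that regenerates the same sample with plain NumPy), the residuals of the generating polynomial should be consistent with the smearing width $\sigma = 5$, that is, a reduced $\chi^2$ close to one:

```python
import numpy as np


def smear_gaussian(array, sigma, rng):
    return array + rng.normal(scale=sigma, size=len(array))


# Regenerate the same sample as in the notebook
rng = np.random.default_rng(seed=0)
x_values = rng.uniform(-5, +5, 500)
y_true = 17 - 2 * x_values - 0.8 * x_values**2
observed_y = smear_gaussian(y_true, sigma=5, rng=rng)

# Residuals with respect to the generating polynomial,
# normalized by the known smearing width
sigma = 5
residuals = observed_y - y_true
reduced_chi2 = np.sum((residuals / sigma) ** 2) / len(residuals)
```

A value far above one would indicate that the model underfits the data (or that the assumed $\sigma$ is too small); a value far below one would suggest overfitting.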