iminuit: Past and Future¶

Hans Dembinski, TU Dortmund, Germany

PyPI	https://pypi.org/project/iminuit
Code	https://github.com/scikit-hep/iminuit
Documentation	http://iminuit.readthedocs.org
Gitter	https://gitter.im/Scikit-HEP/community

Introduction to iminuit¶

iminuit is a frontend to Minuit¶

Minuit: General purpose tool to minimize statistical cost functions with box constraints and to compute parameter uncertainties (fairly unique to Minuit)
Robust technology from 1975 that stood the tests of time
Python frontend to MINUIT2 C++ code used by ROOT
- Uses original ROOT code
- MINUIT is driving backend in virtually every HEP analysis

Minimization: Migrad algorithm¶

Newton-steps with along-the-way updated hessian matrix
Fallback to steepest descent with line search
Various heuristics to deal with sticky situations
Box-constraints implemented via parameter transformations
Stopping criterion: Estimated distance to minimum (EDM), very well suited to statistics problems

Uncertainty computation: Hesse and Minos algorithms¶

Hesse algorithm, 1D intervals and ND contours
Minos algorithm, 1D intervals and ND contours
Also respect box-constraints
Different pros and cons, Hesse recommended as default (more later)

A bit of (version) history¶

Fortran¶

Original MINUIT in Fortran (1975)

C/C++¶

TMinuit: Original ported to C and integrated into ROOT (~2000)
Minuit2 (aka SEAL Minuit)
- Ambitious rewrite of MINUIT in C++ (2002-2006)
- Originally standalone project, now integrated into ROOT

Python¶

pyminuit, pyminuit2
- Original Minuit Python wrappers by Jim Pivarski
- Feature highlight: auto-detection of parameter names in functions
iminuit
- Rewrite using Cython by Piti Ongmongkolkul
- Interface-compatible with pyminuit
- i for interactive: Pretty display of fit status in Jupyter notebooks
- Intermediate maintainer Christoph Deil (leading developer of gammapy)
- HD took over as maintainer in 2018
- iminuit joined Scikit-HEP
- Version 1.3
  - Tighter intergration with numpy
    - Support functions that accept parameters as numpy arrays
    - Access to parameters and covariance matrix as numpy arrays
  - Support for modifying fit state between fitting steps (key feature of MINUIT)
    - Fix and release parameters
    - Change parameter values
  - Better code sharing with ROOT C++ (credit: Henry Schreiner, GooFit)
  - Prebuild Python wheels and conda packages for every release (credit: Henry Schreiner, Chris Burr, Matt Craig, Brigitta Sipőcz, Matthew Becker)
  - Added lots and lots of documentation and tutorials
    - Try iminuit tutorials on Binder (credit: Matthew Feickert)
  - Lots of bug-fixes in the interface
  - Allow computing "Hesse" uncertainties without prior call to Migrad
- Version 1.4 (current version 1.4.7)
  - Further pythonization of interface
    - Duplicated and legacy interface deprecated
    - Replaced getter/setters with properties
    - Make similar objects work alike
    - Comply with Python/Numpy expectations (e.g. dict-like objects support dict interface)
    - Slicing and broadcasting syntax for Minuit.values, Minuit.errors, Minuit.fixed (more later)
    - Increasing consistency pays off
      - Reduces mental load
      - Makes iminuit easier to teach and learn
  - PDG-rule rounding for values with errors
  - Cost functions (more later)
  - C++ Minuit2 updated to ROOT v6.23-01 (fixes some long-standing minor cornercase bugs)

Deprecated interface¶

Minuit.list_of_fixed_params(), Minuit.list_of_vary_params(): use Minuit.fixed
Minuit.migrad_ok(): use Minuit.valid
Minuit.matrix_accurate(): use Minuit.accurate
Minuit.get_fmin(): use Minuit.fmin
Minuit.get_param_states(): use Minuit.param
Minuit.get_initial_param_states(): use Minuit.init_param
Minuit.get_num_call_fcn(): use Minuit.ncalls_total
Minuit.get_num_call_grad(): use Minuit.ngrads_total
Minuit.hesse(maxcall=…) keyword: use ncall=… like in Minuit.migrad()
Minuit.edm: use Minuit.fmin.edm
Minuit.is_fixed: replaced by .fixed attribute
Minuit.set_strategy: assign to Minuit.strategy instead
Minuit.set_errordef: assign to Minuit.errordef instead
Minuit.set_print_level: assign to Minuit.print_level instead
Minuit.print_param_states, Minuit.print_initial_param_states, Minuit.print_fmin, Minuit.print_matrix, Minuit.print_param, Minuit.print_initial_param, Minuit.print_all_minos: use print() on the respective attributes instead

Why use iminuit and not X (X = RooFit, pyhf, zfit ...)?¶

If X works for you, good! No reason to change from X to iminuit
Learning iminuit is simple and it is universally applicable
Useful in teaching courses on fitting
Good enough for simple fits
Good for experts and power users who need full control over all aspects of the fit

VS

iminuit and a simple fit¶

A quick demo how to do a simple least-squares fit with iminuit.

In [1]:

import numpy as np
from matplotlib import pyplot as plt
from iminuit import Minuit

In [2]:

# make 10 data points with scatter on y-coordinate

rng = np.random.default_rng(1)

x = np.linspace(0, 1, 10)
ye = np.ones_like(x) * 0.2
y = rng.normal(2 * x + 1, ye, size=len(x))

plt.errorbar(x, y, ye, fmt="o")
plt.xlabel("x")
plt.ylabel("y");

To recover the parameters of the line, we need to write a model and a cost function. The model predicts a y value for an x value, using a set of parameters. The cost function must compute some kind of distance between the model predictions and the observations.

Optimal for this case is the least-squares method, where the cost function is the sum of squared studentized residuals.

In [3]:

# line model
def model(x, a, b):
    return a + b * x

def cost(a, b):
    ym = model(x, a, b)
    res = (y - ym) / ye # studentized residuals
    return np.sum(res ** 2)

# initialize Minuit object by passing the cost function
# - Minuit uses a local minimizer, we need to set starting values
# - Minuit also needs an `errordef` parameter (more later)
m = Minuit(cost, a=0, b=0, errordef=1)
m.params

Out[3]:

	Name	Value	Hesse Error	Minos Error-	Minos Error+	Limit-	Limit+	Fixed
0	a	0.0	0.1
1	b	0.0	0.1

Note: Minuit detected that the cost function has two parameters named a and b.

The Minuit object represents the current state of the fit. The initial state consists of the starting values and some step sizes (here 0.1) for the numerical gradient computation. You can set the step sizes yourself, with error_a=... etc, but a good default is selected for you if you do not.

Minuit makes it easy to interact with the fitter, which is useful for debugging and to manually help the minimizer to find the minimum. It also presents useful diagnostics that indicate if something went wrong.

In [4]:

# calling Migrad minimizer 
m.migrad()

Out[4]:

FCN = 3.959		Ncalls = 34 (34 total)
EDM = 1.09e-22 (Goal: 0.0002)		up = 1.0
Valid Min.	Valid Param.	Above EDM	Reached call limit
True	True	False	False
Hesse failed	Has cov.	Accurate	Pos. def.	Forced
False	True	True	True	False

	Name	Value	Hesse Error	Minos Error-	Minos Error+	Limit-	Limit+	Fixed
0	a	1.05	0.12
1	b	1.99	0.20

Values and errors are rounded according to the PDG rounding rules.

The "Hesse Errors" now show estimated uncertainties of the parameters. Migrad computes these mostly for free as part of its minimization process.

Things to note:

EDM: estimated distance to minimum, must be smaller than "Goal"; shows that Migrad converged
Ncalls: How many function calls were used by Migrad (and the total by this Minuit object so far)
Pos. def.: Whether the Hessian matrix (matrix of second derivatives) is positive definite; it must be for a valid minimum

You can get the values from the Minuit.fmin attribute.

In [5]:

plt.errorbar(x, y, ye, fmt="o")
plt.xlabel("x")
plt.ylabel("y")
plt.plot(x, model(x, *m.values[:]), color="k"); # trick to access values as a tuple

In [6]:

print(repr(m.fmin)) # repr() disables the usual pretty printing

FMin(fval=3.9594362732650183, edm=1.0859678150194733e-22, tolerance=0.1, nfcn=34, ncalls=34, up=1.0, is_valid=True, has_valid_parameters=True, has_accurate_covar=True, has_posdef_covar=True, has_made_posdef_covar=False, hesse_failed=False, has_covariance=True, is_above_max_edm=False, has_reached_call_limit=False)

"Minos Errors" are not automatically computed. You need to run Minos explicitly.

In [7]:

m.minos()

Out[7]:

	a		b
Error	-0.12	0.12	-0.2	0.2
Valid	True	True	True	True
At Limit	False	False	False	False
Max FCN	False	False	False	False
New Min	False	False	False	False

In such a simple fit, the Minos and Hesse uncertainty estimates are equal, but in non-parabolic cost functions they are not.

The Minos and Hesse uncertaintes are now nicely comparable in the parameter display:

In [8]:

m.params

Out[8]:

	Name	Value	Hesse Error	Minos Error-	Minos Error+	Limit-	Limit+	Fixed
0	a	1.05	0.12	-0.12	0.12
1	b	1.99	0.20	-0.20	0.20

The Minos uncertainties can be read-out with the following attribute (read-only).

In [9]:

a_me = m.merrors["a"] # returns a data struct with many fields
print(f"Did Minos work for parameter {a_me.name}? {a_me.is_valid}\n"
      f"lower error={a_me.lower:.3} upper error={a_me.upper:.3}")

Did Minos work for parameter a? True
lower error=-0.118 upper error=0.118

Much more information can be pulled from this object. Try dir(a_me).

Interactively manipulate fit state¶

Let's mess around. Let's say we know that a = 1, what is the value of b then? We can fix a and set its value, and then run Migrad again.

We can manipulate Minuit with the attributes m.values, m.fixed, and m.errors (to set different step sizes).

In [10]:

m.values["a"] = 1
m.fixed["a"] = True
m.migrad()

Out[10]:

FCN = 4.127		Ncalls = 8 (66 total)
EDM = 1.51e-20 (Goal: 0.0002)		up = 1.0
Valid Min.	Valid Param.	Above EDM	Reached call limit
True	True	False	False
Hesse failed	Has cov.	Accurate	Pos. def.	Forced
False	True	True	True	False

	Name	Value	Hesse Error	Minos Error-	Minos Error+	Limit-	Limit+	Fixed
0	a	1.00	0.12	-0.12	0.12			yes
1	b	2.06	0.11	-0.20	0.20

This took Minuit only 8 calls, because it was already close to the minimum and because it could reuse information from the previous minimization. This is another key feature of Minuit and iminuit.

The value and error of b changed a bit. Let's see how b varies when we scan over a.

In [11]:

a = np.linspace(0.5, 1.5, 20)
b = []
be = []
fval = []
for ai in a:
    m.values["a"] = ai
    m.migrad()
    assert m.valid # assert that fit is valid
    fval.append(m.fval) # get value of cost function
    b.append(m.values["b"])
    be.append(m.errors["b"])

# values and uncertainty of b as a function of a, also shown is the function minimum
plt.errorbar(a, b, be, fmt="o", color="C0")
plt.twinx()
plt.plot(a, fval, color="C1");

The uncertainty estimate does not change, because the cost function has a constant second derivative, and Minuit uses the second derivative by default to estimate the uncertainty ("Hesse method").

Let's see how the corresponding lines look like.

In [12]:

plt.errorbar(x, y, ye, fmt="o")
for i, (ai, bi) in enumerate(zip(a, b)):
    plt.plot(x, model(x, ai, bi), color=f"{i / 30.}")

Mechanical analog: stiff rod attached to the data points with springs of constant tension. If both a and b are minimized, Migrad finds the minimum energy configuration. This is also true, if a is externally constrained to other values (fixed during minimization).

Cost functions¶

Minuit now comes with builtin cost functions, which can be imported from iminuit.cost. Currently available:

Unbinned negative log-likelihood
Extended unbinned negative log-likelihood
Binned negative log-likelihood
Extended binned negative log-likelihood
Least-squares
Robust least-squares

The errordef parameter is automatically set correctly for these.

As a particle physicist, you will often fit some peak over background. Let's do that.

In [13]:

from scipy import stats

rng = np.random.default_rng(1)

x = np.append(stats.norm(0.5, 0.05).rvs(size=500, random_state=rng),
              stats.expon(0.0, 0.5).rvs(size=1000, random_state=rng))

plt.hist(x, bins=30, range=(0, 1));

In [14]:

# extended binned maximum likelihood fit
from iminuit.cost import ExtendedBinnedNLL

n, xe = np.histogram(x, bins=30, range=(0, 1)) # or use boost_histogram here!

# ExtendedBinnedNLL wants a model that returns the expected counts per bin

def signal(xe, ns, mu, sigma):
    return ns * stats.norm(mu, sigma).cdf(xe)

def background(xe, nb, lambd):
    return nb * stats.expon(0.0, lambd).cdf(xe)

def total(xe, ns, mu, sigma, nb, lambd):
    return signal(xe, ns, mu, sigma) + background(xe, nb, lambd)

cost = ExtendedBinnedNLL(n, xe, total)

m = Minuit(cost, ns=1, mu=1, sigma=1, nb=1, lambd=1)
m.migrad()

Out[14]:

FCN = 4351		Ncalls = 106 (117 total)
EDM = 5.04e+03 (Goal: 0.0001)		up = 0.5
Valid Min.	Valid Param.	Above EDM	Reached call limit
False	True	True	False
Hesse failed	Has cov.	Accurate	Pos. def.	Forced
False	True	False	False	True

	Name	Value	Hesse Error	Minos Error-	Minos Error+	Limit-	Limit+	Fixed
0	ns	1.0	0.5
1	mu	1.0	0.4
2	sigma	0.55	0.13
3	nb	1.0	0.2
4	lambd	0.6	0.5

Uh oh, this does not look good. Let's do this again in verbose mode.

In [15]:

cost.verbose = 1
m = Minuit(cost, ns=1, mu=1, sigma=1, nb=1, lambd=1)
m.migrad()

(1.0, 1.0, 1.0, 1.0, 1.0) -> 4674.964736477243
(1.0004721689651892, 1.0, 1.0, 1.0, 1.0) -> 4674.747479989268
(0.9995278310348108, 1.0, 1.0, 1.0, 1.0) -> 4675.182029100815
(1.0037087473289117, 1.0, 1.0, 1.0, 1.0) -> 4673.259222937664
(0.9962912526710882, 1.0, 1.0, 1.0, 1.0) -> 4676.672479454253
(1.0, 1.0004721689651892, 1.0, 1.0, 1.0) -> 4675.082012794044
(1.0, 0.9995278310348108, 1.0, 1.0, 1.0) -> 4674.847538457714
(1.0, 1.0025195417256905, 1.0, 1.0, 1.0) -> 4675.591439464525
(1.0, 0.9974804582743095, 1.0, 1.0, 1.0) -> 4674.340262923246
(1.0, 1.0, 1.0004721689651892, 1.0, 1.0) -> 4675.108461158805
(1.0, 1.0, 0.9995278310348108, 1.0, 1.0) -> 4674.820982318544
(1.0, 1.0, 1.0041063146339098, 1.0, 1.0) -> 4676.2136721107645
(1.0, 1.0, 0.9958936853660902, 1.0, 1.0) -> 4673.713571570834
(1.0, 1.0, 1.0, 1.0004721689651892, 1.0) -> 4674.532834946704
(1.0, 1.0, 1.0, 0.9995278310348108, 1.0) -> 4675.3967755808835
(1.0, 1.0, 1.0, 1.001900763621876, 1.0) -> 4673.2269103671
(1.0, 1.0, 1.0, 0.998099236378124, 1.0) -> 4676.704792024484
(1.0, 1.0, 1.0, 1.0, 1.0004721689651892) -> 4675.230740501793
(1.0, 1.0, 1.0, 1.0, 0.9995278310348108) -> 4674.6986609070855
(1.0, 1.0, 1.0, 1.0, 1.0026357449497056) -> 4676.448710524963
(1.0, 1.0, 1.0, 1.0, 0.9973642550502942) -> 4673.478533005953
(1.0, 1.0, 0.9958936853660902, 1.0, 1.0) -> 4673.713571570834
(1.0, 1.0, 0.9794684268304512, 1.0, 1.0) -> 4668.687788287826
(1.0, 1.0, 0.9384052804913535, 1.0, 1.0) -> 4655.999021917714
(1.0, 1.0, 0.8152158414740606, 1.0, 1.0) -> 4617.747455906111
(1.0, 1.0, 0.4456475244221817, 1.0, 1.0) -> 4580.859682709635
(1.0, 1.0, 0.5137150129086776, 1.0, 1.0) -> 4563.82573900133
(1.0, 1.0, 0.5874484732019645, 1.0, 1.0) -> 4564.126392536502
(1.0, 1.0, 0.5494450164593476, 1.0, 1.0) -> 4562.090551203665
(1.0036637038367755, 1.0, 0.5494450164593476, 1.0, 1.0) -> 4560.200497718858
(0.9963362961632245, 1.0, 0.5494450164593476, 1.0, 1.0) -> 4563.983565611905
(1.0, 1.002488943851375, 0.5494450164593476, 1.0, 1.0) -> 4564.255266810877
(1.0, 0.9975110561486249, 0.5494450164593476, 1.0, 1.0) -> 4559.929472257163
(1.0, 1.0, 0.5535016088329827, 1.0, 1.0) -> 4562.135477405995
(1.0, 1.0, 0.5453884240857125, 1.0, 1.0) -> 4562.091244256899
(1.0, 1.0, 0.5503309029222405, 1.0, 1.0) -> 4562.096515707349
(1.0, 1.0, 0.5485591299964547, 1.0, 1.0) -> 4562.086762188794
(1.0, 1.0, 0.5494450164593476, 1.0018776786774248, 1.0) -> 4560.479344337338
(1.0, 1.0, 0.5494450164593476, 0.9981223213225751, 1.0) -> 4563.703743308258
(1.0, 1.0, 0.5494450164593476, 1.0, 1.0026037414694076) -> 4563.519650839014
(1.0, 1.0, 0.5494450164593476, 1.0, 0.9973962585305924) -> 4560.659130785849
(1.0, 1.0, 0.5494450164593476, 1.0, 0.9973962585305924) -> 4560.659130785849
(1.0, 1.0, 0.5494450164593476, 1.0, 0.9869812926529622) -> 4554.910259878616
(1.0, 1.0, 0.5494450164593476, 1.0, 0.9609438779588867) -> 4540.376251333627
(1.0, 1.0, 0.5494450164593476, 1.0, 0.8828316338766602) -> 4495.413838360195
(1.0, 1.0, 0.5494450164593476, 1.0, 0.6484949016299804) -> 4351.069350631338
(1.0, 1.0, 0.5494450164593476, 1.0, -0.054515295110058704) -> nan
(nan, nan, nan, nan, nan) -> nan
(nan, nan, nan, nan, nan) -> nan
(nan, nan, nan, nan, nan) -> nan
(nan, nan, nan, nan, nan) -> nan
(nan, nan, nan, nan, nan) -> nan
(1.0030670068127157, 1.0, 0.5494450164593476, 1.0, 0.6484949016299804) -> 4349.650048427409
(0.9969329931872843, 1.0, 0.5494450164593476, 1.0, 0.6484949016299804) -> 4352.490416381839
(1.0, 1.001880059906998, 0.5494450164593476, 1.0, 0.6484949016299804) -> 4352.484827719281
(1.0, 0.9981199400930022, 0.5494450164593476, 1.0, 0.6484949016299804) -> 4349.65580502589
(1.0, 1.0, 0.5503101987837744, 1.0, 0.6484949016299804) -> 4351.120385981633
(1.0, 1.0, 0.5485798341349208, 1.0, 0.6484949016299804) -> 4351.019998685358
(1.0, 1.0, 0.5494450164593476, 1.0019196539019368, 0.6484949016299804) -> 4349.320380512302
(1.0, 1.0, 0.5494450164593476, 0.9980803460980631, 0.6484949016299804) -> 4352.82066610135
(1.0, 1.0, 0.5494450164593476, 1.0, 0.6509569048797456) -> 4352.6151864718095
(1.0, 1.0, 0.5494450164593476, 1.0, 0.6460328983802152) -> 4349.524230388504
(1.0, 1.0, 0.5494450164593476, 1.0, 0.6526872958388353) -> 4353.702053535972
(1.0, 1.0, 0.5494450164593476, 1.0, 0.6443025074211254) -> 4348.43872332317
(3.4698604520405016, -0.3768522769030831, 0.5236480900797368, 2.4324800341458888, -4.667074736776749) -> nan
(nan, nan, nan, nan, nan) -> nan
(nan, nan, nan, nan, nan) -> nan
(nan, nan, nan, nan, nan) -> nan
(nan, nan, nan, nan, nan) -> nan
(nan, nan, nan, nan, nan) -> nan
(nan, nan, nan, nan, nan) -> nan
(nan, nan, nan, nan, nan) -> nan
(nan, nan, nan, nan, nan) -> nan
(nan, nan, nan, nan, nan) -> nan
(nan, nan, nan, nan, nan) -> nan
(1.0, 1.0, 0.5494450164593476, 1.0, 0.6484949016299804) -> 4351.069350631338
(1.0030670068127157, 1.0, 0.5494450164593476, 1.0, 0.6484949016299804) -> 4349.650048427409
(0.9969329931872843, 1.0, 0.5494450164593476, 1.0, 0.6484949016299804) -> 4352.490416381839
(1.0, 1.001880059906998, 0.5494450164593476, 1.0, 0.6484949016299804) -> 4352.484827719281
(1.0, 0.9981199400930022, 0.5494450164593476, 1.0, 0.6484949016299804) -> 4349.65580502589
(1.0, 1.0, 0.5503101987837744, 1.0, 0.6484949016299804) -> 4351.120385981633
(1.0, 1.0, 0.5485798341349208, 1.0, 0.6484949016299804) -> 4351.019998685358
(1.0, 1.0, 0.5494450164593476, 1.0019196539019368, 0.6484949016299804) -> 4349.320380512302
(1.0, 1.0, 0.5494450164593476, 0.9980803460980631, 0.6484949016299804) -> 4352.82066610135
(1.0, 1.0, 0.5494450164593476, 1.0, 0.6526872958388353) -> 4353.702053535972
(1.0, 1.0, 0.5494450164593476, 1.0, 0.6443025074211254) -> 4348.43872332317
(1.0006134013625432, 1.0, 0.5494450164593476, 1.0, 0.6484949016299804) -> 4350.7853492625145
(0.9993865986374568, 1.0, 0.5494450164593476, 1.0, 0.6484949016299804) -> 4351.353422541959
(1.0, 1.0003760119813996, 0.5494450164593476, 1.0, 0.6484949016299804) -> 4351.352292332769
(1.0, 0.9996239880186004, 0.5494450164593476, 1.0, 0.6484949016299804) -> 4350.786486189247
(1.0, 1.0, 0.5496180529242329, 1.0, 0.6484949016299804) -> 4351.079423391176
(1.0, 1.0, 0.5492719799944623, 1.0, 0.6484949016299804) -> 4351.059345207472
(1.0, 1.0, 0.5494450164593476, 1.0003839307803875, 0.6484949016299804) -> 4350.719369189276
(1.0, 1.0, 0.5494450164593476, 0.9996160692196127, 0.6484949016299804) -> 4351.4194258873495
(1.0, 1.0, 0.5494450164593476, 1.0, 0.6493333804717514) -> 4351.595735282312
(1.0, 1.0, 0.5494450164593476, 1.0, 0.6476564227882093) -> 4350.543048968641
(1.0030670068127157, 1.001880059906998, 0.5494450164593476, 1.0, 0.6484949016299804) -> 4351.068348023784
(1.0030670068127157, 1.0, 0.5503101987837744, 1.0, 0.6484949016299804) -> 4349.700977099286
(1.0030670068127157, 1.0, 0.5494450164593476, 1.0019196539019368, 0.6484949016299804) -> 4347.90269969041
(1.0030670068127157, 1.0, 0.5494450164593476, 0.9999999999999999, 0.6526872958388353) -> 4352.281183475792
(1.0, 1.001880059906998, 0.5503101987837744, 0.9999999999999999, 0.6484949016299804) -> 4352.531912644887
(1.0, 1.001880059906998, 0.5494450164593476, 1.0019196539019368, 0.6484949016299804) -> 4350.734089393373
(1.0, 1.001880059906998, 0.5494450164593476, 0.9999999999999999, 0.6526872958388353) -> 4355.119901611518
(1.0, 1.0, 0.5503101987837744, 1.0019196539019368, 0.6484949016299804) -> 4349.371481813372
(1.0, 1.0, 0.5503101987837744, 0.9999999999999999, 0.6526872958388353) -> 4353.752422639777
(1.0, 1.0, 0.5494450164593476, 1.0019196539019368, 0.6526872958388353) -> 4351.954059911005
(49.66909317565592, -40.32568678306268, -11.343906753210103, -16.693789048054562, 45.42198533718604) -> nan
(nan, nan, nan, nan, nan) -> nan
(nan, nan, nan, nan, nan) -> nan
(nan, nan, nan, nan, nan) -> nan
(nan, nan, nan, nan, nan) -> nan
(nan, nan, nan, nan, nan) -> nan
(nan, nan, nan, nan, nan) -> nan
(nan, nan, nan, nan, nan) -> nan
(nan, nan, nan, nan, nan) -> nan
(nan, nan, nan, nan, nan) -> nan
(nan, nan, nan, nan, nan) -> nan

Out[15]:

FCN = 4351		Ncalls = 106 (117 total)
EDM = 5.04e+03 (Goal: 0.0001)		up = 0.5
Valid Min.	Valid Param.	Above EDM	Reached call limit
False	True	True	False
Hesse failed	Has cov.	Accurate	Pos. def.	Forced
False	True	False	False	True

	Name	Value	Hesse Error	Minos Error-	Minos Error+	Limit-	Limit+	Fixed
0	ns	1.0	0.5
1	mu	1.0	0.4
2	sigma	0.55	0.13
3	nb	1.0	0.2
4	lambd	0.6	0.5

The problem is that some computations during minimization returned nan. Migrad can recover from that some extend, but it shouldn't happen.

Here, it happens for several reasons:

sigma must be positive, but no limit is set
lambd must be positive, but no limit is set

This is why Minuit supports parameter limits. Adding limits does not change the parameter values at the minimum, but it will distort the parameter uncertainties if the fit converges with a parameter value at its limit.

Let's try again with limits for sigma and lambd.

In [16]:

cost.verbose = 0
m = Minuit(cost, ns=1, mu=1, sigma=1, nb=1, lambd=1,
           limit_lambd=(0, None),
           limit_sigma=(0, None))
m.migrad()

/Users/hdembinski/Extern/iminuit/src/iminuit/cost.py:12: RuntimeWarning: invalid value encountered in log
  return np.log(x + log_const)

Out[16]:

FCN = -4067		Ncalls = 1081 (1081 total)
EDM = 9.22 (Goal: 0.0001)		up = 0.5
Valid Min.	Valid Param.	Above EDM	Reached call limit
False	True	False	True
Hesse failed	Has cov.	Accurate	Pos. def.	Forced
False	True	False	True	False

	Name	Value	Hesse Error	Minos Error-	Minos Error+	Limit-	Limit+	Fixed
0	ns	1.77e3	0.12e3
1	mu	0.338	0.033
2	sigma	0.30	0.04			0
3	nb	-27.3e3	1.8e3
4	lambd	10.1e3	0.6e3			0

This looks better, more green, but still a failure. This time, Minuit gave up because it ran over its call limit. We didn't set a call limit, but Minuit uses a heuristic in this case to not iterate forever. In fact, that was not the only problem, another NaN result was encountered.

Let's try again with an increased the call limit.

In [17]:

m = Minuit(cost, ns=1, mu=1, sigma=1, nb=1, lambd=1,
           limit_lambd=(0, None),
           limit_sigma=(0, None))
m.migrad(ncall=1e6)

Out[17]:

FCN = -4079		Ncalls = 489 (1581 total)
EDM = 6.27e-06 (Goal: 0.0001)		up = 0.5
Valid Min.	Valid Param.	Above EDM	Reached call limit
True	True	False	False
Hesse failed	Has cov.	Accurate	Pos. def.	Forced
False	True	False	False	True

	Name	Value	Hesse Error	Minos Error-	Minos Error+	Limit-	Limit+	Fixed
0	ns	1.34e3	0.12e3
1	mu	0.373	0.010
2	sigma	0.247	0.017			0
3	nb	0.8e6	2.3e6
4	lambd	0.006e6	0.012e6			0

This time, Minuit converged to something, but it does not seem to be a minimum. The Hessian matrix is not positiv definite.

Let's plot this "solution".

In [18]:

def plot_model(xe, cdf): # helper function
    plt.plot(xe, np.append(np.diff(cdf), np.nan), drawstyle="steps-post")

plt.hist(x, bins=30, range=(0, 1));
plot_model(xe, total(xe, *m.values[:]))

The fit clearly looks bad. In fact, we only see the signal. The background has a huge amplitude, but also a huge lambda, making it effectively a delta peak at zero. That is not a plausible solution.

We need better starting values. Let's mask out the signal and fit only the background. The cost function has a mask attribute for that.

In [19]:

# signal window is roughly 0.3 to 0.7, let's mask that out
cx = 0.5 * (xe[1:] + xe[:-1]) # compute bin centers

cost.mask = (cx < 0.3) | (0.7 < cx)

cost.verbose = 0 # turn verbosity off again, this fit should not cause trouble

m = Minuit(cost, ns=1, mu=1, sigma=1, nb=1, lambd=1,
           limit_lambd=(0, None),
           limit_sigma=(0, None))

# fix signal parameters for next fit and set signal amplitude to zero
m.fixed[:3] = True
m.values["ns"] = 0

m.migrad()

Out[19]:

FCN = -1397		Ncalls = 159 (159 total)
EDM = 6.32e-07 (Goal: 0.0001)		up = 0.5
Valid Min.	Valid Param.	Above EDM	Reached call limit
True	True	False	False
Hesse failed	Has cov.	Accurate	Pos. def.	Forced
False	True	True	True	False

	Name	Value	Hesse Error	Minos Error-	Minos Error+	Limit-	Limit+	Fixed
0	ns	0.00	0.01					yes
1	mu	1.00	0.01					yes
2	sigma	1.00	0.01			0		yes
3	nb	1.01e3	0.06e3
4	lambd	0.55	0.04			0

In [20]:

plt.hist(x, bins=30, range=(0, 1));
plot_model(xe, total(xe, *m.values[:]))

Much better! We now unmask the signal, fix the background parameters and only fit the signal.

In [21]:

cost.mask = None
m.fixed[:] = False
m.fixed["nb"] = True
m.fixed["lambd"] = True
m.values["ns"] = 1
m.migrad()

Out[21]:

FCN = -4292		Ncalls = 227 (386 total)
EDM = 3.47e-06 (Goal: 0.0001)		up = 0.5
Valid Min.	Valid Param.	Above EDM	Reached call limit
True	True	False	False
Hesse failed	Has cov.	Accurate	Pos. def.	Forced
False	True	True	True	False

	Name	Value	Hesse Error	Minos Error-	Minos Error+	Limit-	Limit+	Fixed
0	ns	510	27
1	mu	0.4990	0.0028
2	sigma	0.0467	0.0028			0
3	nb	1.01e3	0.06e3					yes
4	lambd	0.55	0.04			0		yes

In [22]:

plt.hist(x, bins=30, range=(0, 1));
plot_model(xe, total(xe, *m.values[:]))

This looks great! To wrap things up, we let Minuit optimize everything together to get the final values and uncertainties.

In [23]:

m.fixed[:] = False
m.migrad()

Out[23]:

FCN = -4293		Ncalls = 97 (483 total)
EDM = 2.55e-06 (Goal: 0.0001)		up = 0.5
Valid Min.	Valid Param.	Above EDM	Reached call limit
True	True	False	False
Hesse failed	Has cov.	Accurate	Pos. def.	Forced
False	True	True	True	False

	Name	Value	Hesse Error	Minos Error-	Minos Error+	Limit-	Limit+	Fixed
0	ns	501	29
1	mu	0.4991	0.0028
2	sigma	0.0458	0.0029			0
3	nb	1.05e3	0.05e3
4	lambd	0.56	0.04			0

The values changed a bit and the uncertaintes of the signal parameters became larger. That is expected unless the parameters are independent.

In [24]:

plt.hist(x, bins=30, range=(0, 1));
plot_model(xe, total(xe, *m.values[:]))

🎉🎉🎉

Final note: set limits appropriately¶

We could have gotten the good result right away by setting appropriate limits on all parameters.

In [25]:

m = Minuit(cost, ns=1, mu=1, sigma=1, nb=1, lambd=1,
           limit_ns=(0, None),
           limit_nb=(0, None),  
           limit_mu=(0.4, 0.6),
           limit_lambd=(0, 2),
           limit_sigma=(0, 0.2))
m.migrad()

Out[25]:

FCN = -4293		Ncalls = 626 (626 total)
EDM = 3.46e-05 (Goal: 0.0001)		up = 0.5
Valid Min.	Valid Param.	Above EDM	Reached call limit
True	True	False	False
Hesse failed	Has cov.	Accurate	Pos. def.	Forced
False	True	True	True	False

	Name	Value	Hesse Error	Minos Error-	Minos Error+	Limit-	Limit+	Fixed
0	ns	501	29			0
1	mu	0.4991	0.0028			0.4	0.6
2	sigma	0.0457	0.0029			0	0.2
3	nb	1.05e3	0.05e3			0
4	lambd	0.56	0.04			0	2

In [26]:

plt.hist(x, bins=30, range=(0, 1));
plot_model(xe, total(xe, *m.values[:]))

As you can see, the result is the same with and without limits within the precision that matters.

Some sources (including the Minuit2 user guide) warn about setting limits carelessly. That advice is correct, but unless a parameter converges to a value very close to a limit boundary, the result with and without limits will be effectively the same. For the parameter value itself, this is a guaranteed proven property. Only the parameter uncertainty may be off.

Bonus: iminuit and resample¶

In [27]:

from resample import bootstrap

cost.n = np.histogram(x, bins=30, range=(0, 1))[0]
m = Minuit(cost, ns=1, mu=1, sigma=1, nb=1, lambd=1,
           limit_ns=(0, None),
           limit_nb=(0, None),  
           limit_mu=(0.4, 0.6),
           limit_lambd=(0, 2),
           limit_sigma=(0, 0.2))
m.migrad()
print(f"nfcn for original fit = {m.fmin.nfcn}")

errors = m.errors[:]
nfcn = m.fmin.nfcn

rng = np.random.default_rng(1)
b_value = []
b_nfcn = []
for i, b_sample in enumerate(bootstrap.resample(x, size=50,
                                                random_state=rng)):
    cost.n = np.histogram(b_sample, bins=30, range=(0, 1))[0]
    m.migrad()
    print(f"nfcn for fit of b_sample[{i}] = {m.fmin.nfcn}")
    assert m.valid
    b_value.append(m.values[:])
    b_nfcn.append(m.fmin.nfcn)

nfcn for original fit = 626
nfcn for fit of b_sample[0] = 93
nfcn for fit of b_sample[1] = 37
nfcn for fit of b_sample[2] = 35
nfcn for fit of b_sample[3] = 33
nfcn for fit of b_sample[4] = 34
nfcn for fit of b_sample[5] = 47
nfcn for fit of b_sample[6] = 46
nfcn for fit of b_sample[7] = 46
nfcn for fit of b_sample[8] = 46
nfcn for fit of b_sample[9] = 34
nfcn for fit of b_sample[10] = 47
nfcn for fit of b_sample[11] = 48
nfcn for fit of b_sample[12] = 46
nfcn for fit of b_sample[13] = 47
nfcn for fit of b_sample[14] = 109
nfcn for fit of b_sample[15] = 80
nfcn for fit of b_sample[16] = 78
nfcn for fit of b_sample[17] = 47
nfcn for fit of b_sample[18] = 46
nfcn for fit of b_sample[19] = 48
nfcn for fit of b_sample[20] = 78
nfcn for fit of b_sample[21] = 47
nfcn for fit of b_sample[22] = 80
nfcn for fit of b_sample[23] = 83
nfcn for fit of b_sample[24] = 45
nfcn for fit of b_sample[25] = 80
nfcn for fit of b_sample[26] = 79
nfcn for fit of b_sample[27] = 96
nfcn for fit of b_sample[28] = 93
nfcn for fit of b_sample[29] = 94
nfcn for fit of b_sample[30] = 48
nfcn for fit of b_sample[31] = 37
nfcn for fit of b_sample[32] = 78
nfcn for fit of b_sample[33] = 35
nfcn for fit of b_sample[34] = 47
nfcn for fit of b_sample[35] = 96
nfcn for fit of b_sample[36] = 46
nfcn for fit of b_sample[37] = 81
nfcn for fit of b_sample[38] = 47
nfcn for fit of b_sample[39] = 35
nfcn for fit of b_sample[40] = 81
nfcn for fit of b_sample[41] = 49
nfcn for fit of b_sample[42] = 34
nfcn for fit of b_sample[43] = 82
nfcn for fit of b_sample[44] = 48
nfcn for fit of b_sample[45] = 35
nfcn for fit of b_sample[46] = 50
nfcn for fit of b_sample[47] = 47
nfcn for fit of b_sample[48] = 95
nfcn for fit of b_sample[49] = 35

Migrad remembers previous fit result and converges quickly to nearby bootstrapped result.

In [28]:

plt.axvline(nfcn, color="C0", label="original")
plt.hist(b_nfcn, color="C1", label="bootstrapped")
plt.xlabel("number of likelihood evaluations")
plt.legend();

In [29]:

b_cov = np.cov(np.transpose(b_value))
b_errors = np.diag(b_cov) ** 0.5

height = 0.35
i = np.arange(5)
plt.barh(i - height/2, errors, height, label="Minuit")
plt.barh(i + height/2, b_errors, height, label="bootstrap")
plt.semilogx()
plt.yticks(i, m.parameters)
plt.xlabel("uncertainty estimate")
plt.legend();

Future plans¶

Replace Cython bindings with pybind11 bindings
- Cython is lagging behind C++ features
- iminuit requires lots of workarounds because of missing features in Cython
- pybind11 is not a separate code generator, just C++
Support unicode parameters
Support pickling of Minuit object

PS: Check out more great iminuit tutorials¶

https://nbviewer.jupyter.org/github/scikit-hep/iminuit/tree/master/tutorial/

How Hesse and Minos really work, which to use when, and is Minos really better? The answer may surprise you.
How to use the new cost functions
How to combine automatic differentation with JAX with iminuit
How to use numba to accelerate cost functions