Rf 6 1 2_Recover From Invalid Parameters

Likelihood and minimization: Recover from regions where the function is not defined.

We demonstrate improved recovery from disallowed parameters. For this, we use a polynomial PDF of the form \mathrm{Pol2} = \mathcal{N} \left( c + a_1 \cdot x + a_2 \cdot x^2 + 0.01 \cdot x^3 \right), where $ \mathcal{N} $ is a normalisation factor. Unless the parameters are chosen carefully, this function can be negative, and hence, it cannot be used as a PDF. In this case, RooFit passes an error to the minimiser, which might try to recover.

Before ROOT 6.24, RooFit always passed the highest function value that was encountered during the minimisation to the minimiser. If a parameter is far in a disallowed region, the minimiser has to blindly test various values of the parameters. It might find the correct values by chance, but often be unable to recover from bad starting values. Here, we use a model with such bad values.

Starting with ROOT 6.24, the minimiser receives more information. For example, when a PDF is negative, the magnitude of the "undershoot" is passed to the minimiser. The minimiser can use this to compute a gradient, which will eventually lead it out of the disallowed region. The steepness of this gradient can be chosen using RooFit::RecoverFromUndefinedRegions(double). A value of zero is equivalent to RooFit before ROOT 6.24. Positive values activate the recovery. Values between 1. and 10. were found to be a good default. If no argument is passed, RooFit uses 10.

Author: Stephan Hageboeck
This notebook tutorial was automatically generated with ROOTBOOK-izer from the macro found in the ROOT repository on Saturday, November 28, 2020 at 11:05 AM.

In [1]:
%%cpp -d
#include <RooRealVar.h>
#include <RooPolynomial.h>
#include <RooPlot.h>
#include <RooDataSet.h>
#include <RooGlobalFunc.h>
#include <RooFitResult.h>
#include <RooMsgService.h>

#include <TCanvas.h>
#include <TLegend.h>
In [2]:
// Create a fit model:
  // The polynomial is notoriously unstable, because it can quickly go negative.
  // Since PDFs need to be positive, one often ends up with an unstable fit model.
  RooRealVar x("x", "x", -15, 15);
  RooRealVar a1("a1", "a1", -0.5, -10., 20.);
  RooRealVar a2("a2", "a2", 0.2, -10., 20.);
  RooRealVar a3("a3", "a3", 0.01);
  RooPolynomial pdf("pol3", "c + a1 * x + a2 * x*x + 0.01 * x*x*x", x, RooArgSet(a1, a2, a3));

  // Create toy data with all-positive coefficients:
  std::unique_ptr<RooDataSet> data(pdf.generate(x, 10000));

  // For plotting.
  // We create pointers to the plotted objects. We want these objects to leak out of the function,
  // so we can still see them after it returns.
  TCanvas* c = new TCanvas();
  RooPlot* frame = x.frame();
  data->plotOn(frame, RooFit::Name("data"));

  // Plotting a PDF with disallowed parameters doesn't work. We would get a lot of error messages.
  // Therefore, we disable plotting messages in RooFit's message streams:

  // RooFit before ROOT 6.24
  // --------------------------------
  // Before 6.24, RooFit wasn't able to recover from invalid parameters. The minimiser just errs around
  // the starting values of the parameters without finding any improvement.

  // Set up the parameters such that the PDF would come out negative. The PDF is now undefined.

  // Perform a fit:
  RooFitResult* fitWithoutRecovery = pdf.fitTo(*data, RooFit::Save(),
   RooFit::RecoverFromUndefinedRegions(0.), // This is how RooFit behaved prior to ROOT 6.24
   RooFit::PrintEvalErrors(-1), // We are expecting a lot of evaluation errors. -1 switches off printing.

  pdf.plotOn(frame, RooFit::LineColor(kRed), RooFit::Name("noRecovery"));

  // RooFit since ROOT 6.24
  // --------------------------------
  // The minimiser gets information about the "badness" of the violation of the function definition. It uses this
  // to find its way out of the disallowed parameter regions.
  std::cout << "\n\n\n-------------- Starting second fit ---------------\n\n" << std::endl;

  // Reset the parameters such that the PDF is again undefined.

  // Fit again, but pass recovery information to the minimiser:
  RooFitResult* fitWithRecovery = pdf.fitTo(*data, RooFit::Save(),
   RooFit::RecoverFromUndefinedRegions(1.), // The magnitude of the recovery information can be chosen here.
                                             // Higher values mean more aggressive recovery.
   RooFit::PrintEvalErrors(-1), // We are still expecting a few evaluation errors.

  pdf.plotOn(frame, RooFit::LineColor(kBlue), RooFit::Name("recovery"));

  // Collect results and plot.
  // --------------------------------
  // We print the two fit results, and plot the fitted curves.
  // The curve of the fit without recovery cannot be plotted, because the PDF is undefined if a2 < 0.
  std::cout << "Without recovery, the fitter encountered " << fitWithoutRecovery->numInvalidNLL()
   << " invalid function values. The parameters are unchanged." << std::endl;

  std::cout << "With recovery, the fitter encountered " << fitWithRecovery->numInvalidNLL()
   << " invalid function values, but the parameters are fitted." << std::endl;

  TLegend* legend = new TLegend(0.5, 0.7, 0.9, 0.9);
  legend->AddEntry(frame->findObject("data"), "Data", "P");
  legend->AddEntry(frame->findObject("noRecovery"), "Without recovery (cannot be plotted)", "L");
  legend->AddEntry(frame->findObject("recovery"), "With recovery", "L");
RooFit v3.60 -- Developed by Wouter Verkerke and David Kirkby 
                Copyright (C) 2000-2013 NIKHEF, University of California & Stanford University
                All rights reserved, please read http://roofit.sourceforge.net/license.txt

[#1] INFO:Minization -- RooMinimizer::optimizeConst: activating const optimization
[#1] INFO:Minization -- RooMinimizer::optimizeConst: deactivating const optimization
[#0] ERROR:Eval -- RooAbsReal::logEvalError(pol3) evaluation error, 
 origin       : RooPolynomial::pol3[ x=x coefList=(a1,a2,a3) ]
 message      : p.d.f normalization integral is zero or negative: -2220.000000
 server values: x=x=0, coefList=(a1 = 2.60781 +/- 11.9002,a2 = -1 +/- 11.5683,a3 = 0.01)

-------------- Starting second fit ---------------

[#1] INFO:Minization -- RooMinimizer::optimizeConst: activating const optimization
 **   15 **SET NOGRAD
    NO.   NAME         VALUE      STEP SIZE      LIMITS
     1 a1           1.00000e+01  1.19002e+01   -1.00000e+01  2.00000e+01
     2 a2          -1.00000e+00  1.15683e+01   -1.00000e+01  2.00000e+01
 **   16 **SET ERR         0.5
 **   17 **SET PRINT           0
 **   18 **SET STR           1
 **   19 **MIGRAD        1000           1
 FCN=-858.564 FROM MIGRAD    STATUS=CONVERGED     243 CALLS         244 TOTAL
                     EDM=7.33131e-05    STRATEGY= 1      ERROR MATRIX ACCURATE 
  EXT PARAMETER                                   STEP         FIRST   
  NO.   NAME      VALUE            ERROR          SIZE      DERIVATIVE 
   1  a1          -4.98209e-01   2.27025e-02   2.77075e-05  -2.18163e+00
   2  a2           1.98271e-01   5.64128e-03   1.48249e-05  -2.54212e+01
                               ERR DEF= 0.5
 **   20 **SET ERR         0.5
 **   21 **SET PRINT           0
 **   22 **HESSE        1000
 FCN=-858.564 FROM HESSE     STATUS=OK             10 CALLS         254 TOTAL
                     EDM=7.3377e-05    STRATEGY= 1      ERROR MATRIX ACCURATE 
  EXT PARAMETER                                INTERNAL      INTERNAL  
  NO.   NAME      VALUE            ERROR       STEP SIZE       VALUE   
   1  a1          -4.98209e-01   2.27254e-02   8.14968e-06  -3.41822e+01
   2  a2           1.98271e-01   5.64697e-03   7.41245e-06   3.10901e+01
                               ERR DEF= 0.5
[#1] INFO:Minization -- RooMinimizer::optimizeConst: deactivating const optimization

  RooFitResult: minimized FCN value: 0, estimated distance to minimum: 0
                covariance matrix quality: Approximation only, not accurate
                Status : MINIMIZE=0 HESSE=0 

    Floating Parameter    FinalValue +/-  Error   
  --------------------  --------------------------
                    a1    2.6078e+00 +/-  1.19e+01
                    a2   -1.0000e+00 +/-  1.16e+01

Without recovery, the fitter encountered 66 invalid function values. The parameters are unchanged.

  RooFitResult: minimized FCN value: 29650.9, estimated distance to minimum: 7.3377e-05
                covariance matrix quality: Full, accurate covariance matrix
                Status : MINIMIZE=0 HESSE=0 

    Floating Parameter    FinalValue +/-  Error   
  --------------------  --------------------------
                    a1   -4.9821e-01 +/-  2.27e-02
                    a2    1.9827e-01 +/-  5.65e-03

With recovery, the fitter encountered 72 invalid function values, but the parameters are fitted.

Draw all canvases

In [3]:
%jsroot on