Organisation and simultaneous fits: using simultaneous pdfs to describe simultaneous fits to multiple datasets
Author: Wouter Verkerke
This notebook tutorial was automatically generated with ROOTBOOK-izer from the macro found in the ROOT repository on Tuesday, March 19, 2024 at 07:16 PM.
%%cpp -d
#include "RooRealVar.h"
#include "RooDataSet.h"
#include "RooGaussian.h"
#include "RooChebychev.h"
#include "RooAddPdf.h"
#include "RooSimultaneous.h"
#include "RooCategory.h"
#include "TCanvas.h"
#include "TAxis.h"
#include "RooPlot.h"
using namespace RooFit;
Create observables
RooRealVar x("x", "x", -8, 8);
Construct signal pdf
RooRealVar mean("mean", "mean", 0, -8, 8);
RooRealVar sigma("sigma", "sigma", 0.3, 0.1, 10);
RooGaussian gx("gx", "gx", x, mean, sigma);
Construct background pdf
RooRealVar a0("a0", "a0", -0.1, -1, 1);
RooRealVar a1("a1", "a1", 0.004, -1, 1);
RooChebychev px("px", "px", x, RooArgSet(a0, a1));
Construct composite pdf
RooRealVar f("f", "f", 0.2, 0., 1.);
RooAddPdf model("model", "model", RooArgList(gx, px), f);
Construct signal pdf. NOTE that sigma is shared with the signal sample model
RooRealVar mean_ctl("mean_ctl", "mean_ctl", -3, -8, 8);
RooGaussian gx_ctl("gx_ctl", "gx_ctl", x, mean_ctl, sigma);
Construct the background pdf
RooRealVar a0_ctl("a0_ctl", "a0_ctl", -0.1, -1, 1);
RooRealVar a1_ctl("a1_ctl", "a1_ctl", 0.5, -0.1, 1);
RooChebychev px_ctl("px_ctl", "px_ctl", x, RooArgSet(a0_ctl, a1_ctl));
Construct the composite model
RooRealVar f_ctl("f_ctl", "f_ctl", 0.5, 0., 1.);
RooAddPdf model_ctl("model_ctl", "model_ctl", RooArgList(gx_ctl, px_ctl), f_ctl);
Generate 1000 events in x and y from model
std::unique_ptr<RooDataSet> data{model.generate({x}, 1000)};
std::unique_ptr<RooDataSet> data_ctl{model_ctl.generate({x}, 2000)};
input_line_55:2:2: warning: 'data' shadows a declaration with the same name in the 'std' namespace; use '::data' to reference this declaration std::unique_ptr<RooDataSet> data{model.generate({x}, 1000)}; ^
Define category to distinguish physics and control samples events
RooCategory sample("sample", "sample");
sample.defineType("physics");
sample.defineType("control");
input_line_56:2:2: warning: 'sample' shadows a declaration with the same name in the 'std' namespace; use '::sample' to reference this declaration RooCategory sample("sample", "sample"); ^
Construct combined dataset in (x,sample)
RooDataSet combData("combData", "combined data", x, Index(sample),
Import({{"physics", data.get()}, {"control", data_ctl.get()}}));
input_line_57:2:60: error: reference to 'sample' is ambiguous RooDataSet combData("combData", "combined data", x, Index(sample), ^ input_line_56:2:14: note: candidate found by name lookup is 'sample' RooCategory sample("sample", "sample"); ^ /usr/include/c++/9/bits/stl_algo.h:5855:5: note: candidate found by name lookup is 'std::sample' sample(_PopulationIterator __first, _PopulationIterator __last, ^ input_line_57:3:41: error: reference to 'data' is ambiguous Import({{"physics", data.get()}, {"control", data_ctl.get()}})); ^ input_line_55:2:30: note: candidate found by name lookup is 'data' std::unique_ptr<RooDataSet> data{model.generate({x}, 1000)}; ^ /usr/include/c++/9/bits/range_access.h:318:5: note: candidate found by name lookup is 'std::data' data(initializer_list<_Tp> __il) noexcept ^ /usr/include/c++/9/bits/range_access.h:289:5: note: candidate found by name lookup is 'std::data' data(_Container& __cont) noexcept(noexcept(__cont.data())) ^ /usr/include/c++/9/bits/range_access.h:299:5: note: candidate found by name lookup is 'std::data' data(const _Container& __cont) noexcept(noexcept(__cont.data())) ^ /usr/include/c++/9/bits/range_access.h:309:5: note: candidate found by name lookup is 'std::data' data(_Tp (&__array)[_Nm]) noexcept ^
Construct a simultaneous pdf using category sample as index: associate model with the physics state and model_ctl with the control state
RooSimultaneous simPdf("simPdf", "simultaneous pdf", {{"physics", &model}, {"control", &model_ctl}}, sample);
input_line_58:2:103: error: reference to 'sample' is ambiguous RooSimultaneous simPdf("simPdf", "simultaneous pdf", {{"physics", &model}, {"control", &model_ctl}}, sample); ^ input_line_56:2:14: note: candidate found by name lookup is 'sample' RooCategory sample("sample", "sample"); ^ /usr/include/c++/9/bits/stl_algo.h:5855:5: note: candidate found by name lookup is 'std::sample' sample(_PopulationIterator __first, _PopulationIterator __last, ^
Perform simultaneous fit of model to data and model_ctl to data_ctl
std::unique_ptr<RooFitResult> fitResult{simPdf.fitTo(combData, PrintLevel(-1), Save(), PrintLevel(-1))};
fitResult->Print();
input_line_59:2:65: error: cannot initialize an array element of type 'void *' with an rvalue of type 'RooCmdArg (*)(Int_t)' (aka 'RooCmdArg (*)(int)') std::unique_ptr<RooFitResult> fitResult{simPdf.fitTo(combData, PrintLevel(-1), Save(), PrintLevel(-1))}; ^~~~~~~~~~ input_line_59:2:81: error: cannot initialize an array element of type 'void *' with an rvalue of type 'RooCmdArg (*)(bool)' std::unique_ptr<RooFitResult> fitResult{simPdf.fitTo(combData, PrintLevel(-1), Save(), PrintLevel(-1))}; ^~~~ input_line_59:2:89: error: cannot initialize an array element of type 'void *' with an rvalue of type 'RooCmdArg (*)(Int_t)' (aka 'RooCmdArg (*)(int)') std::unique_ptr<RooFitResult> fitResult{simPdf.fitTo(combData, PrintLevel(-1), Save(), PrintLevel(-1))}; ^~~~~~~~~~
Make a frame for the physics sample
RooPlot *frame1 = x.frame(Title("Physics sample"));
Plot all data tagged as physics sample
combData.plotOn(frame1, Cut("sample==sample::physics"));
input_line_61:2:26: error: cannot initialize an array element of type 'void *' with an rvalue of type 'RooCmdArg (*)(const char *)' combData.plotOn(frame1, Cut("sample==sample::physics")); ^~~
Plot "physics" slice of simultaneous pdf. NBL You must project the sample index category with data using ProjWData as a RooSimultaneous makes no prediction on the shape in the index category and can thus not be integrated. In other words: Since the PDF doesn't know the number of events in the different category states, it doesn't know how much of each component it has to project out. This information is read from the data.
simPdf.plotOn(frame1, Slice(sample, "physics"), ProjWData(sample, combData));
simPdf.plotOn(frame1, Slice(sample, "physics"), Components("px"), ProjWData(sample, combData), LineStyle(kDashed));
input_line_62:2:30: error: reference to 'sample' is ambiguous simPdf.plotOn(frame1, Slice(sample, "physics"), ProjWData(sample, combData)); ^ input_line_56:2:14: note: candidate found by name lookup is 'sample' RooCategory sample("sample", "sample"); ^ /usr/include/c++/9/bits/stl_algo.h:5855:5: note: candidate found by name lookup is 'std::sample' sample(_PopulationIterator __first, _PopulationIterator __last, ^ input_line_62:2:60: error: reference to 'sample' is ambiguous simPdf.plotOn(frame1, Slice(sample, "physics"), ProjWData(sample, combData)); ^ input_line_56:2:14: note: candidate found by name lookup is 'sample' RooCategory sample("sample", "sample"); ^ /usr/include/c++/9/bits/stl_algo.h:5855:5: note: candidate found by name lookup is 'std::sample' sample(_PopulationIterator __first, _PopulationIterator __last, ^ input_line_62:2:68: error: use of undeclared identifier 'combData' simPdf.plotOn(frame1, Slice(sample, "physics"), ProjWData(sample, combData)); ^ input_line_62:3:1: error: use of undeclared identifier 'simPdf' simPdf.plotOn(frame1, Slice(sample, "physics"), Components("px"), ProjWData(sample, combData), LineStyle(kDashed)); ^ input_line_62:3:29: error: reference to 'sample' is ambiguous simPdf.plotOn(frame1, Slice(sample, "physics"), Components("px"), ProjWData(sample, combData), LineStyle(kDashed)); ^ input_line_56:2:14: note: candidate found by name lookup is 'sample' RooCategory sample("sample", "sample"); ^ /usr/include/c++/9/bits/stl_algo.h:5855:5: note: candidate found by name lookup is 'std::sample' sample(_PopulationIterator __first, _PopulationIterator __last, ^ input_line_62:3:77: error: reference to 'sample' is ambiguous simPdf.plotOn(frame1, Slice(sample, "physics"), Components("px"), ProjWData(sample, combData), LineStyle(kDashed)); ^ input_line_56:2:14: note: candidate found by name lookup is 'sample' RooCategory sample("sample", "sample"); ^ /usr/include/c++/9/bits/stl_algo.h:5855:5: note: candidate found by name lookup is 'std::sample' sample(_PopulationIterator __first, _PopulationIterator __last, ^ input_line_62:3:85: error: use of undeclared identifier 'combData' simPdf.plotOn(frame1, Slice(sample, "physics"), Components("px"), ProjWData(sample, combData), LineStyle(kDashed)); ^
The same plot for the control sample slice. We do this with a different approach this time, for illustration purposes. Here, we are slicing the dataset and then use the data slice for the projection, because then the RooFit::Slice() becomes unnecessary. This approach is more general, because you can plot sums of slices by using logical or in the Cut() command.
RooPlot *frame2 = x.frame(Bins(30), Title("Control sample"));
std::unique_ptr<RooAbsData> slicedData{combData.reduce(Cut("sample==sample::control"))};
slicedData->plotOn(frame2);
simPdf.plotOn(frame2, ProjWData(sample, *slicedData));
simPdf.plotOn(frame2, Components("px_ctl"), ProjWData(sample, *slicedData), LineStyle(kDashed));
input_line_63:5:33: error: reference to 'sample' is ambiguous simPdf.plotOn(frame2, ProjWData(sample, *slicedData)); ^ input_line_56:2:14: note: candidate found by name lookup is 'sample' RooCategory sample("sample", "sample"); ^ /usr/include/c++/9/bits/stl_algo.h:5855:5: note: candidate found by name lookup is 'std::sample' sample(_PopulationIterator __first, _PopulationIterator __last, ^ input_line_63:6:1: error: use of undeclared identifier 'simPdf' simPdf.plotOn(frame2, Components("px_ctl"), ProjWData(sample, *slicedData), LineStyle(kDashed)); ^ input_line_63:6:55: error: reference to 'sample' is ambiguous simPdf.plotOn(frame2, Components("px_ctl"), ProjWData(sample, *slicedData), LineStyle(kDashed)); ^ input_line_56:2:14: note: candidate found by name lookup is 'sample' RooCategory sample("sample", "sample"); ^ /usr/include/c++/9/bits/stl_algo.h:5855:5: note: candidate found by name lookup is 'std::sample' sample(_PopulationIterator __first, _PopulationIterator __last, ^ input_line_63:3:56: error: cannot initialize an array element of type 'void *' with an rvalue of type 'RooCmdArg (*)(const char *)' std::unique_ptr<RooAbsData> slicedData{combData.reduce(Cut("sample==sample::control"))}; ^~~
The same plot for all the phase space. Here, we can just use the original combined dataset.
RooPlot *frame3 = x.frame(Title("Both samples"));
combData.plotOn(frame3);
simPdf.plotOn(frame3, ProjWData(sample, combData));
simPdf.plotOn(frame3, Components("px,px_ctl"), ProjWData(sample, combData),
LineStyle(kDashed));
TCanvas *c = new TCanvas("rf501_simultaneouspdf", "rf403_simultaneouspdf", 1200, 400);
c->Divide(3);
auto draw = [&](int i, RooPlot & frame) {
c->cd(i);
gPad->SetLeftMargin(0.15);
frame.GetYaxis()->SetTitleOffset(1.4);
frame.Draw();
};
draw(1, *frame1);
draw(2, *frame2);
draw(3, *frame3);
input_line_64:4:33: error: reference to 'sample' is ambiguous simPdf.plotOn(frame3, ProjWData(sample, combData)); ^ input_line_56:2:14: note: candidate found by name lookup is 'sample' RooCategory sample("sample", "sample"); ^ /usr/include/c++/9/bits/stl_algo.h:5855:5: note: candidate found by name lookup is 'std::sample' sample(_PopulationIterator __first, _PopulationIterator __last, ^ input_line_64:4:41: error: use of undeclared identifier 'combData' simPdf.plotOn(frame3, ProjWData(sample, combData)); ^ input_line_64:5:1: error: use of undeclared identifier 'simPdf' simPdf.plotOn(frame3, Components("px,px_ctl"), ProjWData(sample, combData), ^ input_line_64:5:58: error: reference to 'sample' is ambiguous simPdf.plotOn(frame3, Components("px,px_ctl"), ProjWData(sample, combData), ^ input_line_56:2:14: note: candidate found by name lookup is 'sample' RooCategory sample("sample", "sample"); ^ /usr/include/c++/9/bits/stl_algo.h:5855:5: note: candidate found by name lookup is 'std::sample' sample(_PopulationIterator __first, _PopulationIterator __last, ^ input_line_64:5:66: error: use of undeclared identifier 'combData' simPdf.plotOn(frame3, Components("px,px_ctl"), ProjWData(sample, combData), ^ input_line_64:17:10: error: use of undeclared identifier 'frame2' draw(2, *frame2); ^
Draw all canvases
%jsroot on
gROOT->GetListOfCanvases()->Draw()