TMVA Classification Category

This macro provides examples for the training and testing of the TMVA classifiers in categorisation mode.

  • Project : TMVA - a Root-integrated toolkit for multivariate data analysis
  • Package : TMVA
  • Root Macro: TMVAClassificationCategory

The input data is a toy-MC sample consisting of four Gaussian-distributed and linearly correlated input variables with category (eta) dependent properties.

For this example, only Fisher and Likelihood are used. Run via:

root -l TMVAClassificationCategory.C

The output file "TMVA.root" can be analysed with dedicated macros (run: root -l <macro.C>), which can be conveniently invoked through a GUI that appears when this macro finishes.

Author: Andreas Hoecker
This notebook tutorial was automatically generated with ROOTBOOK-izer from the macro found in the ROOT repository on Thursday, June 17, 2021 at 06:03 PM.

In [1]:
%%cpp -d
#include <cstdlib>
#include <iostream>
#include <map>
#include <string>

#include "TChain.h"
#include "TFile.h"
#include "TTree.h"
#include "TString.h"
#include "TObjString.h"
#include "TSystem.h"
#include "TROOT.h"

#include "TMVA/MethodCategory.h"
#include "TMVA/Factory.h"
#include "TMVA/DataLoader.h"
#include "TMVA/Tools.h"
#include "TMVA/TMVAGui.h"

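Two types of category methods are implemented. The flag below selects between them: with UseOffsetMethod set, both categories use all four input variables; otherwise the second category uses only var1, var2 and var3.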

In [2]:
Bool_t UseOffsetMethod = kTRUE;

Example for usage of different event categories with classifiers

In [3]:
std::cout << std::endl << "==> Start TMVAClassificationCategory" << std::endl;
==> Start TMVAClassificationCategory

This loads the TMVA library

In [4]:
TMVA::Tools::Instance();

bool batchMode = false;

Create a new root output file.

In [5]:
TString outfileName( "TMVA.root" );
TFile* outputFile = TFile::Open( outfileName, "RECREATE" );

Create the factory object (see TMVAClassification.C for more information)

In [6]:
std::string factoryOptions( "!V:!Silent:Transformations=I;D;P;G,D" );
if (batchMode) factoryOptions += ":!Color:!DrawProgressBar";

TMVA::Factory *factory = new TMVA::Factory( "TMVAClassificationCategory", outputFile, factoryOptions );
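
The option strings passed to the factory and to the booked methods follow a visible convention: tokens are separated by colons, a bare name switches a flag on, a leading "!" switches it off, and "Key=Value" sets a value. The sketch below illustrates that convention in plain C++. The function name parseOptions is hypothetical and this is not TMVA's own option parser.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper, not part of TMVA: splits a colon-separated option
// string into key/value pairs. A bare "Name" maps to "True", "!Name" maps
// to "False", and "Key=Value" keeps the value text unchanged.
std::map<std::string, std::string> parseOptions(const std::string& options) {
    std::map<std::string, std::string> parsed;
    std::istringstream stream(options);
    std::string token;
    while (std::getline(stream, token, ':')) {
        if (token.empty()) continue;  // tolerate "::" or a trailing colon
        const std::size_t eq = token.find('=');
        if (eq != std::string::npos) {
            parsed[token.substr(0, eq)] = token.substr(eq + 1);
        } else if (token[0] == '!') {
            parsed[token.substr(1)] = "False";
        } else {
            parsed[token] = "True";
        }
    }
    return parsed;
}
```

Applied to "!V:!Silent:Transformations=I;D;P;G,D", this yields three entries: V and Silent set to "False", and Transformations set to "I;D;P;G,D".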

Create dataloader

In [7]:
TMVA::DataLoader *dataloader=new TMVA::DataLoader("dataset");

Define the input variables used for the MVA training

In [8]:
dataloader->AddVariable( "var1", 'F' );
dataloader->AddVariable( "var2", 'F' );
dataloader->AddVariable( "var3", 'F' );
dataloader->AddVariable( "var4", 'F' );

You can add so-called "spectator variables", which are not used in the MVA training but will appear in the final "TestTree" produced by TMVA. This TestTree will contain the input variables, the response values of all trained MVAs, and the spectator variables.

In [9]:
dataloader->AddSpectator( "eta" );

Load the signal and background event samples from root trees

In [10]:
TFile *input(0);
TString fname = gSystem->GetDirName(__FILE__) + "/data/";
if (gSystem->AccessPathName( fname + "toy_sigbkg_categ_offset.root")) {
   // if directory data not found try using tutorials dir
   fname = gROOT->GetTutorialDir() + "/tmva/data/";
}
if (UseOffsetMethod) fname += "toy_sigbkg_categ_offset.root";
else                 fname += "toy_sigbkg_categ_varoff.root";
if (!gSystem->AccessPathName( fname )) {
   // the data file exists at the resolved path, so open it
   std::cout << "--- TMVAClassificationCategory: Accessing " << fname << std::endl;
   input = TFile::Open( fname );
}

if (!input) {
   std::cout << "ERROR: could not open data file: " << fname << std::endl;
   exit(1);
}

TTree *signalTree     = (TTree*)input->Get("TreeS");
TTree *background = (TTree*)input->Get("TreeB");
--- TMVAClassificationCategory: Accessing /home/sftnight/build/workspace/root-makedoc-master/rootspi/rdoc/src/master.build/tutorials/tmva/data/toy_sigbkg_categ_offset.root

Global event weights per tree (see below for setting event-wise weights)

In [11]:
Double_t signalWeight     = 1.0;
Double_t backgroundWeight = 1.0;

You can add an arbitrary number of signal or background trees

In [12]:
dataloader->AddSignalTree    ( signalTree,     signalWeight     );
dataloader->AddBackgroundTree( background, backgroundWeight );
<HEADER> DataSetInfo              : [dataset] : Added class "Signal"
                         : Add Tree TreeS of type Signal with 10000 events
<HEADER> DataSetInfo              : [dataset] : Added class "Background"
                         : Add Tree TreeB of type Background with 10000 events

Apply additional cuts on the signal and background samples (can be different)

In [13]:
TCut mycuts = ""; // for example: TCut mycuts = "abs(var1)<0.5 && abs(var2-0.5)<1";
TCut mycutb = ""; // for example: TCut mycutb = "abs(var1)<0.5";

Tell the dataloader how to use the training and testing events

In [14]:
dataloader->PrepareTrainingAndTestTree( mycuts, mycutb,
                                     "nTrain_Signal=0:nTrain_Background=0:SplitMode=Random:NormMode=NumEvents:!V" );

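With nTrain_Signal=0 and nTrain_Background=0 no explicit sample sizes are requested, and the training log shows each class being divided into equally sized training and testing samples (for example 5123 selected signal events become 2561 + 2561). The sketch below reproduces that arithmetic with a seeded shuffle, in the spirit of SplitMode=Random. The function randomHalfSplit is hypothetical, and dropping the odd event by integer division is an assumption that merely matches the logged numbers. It is not TMVA's DataSetFactory code.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Hypothetical illustration, not TMVA code: shuffles event indices and
// returns (training indices, testing indices) of equal size. When the
// number of events is odd, one event is left unused (assumption).
std::pair<std::vector<std::size_t>, std::vector<std::size_t>>
randomHalfSplit(std::size_t nEvents, unsigned seed) {
    std::vector<std::size_t> indices(nEvents);
    std::iota(indices.begin(), indices.end(), std::size_t{0});
    std::mt19937 generator(seed);
    std::shuffle(indices.begin(), indices.end(), generator);

    const std::size_t half = nEvents / 2;  // integer division drops a remainder
    std::vector<std::size_t> training(indices.begin(), indices.begin() + half);
    std::vector<std::size_t> testing(indices.begin() + half,
                                     indices.begin() + 2 * half);
    return {training, testing};
}
```

For 10000 events this gives 5000 training and 5000 testing indices. For 5123 events it gives 2561 of each.
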
Book MVA methods

Fisher discriminant

In [15]:
factory->BookMethod( dataloader, TMVA::Types::kFisher, "Fisher", "!H:!V:Fisher" );
<HEADER> Factory                  : Booking method: Fisher
                         : 

Likelihood

In [16]:
factory->BookMethod( dataloader, TMVA::Types::kLikelihood, "Likelihood",
                     "!H:!V:TransformOutput:PDFInterpol=Spline2:NSmoothSig[0]=20:NSmoothBkg[0]=20:NSmoothBkg[1]=10:NSmooth=1:NAvEvtPerBin=50" );
<HEADER> Factory                  : Booking method: Likelihood
                         : 
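
The projective likelihood method multiplies one-dimensional signal and background PDFs of the input variables and forms the ratio L_S / (L_S + L_B). The sketch below illustrates only that ratio. The Gaussian PDFs are a stand-in assumption (TMVA builds its PDFs from smoothed reference histograms, as the training log indicates), the names GaussianPdf and likelihoodRatio are hypothetical, and the TransformOutput option is not modelled.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Stand-in for a per-variable reference PDF (assumption: Gaussian shape).
struct GaussianPdf {
    double mean;
    double sigma;
    double operator()(double x) const {
        const double pi = 3.14159265358979323846;
        const double z = (x - mean) / sigma;
        return std::exp(-0.5 * z * z) / (sigma * std::sqrt(2.0 * pi));
    }
};

// Hypothetical helper, not TMVA code: product of per-variable PDFs for each
// class, combined into L_S / (L_S + L_B). Both PDF vectors are assumed to
// hold at least x.size() entries.
double likelihoodRatio(const std::vector<GaussianPdf>& signalPdfs,
                       const std::vector<GaussianPdf>& backgroundPdfs,
                       const std::vector<double>& x) {
    double signal = 1.0;
    double background = 1.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        signal *= signalPdfs[i](x[i]);
        background *= backgroundPdfs[i](x[i]);
    }
    return signal / (signal + background);
}
```

When the signal and background PDFs are identical the ratio is 0.5 for every event. The further the PDFs are apart, the closer signal-like events move to 1 and background-like events to 0.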

Categorised classifier

In [17]:
TMVA::MethodCategory* mcat = 0;

The variable sets

In [18]:
TString theCat1Vars = "var1:var2:var3:var4";
TString theCat2Vars = (UseOffsetMethod ? "var1:var2:var3:var4" : "var1:var2:var3");

Fisher with categories

In [19]:
TMVA::MethodBase* fiCat = factory->BookMethod( dataloader, TMVA::Types::kCategory, "FisherCat","" );
mcat = dynamic_cast<TMVA::MethodCategory*>(fiCat);
mcat->AddMethod( "abs(eta)<=1.3", theCat1Vars, TMVA::Types::kFisher, "Category_Fisher_1","!H:!V:Fisher" );
mcat->AddMethod( "abs(eta)>1.3",  theCat2Vars, TMVA::Types::kFisher, "Category_Fisher_2","!H:!V:Fisher" );
<HEADER> Factory                  : Booking method: FisherCat
                         : 
                         : Adding sub-classifier: Fisher::Category_Fisher_1
<HEADER> DataSetInfo              : [Category_Fisher_1_dsi] : Added class "Signal"
<HEADER> DataSetInfo              : [Category_Fisher_1_dsi] : Added class "Background"
                         : Adding sub-classifier: Fisher::Category_Fisher_2
<HEADER> DataSetInfo              : [Category_Fisher_2_dsi] : Added class "Signal"
<HEADER> DataSetInfo              : [Category_Fisher_2_dsi] : Added class "Background"
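
The two AddMethod calls attach one Fisher sub-classifier to each eta region, so that every event is evaluated by the sub-classifier whose cut it satisfies. The sketch below illustrates that dispatch in plain C++. The names LinearClassifier and categorisedResponse are hypothetical, the cut value 1.3 is taken from the cuts above, and this is not the MethodCategory implementation.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a trained Fisher sub-classifier.
struct LinearClassifier {
    std::vector<double> coefficients;
    double offset;

    double evaluate(const std::vector<double>& x) const {
        const std::size_t n =
            coefficients.size() < x.size() ? coefficients.size() : x.size();
        double response = offset;
        for (std::size_t i = 0; i < n; ++i) response += coefficients[i] * x[i];
        return response;
    }
};

// Chooses the sub-classifier from the spectator eta, mirroring the cuts
// "abs(eta)<=1.3" and "abs(eta)>1.3", then evaluates it on the inputs.
double categorisedResponse(double eta, const std::vector<double>& x,
                           const LinearClassifier& central,
                           const LinearClassifier& forward) {
    const LinearClassifier& chosen =
        (std::fabs(eta) <= 1.3) ? central : forward;
    return chosen.evaluate(x);
}
```

Using the coefficients printed in the training log of this run, Category_Fisher_1 has offset +0.648 and Category_Fisher_2 has offset -0.751, while the variable coefficients are nearly identical. An event at the origin therefore evaluates to +0.648 for abs(eta)<=1.3 and to -0.751 for abs(eta)>1.3, which shows how the categories absorb the eta-dependent offset.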

Likelihood with categories

In [20]:
TMVA::MethodBase* liCat = factory->BookMethod( dataloader, TMVA::Types::kCategory, "LikelihoodCat","" );
mcat = dynamic_cast<TMVA::MethodCategory*>(liCat);
mcat->AddMethod( "abs(eta)<=1.3",theCat1Vars, TMVA::Types::kLikelihood,
                 "Category_Likelihood_1","!H:!V:TransformOutput:PDFInterpol=Spline2:NSmoothSig[0]=20:NSmoothBkg[0]=20:NSmoothBkg[1]=10:NSmooth=1:NAvEvtPerBin=50" );
mcat->AddMethod( "abs(eta)>1.3", theCat2Vars, TMVA::Types::kLikelihood,
                 "Category_Likelihood_2","!H:!V:TransformOutput:PDFInterpol=Spline2:NSmoothSig[0]=20:NSmoothBkg[0]=20:NSmoothBkg[1]=10:NSmooth=1:NAvEvtPerBin=50" );
<HEADER> Factory                  : Booking method: LikelihoodCat
                         : 
                         : Adding sub-classifier: Likelihood::Category_Likelihood_1
<HEADER> DataSetInfo              : [Category_Likelihood_1_dsi] : Added class "Signal"
<HEADER> DataSetInfo              : [Category_Likelihood_1_dsi] : Added class "Background"
                         : Adding sub-classifier: Likelihood::Category_Likelihood_2
<HEADER> DataSetInfo              : [Category_Likelihood_2_dsi] : Added class "Signal"
<HEADER> DataSetInfo              : [Category_Likelihood_2_dsi] : Added class "Background"

Now you can tell the factory to train, test, and evaluate the MVAs

Train MVAs using the set of training events

In [21]:
factory->TrainAllMethods();
<HEADER> Factory                  : Train all methods
                         : Rebuilding Dataset dataset
                         : Building event vectors for type 2 Signal
                         : Dataset[dataset] :  create input formulas for tree TreeS
                         : Building event vectors for type 2 Background
                         : Dataset[dataset] :  create input formulas for tree TreeB
<HEADER> DataSetFactory           : [dataset] : Number of events in input trees
                         : 
                         : 
                         : Number of training and testing events
                         : ---------------------------------------------------------------------------
                         : Signal     -- training events            : 5000
                         : Signal     -- testing events             : 5000
                         : Signal     -- training and testing events: 10000
                         : Background -- training events            : 5000
                         : Background -- testing events             : 5000
                         : Background -- training and testing events: 10000
                         : 
<HEADER> DataSetInfo              : Correlation matrix (Signal):
                         : ----------------------------------------
                         :             var1    var2    var3    var4
                         :    var1:  +1.000  +0.368  +0.378  +0.391
                         :    var2:  +0.368  +1.000  +0.388  +0.386
                         :    var3:  +0.378  +0.388  +1.000  +0.389
                         :    var4:  +0.391  +0.386  +0.389  +1.000
                         : ----------------------------------------
<HEADER> DataSetInfo              : Correlation matrix (Background):
                         : ----------------------------------------
                         :             var1    var2    var3    var4
                         :    var1:  +1.000  +0.365  +0.376  +0.381
                         :    var2:  +0.365  +1.000  +0.382  +0.387
                         :    var3:  +0.376  +0.382  +1.000  +0.376
                         :    var4:  +0.381  +0.387  +0.376  +1.000
                         : ----------------------------------------
<HEADER> DataSetFactory           : [dataset] :  
                         : 
<HEADER> Factory                  : Train method: Fisher for Classification
                         : 
<HEADER> Fisher                   : Results for Fisher coefficients:
                         : -----------------------
                         : Variable:  Coefficient:
                         : -----------------------
                         :     var1:       -0.053
                         :     var2:       -0.014
                         :     var3:       +0.096
                         :     var4:       +0.216
                         : (offset):       -0.023
                         : -----------------------
                         : Elapsed time for training with 10000 events: 0.00339 sec         
<HEADER> Fisher                   : [dataset] : Evaluation of Fisher on training sample (10000 events)
                         : Elapsed time for evaluation of 10000 events: 0.00106 sec       
                         : Creating xml weight file: dataset/weights/TMVAClassificationCategory_Fisher.weights.xml
                         : Creating standalone class: dataset/weights/TMVAClassificationCategory_Fisher.class.C
<HEADER> Factory                  : Training finished
                         : 
<HEADER> Factory                  : Train method: Likelihood for Classification
                         : 
                         : Filling reference histograms
                         : Building PDF out of reference histograms
                         : Elapsed time for training with 10000 events: 0.0516 sec         
<HEADER> Likelihood               : [dataset] : Evaluation of Likelihood on training sample (10000 events)
                         : Elapsed time for evaluation of 10000 events: 0.00842 sec       
                         : Creating xml weight file: dataset/weights/TMVAClassificationCategory_Likelihood.weights.xml
                         : Creating standalone class: dataset/weights/TMVAClassificationCategory_Likelihood.class.C
                         : TMVA.root:/dataset/Method_Likelihood/Likelihood
<HEADER> Factory                  : Training finished
                         : 
<HEADER> Factory                  : Train method: FisherCat for Classification
                         : 
                         : Train all sub-classifiers for Classification ...
                         : Rebuilding Dataset Category_Fisher_1_dsi
                         : Building event vectors for type 2 Signal
                         : Dataset[Category_Fisher_1_dsi] :  create input formulas for tree TreeS
                         : Building event vectors for type 2 Background
                         : Dataset[Category_Fisher_1_dsi] :  create input formulas for tree TreeB
<HEADER> DataSetFactory           : [Category_Fisher_1_dsi] : Number of events in input trees
                         : Dataset[Category_Fisher_1_dsi] :     Signal     requirement: "abs(eta)<=1.3"
                         : Dataset[Category_Fisher_1_dsi] :     Signal          -- number of events passed: 5123   / sum of weights: 5123 
                         : Dataset[Category_Fisher_1_dsi] :     Signal          -- efficiency             : 0.5123
                         : Dataset[Category_Fisher_1_dsi] :     Background requirement: "abs(eta)<=1.3"
                         : Dataset[Category_Fisher_1_dsi] :     Background      -- number of events passed: 5134   / sum of weights: 5134 
                         : Dataset[Category_Fisher_1_dsi] :     Background      -- efficiency             : 0.5134
                         : Dataset[Category_Fisher_1_dsi] :  you have opted for scaling the number of requested training/testing events
                         :  to be scaled by the preselection efficiency
                         :  ( 0 * 0.5123 preselection efficiency)
                         : Dataset[Category_Fisher_1_dsi] :  you have opted for scaling the number of requested training/testing events
                         :  to be scaled by the preselection efficiency
                         :  ( 0 * 0.5134 preselection efficiency)
                         : Number of training and testing events
                         : ---------------------------------------------------------------------------
                         : Signal     -- training events            : 2561
                         : Signal     -- testing events             : 2561
                         : Signal     -- training and testing events: 5122
                         : Dataset[Category_Fisher_1_dsi] : Signal     -- due to the preselection a scaling factor has been applied to the numbers of requested events: 0.5123
                         : Background -- training events            : 2567
                         : Background -- testing events             : 2567
                         : Background -- training and testing events: 5134
                         : Dataset[Category_Fisher_1_dsi] : Background -- due to the preselection a scaling factor has been applied to the numbers of requested events: 0.5134
                         : 
<HEADER> DataSetInfo              : Correlation matrix (Signal):
                         : ----------------------------------------
                         :             var1    var2    var3    var4
                         :    var1:  +1.000  -0.017  +0.004  +0.001
                         :    var2:  -0.017  +1.000  -0.019  -0.003
                         :    var3:  +0.004  -0.019  +1.000  -0.012
                         :    var4:  +0.001  -0.003  -0.012  +1.000
                         : ----------------------------------------
<HEADER> DataSetInfo              : Correlation matrix (Background):
                         : ----------------------------------------
                         :             var1    var2    var3    var4
                         :    var1:  +1.000  -0.019  -0.022  +0.003
                         :    var2:  -0.019  +1.000  -0.018  +0.004
                         :    var3:  -0.022  -0.018  +1.000  +0.004
                         :    var4:  +0.003  +0.004  +0.004  +1.000
                         : ----------------------------------------
<HEADER> DataSetFactory           : [Category_Fisher_1_dsi] :  
                         : 
                         : Train method: Category_Fisher_1 for Classification
<HEADER> Category_Fisher_1        : Results for Fisher coefficients:
                         : -----------------------
                         : Variable:  Coefficient:
                         : -----------------------
                         :     var1:       +0.105
                         :     var2:       +0.152
                         :     var3:       +0.247
                         :     var4:       +0.375
                         : (offset):       +0.648
                         : -----------------------
                         : Elapsed time for training with 5128 events: 0.00159 sec         
<HEADER> Category_Fisher_1        : [Category_Fisher_1_dsi] : Evaluation of Category_Fisher_1 on training sample (5128 events)
                         : Elapsed time for evaluation of 5128 events: 0.000588 sec       
                         : Training finished
                         : Rebuilding Dataset Category_Fisher_2_dsi
                         : Building event vectors for type 2 Signal
                         : Dataset[Category_Fisher_2_dsi] :  create input formulas for tree TreeS
                         : Building event vectors for type 2 Background
                         : Dataset[Category_Fisher_2_dsi] :  create input formulas for tree TreeB
<HEADER> DataSetFactory           : [Category_Fisher_2_dsi] : Number of events in input trees
                         : Dataset[Category_Fisher_2_dsi] :     Signal     requirement: "abs(eta)>1.3"
                         : Dataset[Category_Fisher_2_dsi] :     Signal          -- number of events passed: 4877   / sum of weights: 4877 
                         : Dataset[Category_Fisher_2_dsi] :     Signal          -- efficiency             : 0.4877
                         : Dataset[Category_Fisher_2_dsi] :     Background requirement: "abs(eta)>1.3"
                         : Dataset[Category_Fisher_2_dsi] :     Background      -- number of events passed: 4866   / sum of weights: 4866 
                         : Dataset[Category_Fisher_2_dsi] :     Background      -- efficiency             : 0.4866
                         : Dataset[Category_Fisher_2_dsi] :  you have opted for scaling the number of requested training/testing events
                         :  to be scaled by the preselection efficiency
                         :  ( 0 * 0.4877 preselection efficiency)
                         : Dataset[Category_Fisher_2_dsi] :  you have opted for scaling the number of requested training/testing events
                         :  to be scaled by the preselection efficiency
                         :  ( 0 * 0.4866 preselection efficiency)
                         : Number of training and testing events
                         : ---------------------------------------------------------------------------
                         : Signal     -- training events            : 2438
                         : Signal     -- testing events             : 2438
                         : Signal     -- training and testing events: 4876
                         : Dataset[Category_Fisher_2_dsi] : Signal     -- due to the preselection a scaling factor has been applied to the numbers of requested events: 0.4877
                         : Background -- training events            : 2433
                         : Background -- testing events             : 2433
                         : Background -- training and testing events: 4866
                         : Dataset[Category_Fisher_2_dsi] : Background -- due to the preselection a scaling factor has been applied to the numbers of requested events: 0.4866
                         : 
<HEADER> DataSetInfo              : Correlation matrix (Signal):
                         : ----------------------------------------
                         :             var1    var2    var3    var4
                         :    var1:  +1.000  -0.005  +0.002  -0.039
                         :    var2:  -0.005  +1.000  +0.011  -0.004
                         :    var3:  +0.002  +0.011  +1.000  -0.021
                         :    var4:  -0.039  -0.004  -0.021  +1.000
                         : ----------------------------------------
<HEADER> DataSetInfo              : Correlation matrix (Background):
                         : ----------------------------------------
                         :             var1    var2    var3    var4
                         :    var1:  +1.000  -0.007  +0.009  +0.008
                         :    var2:  -0.007  +1.000  -0.020  +0.013
                         :    var3:  +0.009  -0.020  +1.000  +0.007
                         :    var4:  +0.008  +0.013  +0.007  +1.000
                         : ----------------------------------------
<HEADER> DataSetFactory           : [Category_Fisher_2_dsi] :  
                         : 
                         : Train method: Category_Fisher_2 for Classification
<HEADER> Category_Fisher_2        : Results for Fisher coefficients:
                         : -----------------------
                         : Variable:  Coefficient:
                         : -----------------------
                         :     var1:       +0.107
                         :     var2:       +0.148
                         :     var3:       +0.251
                         :     var4:       +0.372
                         : (offset):       -0.751
                         : -----------------------
                         : Elapsed time for training with 4871 events: 0.0015 sec         
<HEADER> Category_Fisher_2        : [Category_Fisher_2_dsi] : Evaluation of Category_Fisher_2 on training sample (4871 events)
                         : Elapsed time for evaluation of 4871 events: 0.000548 sec       
                         : Training finished
                         : Begin ranking of input variables...
<HEADER> Category_Fisher_1        : Ranking result (top variable is best ranked)
                         : -------------------------------
                         : Rank : Variable  : Discr. power
                         : -------------------------------
                         :    1 : var4      : 2.205e-01
                         :    2 : var3      : 1.054e-01
                         :    3 : var2      : 4.114e-02
                         :    4 : var1      : 1.987e-02
                         : -------------------------------
<HEADER> Category_Fisher_2        : Ranking result (top variable is best ranked)
                         : -------------------------------
                         : Rank : Variable  : Discr. power
                         : -------------------------------
                         :    1 : var4      : 2.153e-01
                         :    2 : var3      : 1.105e-01
                         :    3 : var2      : 4.289e-02
                         :    4 : var1      : 1.986e-02
                         : -------------------------------
                         : Elapsed time for training with 10000 events: 0.0388 sec         
<HEADER> Category_Fisher_1        : [Category_Fisher_1_dsi] : Evaluation of Category_Fisher_1 on training sample (10000 events)
                         : Elapsed time for evaluation of 10000 events: 0.00157 sec       
<HEADER> Category_Fisher_2        : [Category_Fisher_2_dsi] : Evaluation of Category_Fisher_2 on training sample (10000 events)
                         : Elapsed time for evaluation of 10000 events: 0.000944 sec       
                         : Creating xml weight file: dataset/weights/TMVAClassificationCategory_FisherCat.weights.xml
<HEADER> Factory                  : Training finished
                         : 
<HEADER> Factory                  : Train method: LikelihoodCat for Classification
                         : 
                         : Train all sub-classifiers for Classification ...
                         : Rebuilding Dataset Category_Likelihood_1_dsi
                         : Building event vectors for type 2 Signal
                         : Dataset[Category_Likelihood_1_dsi] :  create input formulas for tree TreeS
                         : Building event vectors for type 2 Background
                         : Dataset[Category_Likelihood_1_dsi] :  create input formulas for tree TreeB
<HEADER> DataSetFactory           : [Category_Likelihood_1_dsi] : Number of events in input trees
                         : Dataset[Category_Likelihood_1_dsi] :     Signal     requirement: "abs(eta)<=1.3"
                         : Dataset[Category_Likelihood_1_dsi] :     Signal          -- number of events passed: 5123   / sum of weights: 5123 
                         : Dataset[Category_Likelihood_1_dsi] :     Signal          -- efficiency             : 0.5123
                         : Dataset[Category_Likelihood_1_dsi] :     Background requirement: "abs(eta)<=1.3"
                         : Dataset[Category_Likelihood_1_dsi] :     Background      -- number of events passed: 5134   / sum of weights: 5134 
                         : Dataset[Category_Likelihood_1_dsi] :     Background      -- efficiency             : 0.5134
                         : Dataset[Category_Likelihood_1_dsi] :  you have opted for scaling the number of requested training/testing events
                         :  to be scaled by the preselection efficiency
                         :  ( 0 * 0.5123 preselection efficiency)
                         : Dataset[Category_Likelihood_1_dsi] :  you have opted for scaling the number of requested training/testing events
                         :  to be scaled by the preselection efficiency
                         :  ( 0 * 0.5134 preselection efficiency)
                         : Number of training and testing events
                         : ---------------------------------------------------------------------------
                         : Signal     -- training events            : 2561
                         : Signal     -- testing events             : 2561
                         : Signal     -- training and testing events: 5122
                         : Dataset[Category_Likelihood_1_dsi] : Signal     -- due to the preselection a scaling factor has been applied to the numbers of requested events: 0.5123
                         : Background -- training events            : 2567
                         : Background -- testing events             : 2567
                         : Background -- training and testing events: 5134
                         : Dataset[Category_Likelihood_1_dsi] : Background -- due to the preselection a scaling factor has been applied to the numbers of requested events: 0.5134
                         : 
<HEADER> DataSetInfo              : Correlation matrix (Signal):
                         : ----------------------------------------
                         :             var1    var2    var3    var4
                         :    var1:  +1.000  -0.017  +0.004  +0.001
                         :    var2:  -0.017  +1.000  -0.019  -0.003
                         :    var3:  +0.004  -0.019  +1.000  -0.012
                         :    var4:  +0.001  -0.003  -0.012  +1.000
                         : ----------------------------------------
<HEADER> DataSetInfo              : Correlation matrix (Background):
                         : ----------------------------------------
                         :             var1    var2    var3    var4
                         :    var1:  +1.000  -0.019  -0.022  +0.003
                         :    var2:  -0.019  +1.000  -0.018  +0.004
                         :    var3:  -0.022  -0.018  +1.000  +0.004
                         :    var4:  +0.003  +0.004  +0.004  +1.000
                         : ----------------------------------------
<HEADER> DataSetFactory           : [Category_Likelihood_1_dsi] :  
                         : 
                         : Train method: Category_Likelihood_1 for Classification
                         : Filling reference histograms
                         : Building PDF out of reference histograms
                         : Elapsed time for training with 5128 events: 0.0293 sec         
<HEADER> Category_Likelihood_1    : [Category_Likelihood_1_dsi] : Evaluation of Category_Likelihood_1 on training sample (5128 events)
                         : Elapsed time for evaluation of 5128 events: 0.00396 sec       
                         : TMVA.root:/dataset/Method_Category/LikelihoodCat/Method_Likelihood/Category_Likelihood_1
                         : Training finished
                         : Rebuilding Dataset Category_Likelihood_2_dsi
                         : Building event vectors for type 2 Signal
                         : Dataset[Category_Likelihood_2_dsi] :  create input formulas for tree TreeS
                         : Building event vectors for type 2 Background
                         : Dataset[Category_Likelihood_2_dsi] :  create input formulas for tree TreeB
<HEADER> DataSetFactory           : [Category_Likelihood_2_dsi] : Number of events in input trees
                         : Dataset[Category_Likelihood_2_dsi] :     Signal     requirement: "abs(eta)>1.3"
                         : Dataset[Category_Likelihood_2_dsi] :     Signal          -- number of events passed: 4877   / sum of weights: 4877 
                         : Dataset[Category_Likelihood_2_dsi] :     Signal          -- efficiency             : 0.4877
                         : Dataset[Category_Likelihood_2_dsi] :     Background requirement: "abs(eta)>1.3"
                         : Dataset[Category_Likelihood_2_dsi] :     Background      -- number of events passed: 4866   / sum of weights: 4866 
                         : Dataset[Category_Likelihood_2_dsi] :     Background      -- efficiency             : 0.4866
                         : Dataset[Category_Likelihood_2_dsi] :  you have opted for scaling the number of requested training/testing events
                         :  to be scaled by the preselection efficiency
                         :  ( 0 * 0.4877 preselection efficiency)
                         : Dataset[Category_Likelihood_2_dsi] :  you have opted for scaling the number of requested training/testing events
                         :  to be scaled by the preselection efficiency
                         :  ( 0 * 0.4866 preselection efficiency)
                         : Number of training and testing events
                         : ---------------------------------------------------------------------------
                         : Signal     -- training events            : 2438
                         : Signal     -- testing events             : 2438
                         : Signal     -- training and testing events: 4876
                         : Dataset[Category_Likelihood_2_dsi] : Signal     -- due to the preselection a scaling factor has been applied to the numbers of requested events: 0.4877
                         : Background -- training events            : 2433
                         : Background -- testing events             : 2433
                         : Background -- training and testing events: 4866
                         : Dataset[Category_Likelihood_2_dsi] : Background -- due to the preselection a scaling factor has been applied to the numbers of requested events: 0.4866
                         : 
<HEADER> DataSetInfo              : Correlation matrix (Signal):
                         : ----------------------------------------
                         :             var1    var2    var3    var4
                         :    var1:  +1.000  -0.005  +0.002  -0.039
                         :    var2:  -0.005  +1.000  +0.011  -0.004
                         :    var3:  +0.002  +0.011  +1.000  -0.021
                         :    var4:  -0.039  -0.004  -0.021  +1.000
                         : ----------------------------------------
<HEADER> DataSetInfo              : Correlation matrix (Background):
                         : ----------------------------------------
                         :             var1    var2    var3    var4
                         :    var1:  +1.000  -0.007  +0.009  +0.008
                         :    var2:  -0.007  +1.000  -0.020  +0.013
                         :    var3:  +0.009  -0.020  +1.000  +0.007
                         :    var4:  +0.008  +0.013  +0.007  +1.000
                         : ----------------------------------------
<HEADER> DataSetFactory           : [Category_Likelihood_2_dsi] :  
                         : 
                         : Train method: Category_Likelihood_2 for Classification
                         : Filling reference histograms
                         : Building PDF out of reference histograms
                         : Elapsed time for training with 4871 events: 0.0278 sec         
<HEADER> Category_Likelihood_2    : [Category_Likelihood_2_dsi] : Evaluation of Category_Likelihood_2 on training sample (4871 events)
                         : Elapsed time for evaluation of 4871 events: 0.00378 sec       
                         : TMVA.root:/dataset/Method_Category/LikelihoodCat/Method_Likelihood/Category_Likelihood_2
                         : Training finished
                         : Begin ranking of input variables...
<HEADER> Category_Likelihood_1    : Ranking result (top variable is best ranked)
                         : -----------------------------------
                         : Rank : Variable  : Delta Separation
                         : -----------------------------------
                         :    1 : var4      : 1.031e-01
                         :    2 : var3      : 1.716e-02
                         :    3 : var1      : 1.036e-02
                         :    4 : var2      : 4.428e-03
                         : -----------------------------------
<HEADER> Category_Likelihood_2    : Ranking result (top variable is best ranked)
                         : -----------------------------------
                         : Rank : Variable  : Delta Separation
                         : -----------------------------------
                         :    1 : var4      : 1.424e-01
                         :    2 : var3      : 6.035e-02
                         :    3 : var2      : 1.824e-02
                         :    4 : var1      : 8.110e-03
                         : -----------------------------------
                         : Elapsed time for training with 10000 events: 0.244 sec         
<HEADER> Category_Likelihood_1    : [Category_Likelihood_1_dsi] : Evaluation of Category_Likelihood_1 on training sample (10000 events)
                         : Elapsed time for evaluation of 10000 events: 0.00795 sec       
<HEADER> Category_Likelihood_2    : [Category_Likelihood_2_dsi] : Evaluation of Category_Likelihood_2 on training sample (10000 events)
                         : Elapsed time for evaluation of 10000 events: 0.00744 sec       
                         : Creating xml weight file: dataset/weights/TMVAClassificationCategory_LikelihoodCat.weights.xml
<HEADER> Factory                  : Training finished
                         : 
                         : Ranking input variables (method specific)...
<HEADER> Fisher                   : Ranking result (top variable is best ranked)
                         : -------------------------------
                         : Rank : Variable  : Discr. power
                         : -------------------------------
                         :    1 : var4      : 1.446e-01
                         :    2 : var3      : 7.153e-02
                         :    3 : var2      : 2.447e-02
                         :    4 : var1      : 1.243e-02
                         : -------------------------------
<HEADER> Likelihood               : Ranking result (top variable is best ranked)
                         : -----------------------------------
                         : Rank : Variable  : Delta Separation
                         : -----------------------------------
                         :    1 : var4      : 1.162e-01
                         :    2 : var3      : 5.179e-02
                         :    3 : var2      : 2.915e-02
                         :    4 : var1      : 2.168e-02
                         : -----------------------------------
                         : No variable ranking supplied by classifier: FisherCat
                         : No variable ranking supplied by classifier: LikelihoodCat
<HEADER> Factory                  : === Destroy and recreate all methods via weight files for testing ===
                         : 
                         : Reading weight file: dataset/weights/TMVAClassificationCategory_Fisher.weights.xml
                         : Reading weight file: dataset/weights/TMVAClassificationCategory_Likelihood.weights.xml
                         : Reading weight file: dataset/weights/TMVAClassificationCategory_FisherCat.weights.xml
                         : Recreating sub-classifiers from XML-file 
<HEADER> DataSetInfo              : [Category_Fisher_1_dsi] : Added class "Signal"
<HEADER> DataSetInfo              : [Category_Fisher_1_dsi] : Added class "Background"
<HEADER> DataSetInfo              : [Category_Fisher_2_dsi] : Added class "Signal"
<HEADER> DataSetInfo              : [Category_Fisher_2_dsi] : Added class "Background"
                         : Reading weight file: dataset/weights/TMVAClassificationCategory_LikelihoodCat.weights.xml
                         : Recreating sub-classifiers from XML-file 
<HEADER> DataSetInfo              : [Category_Likelihood_1_dsi] : Added class "Signal"
<HEADER> DataSetInfo              : [Category_Likelihood_1_dsi] : Added class "Background"
<HEADER> DataSetInfo              : [Category_Likelihood_2_dsi] : Added class "Signal"
<HEADER> DataSetInfo              : [Category_Likelihood_2_dsi] : Added class "Background"
Info in <TMVA::MethodCategory::GetMVaValues>: Evaluate MethodCategory for 10000 events type 0 on the dataset dataset
Info in <TMVA::MethodCategory::GetMVaValues>: Evaluate MethodCategory for 10000 events type 0 on the dataset dataset
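The preselection lines for Category_Likelihood_2 in the training log above can be cross-checked by hand. The sketch below is plain C++ rather than TMVA code; `preselectionEfficiency`, `splitEvenly` and `TrainTestSplit` are hypothetical names introduced only for illustration. It assumes 10000 background events before the cut (inferred from 4866 / 0.4866) and an even training/testing split, which is consistent with the 2433 + 2433 background events reported in the log.

```cpp
#include <cassert>

// Hypothetical helper (not a TMVA function): fraction of events passing a cut.
double preselectionEfficiency(long passed, long total) {
    assert(total > 0 && passed >= 0 && passed <= total);
    return static_cast<double>(passed) / static_cast<double>(total);
}

// Hypothetical helper: equal training/testing split of the surviving events.
struct TrainTestSplit {
    long train;
    long test;
};

// Integer division drops the odd event, if there is one.
TrainTestSplit splitEvenly(long passed) {
    assert(passed >= 0);
    const long half = passed / 2;
    return TrainTestSplit{half, half};
}
```

For the 4866 background events passing `abs(eta)>1.3`, `preselectionEfficiency(4866, 10000)` gives 0.4866 and `splitEvenly(4866)` gives 2433 events each for training and testing. The 2438 signal and 2433 background training events add up to the 4871 events quoted in the training-time line.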

Evaluate all MVAs using the set of test events

In [22]:
factory->TestAllMethods();
<HEADER> Factory                  : Test all methods
<HEADER> Factory                  : Test method: Fisher for Classification performance
                         : 
<HEADER> Fisher                   : [dataset] : Evaluation of Fisher on testing sample (10000 events)
                         : Elapsed time for evaluation of 10000 events: 0.00193 sec       
<HEADER> Factory                  : Test method: Likelihood for Classification performance
                         : 
<HEADER> Likelihood               : [dataset] : Evaluation of Likelihood on testing sample (10000 events)
                         : Elapsed time for evaluation of 10000 events: 0.00777 sec       
<HEADER> Factory                  : Test method: FisherCat for Classification performance
                         : 
<HEADER> Category_Fisher_1        : [Category_Fisher_1_dsi] : Evaluation of Category_Fisher_1 on testing sample (10000 events)
                         : Elapsed time for evaluation of 10000 events: 0.000995 sec       
<HEADER> Category_Fisher_2        : [Category_Fisher_2_dsi] : Evaluation of Category_Fisher_2 on testing sample (10000 events)
                         : Elapsed time for evaluation of 10000 events: 0.000976 sec       
<HEADER> Factory                  : Test method: LikelihoodCat for Classification performance
                         : 
<HEADER> Category_Likelihood_1    : [Category_Likelihood_1_dsi] : Evaluation of Category_Likelihood_1 on testing sample (10000 events)
                         : Elapsed time for evaluation of 10000 events: 0.00766 sec       
<HEADER> Category_Likelihood_2    : [Category_Likelihood_2_dsi] : Evaluation of Category_Likelihood_2 on testing sample (10000 events)
                         : Elapsed time for evaluation of 10000 events: 0.00755 sec       
Info in <TMVA::MethodCategory::GetMVaValues>: Evaluate MethodCategory for 10000 events type 1 on the dataset dataset
Info in <TMVA::MethodCategory::GetMVaValues>: Evaluate MethodCategory for 10000 events type 1 on the dataset dataset

Evaluate and compare the performance of all configured MVAs

In [23]:
factory->EvaluateAllMethods();
<HEADER> Factory                  : Evaluate all methods
<HEADER> Factory                  : Evaluate classifier: Fisher
                         : 
<HEADER> Fisher                   : [dataset] : Loop over test events and fill histograms with classifier response...
                         : 
<HEADER> TFHandler_Fisher         : Variable        Mean        RMS   [        Min        Max ]
                         : -----------------------------------------------------------
                         :     var1:  -0.014081     1.2910   [    -5.3119     4.5609 ]
                         :     var2:  -0.014399     1.3299   [    -4.7537     4.6723 ]
                         :     var3:  -0.027971     1.3779   [    -5.2892     4.7007 ]
                         :     var4:    0.12966     1.4883   [    -5.1002     4.9767 ]
                         : -----------------------------------------------------------
<HEADER> Factory                  : Evaluate classifier: Likelihood
                         : 
<HEADER> Likelihood               : [dataset] : Loop over test events and fill histograms with classifier response...
                         : 
<HEADER> TFHandler_Likelihood     : Variable        Mean        RMS   [        Min        Max ]
                         : -----------------------------------------------------------
                         :     var1:  -0.014081     1.2910   [    -5.3119     4.5609 ]
                         :     var2:  -0.014399     1.3299   [    -4.7537     4.6723 ]
                         :     var3:  -0.027971     1.3779   [    -5.2892     4.7007 ]
                         :     var4:    0.12966     1.4883   [    -5.1002     4.9767 ]
                         : -----------------------------------------------------------
<HEADER> Factory                  : Evaluate classifier: FisherCat
                         : 
<HEADER> FisherCat                : [dataset] : Loop over test events and fill histograms with classifier response...
                         : 
<HEADER> TFHandler_FisherCat      : Variable        Mean        RMS   [        Min        Max ]
                         : -----------------------------------------------------------
                         :     var1:  -0.014081     1.2910   [    -5.3119     4.5609 ]
                         :     var2:  -0.014399     1.3299   [    -4.7537     4.6723 ]
                         :     var3:  -0.027971     1.3779   [    -5.2892     4.7007 ]
                         :     var4:    0.12966     1.4883   [    -5.1002     4.9767 ]
                         : -----------------------------------------------------------
<HEADER> Factory                  : Evaluate classifier: LikelihoodCat
                         : 
<HEADER> LikelihoodCat            : [dataset] : Loop over test events and fill histograms with classifier response...
                         : 
<HEADER> TFHandler_LikelihoodCat  : Variable        Mean        RMS   [        Min        Max ]
                         : -----------------------------------------------------------
                         :     var1:  -0.014081     1.2910   [    -5.3119     4.5609 ]
                         :     var2:  -0.014399     1.3299   [    -4.7537     4.6723 ]
                         :     var3:  -0.027971     1.3779   [    -5.2892     4.7007 ]
                         :     var4:    0.12966     1.4883   [    -5.1002     4.9767 ]
                         : -----------------------------------------------------------
                         : 
                         : Evaluation results ranked by best signal efficiency and purity (area)
                         : -------------------------------------------------------------------------------------------------------------------
                         : DataSet       MVA                       
                         : Name:         Method:          ROC-integ
                         : dataset       FisherCat      : 0.914
                         : dataset       LikelihoodCat  : 0.913
                         : dataset       Fisher         : 0.808
                         : dataset       Likelihood     : 0.768
                         : -------------------------------------------------------------------------------------------------------------------
                         : 
                         : Testing efficiency compared to training efficiency (overtraining check)
                         : -------------------------------------------------------------------------------------------------------------------
                         : DataSet              MVA              Signal efficiency: from test sample (from training sample) 
                         : Name:                Method:          @B=0.01             @B=0.10            @B=0.30   
                         : -------------------------------------------------------------------------------------------------------------------
                         : dataset              FisherCat      : 0.352 (0.360)       0.743 (0.739)      0.919 (0.916)
                         : dataset              LikelihoodCat  : 0.350 (0.351)       0.738 (0.736)      0.919 (0.916)
                         : dataset              Fisher         : 0.184 (0.185)       0.471 (0.486)      0.746 (0.742)
                         : dataset              Likelihood     : 0.211 (0.242)       0.446 (0.453)      0.609 (0.608)
                         : -------------------------------------------------------------------------------------------------------------------
                         : 
<HEADER> Dataset:dataset          : Created tree 'TestTree' with 10000 events
                         : 
<HEADER> Dataset:dataset          : Created tree 'TrainTree' with 10000 events
                         : 
<HEADER> Factory                  : Thank you for using TMVA!
                         : For citation information, please visit: http://tmva.sf.net/citeTMVA.html
Info in <TMVA::MethodCategory::GetMVaValues>: Evaluate MethodCategory for 10000 events type 0 on the dataset dataset
Info in <TMVA::MethodCategory::GetMVaValues>: Evaluate MethodCategory for 10000 events type 0 on the dataset dataset
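The two summary tables in the output above report a ROC integral and the signal efficiency at fixed background efficiencies (@B=0.01, 0.10, 0.30). As a rough illustration of what these numbers mean, the following sketch computes both quantities from raw classifier scores. It is plain C++ and not how TMVA does it internally; `rocIntegral` and `signalEfficiencyAt` are hypothetical names, and the rank-based formula is an assumption swapped in for whatever binned evaluation TMVA uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Rank-based ROC integral: probability that a random signal score is larger
// than a random background score, with ties counted as one half.
double rocIntegral(const std::vector<double>& sig, const std::vector<double>& bkg) {
    assert(!sig.empty() && !bkg.empty());
    std::vector<double> sortedBkg(bkg);
    std::sort(sortedBkg.begin(), sortedBkg.end());
    double sum = 0.0;
    for (double s : sig) {
        const auto lo = std::lower_bound(sortedBkg.begin(), sortedBkg.end(), s);
        const auto hi = std::upper_bound(sortedBkg.begin(), sortedBkg.end(), s);
        const double below = static_cast<double>(lo - sortedBkg.begin());
        const double ties = static_cast<double>(hi - lo);
        sum += below + 0.5 * ties;
    }
    return sum / (static_cast<double>(sig.size()) * static_cast<double>(bkg.size()));
}

// Signal efficiency for the cut "score > c", where c is chosen so that at most
// a fraction bkgEff of the background events passes (count rounded down).
double signalEfficiencyAt(double bkgEff, const std::vector<double>& sig,
                          const std::vector<double>& bkg) {
    assert(!sig.empty() && !bkg.empty() && bkgEff >= 0.0 && bkgEff <= 1.0);
    std::vector<double> sortedBkg(bkg);
    std::sort(sortedBkg.begin(), sortedBkg.end(), std::greater<double>());
    const std::size_t nKeep =
        static_cast<std::size_t>(bkgEff * static_cast<double>(sortedBkg.size()));
    if (nKeep >= sortedBkg.size()) return 1.0;  // no cut needed
    const double cut = sortedBkg[nKeep];
    const auto passed = std::count_if(sig.begin(), sig.end(),
                                      [cut](double s) { return s > cut; });
    return static_cast<double>(passed) / static_cast<double>(sig.size());
}
```

Applied separately to the training and testing scores, `signalEfficiencyAt` would give pairs of numbers analogous to the "test (training)" columns of the overtraining check; a large gap between the two is the warning sign that table is meant to expose.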

Save the output

In [24]:
outputFile->Close();

std::cout << "==> Wrote root file: " << outputFile->GetName() << std::endl;
std::cout << "==> TMVAClassificationCategory is done!" << std::endl;
==> Wrote root file: TMVA.root
==> TMVAClassificationCategory is done!

Clean up

In [25]:
delete factory;
delete dataloader;

Launch the GUI for the ROOT macros

In [26]:
if (!gROOT->IsBatch()) TMVA::TMVAGui( outfileName );