import ROOT
from ROOT import TFile, TMVA, TCut
%jsmva on
Welcome to JupyROOT 6.09/01
outputFile = TFile( "TMVA.root", 'RECREATE' )
TMVA.Tools.Instance()
factory = TMVA.Factory(JobName="TMVAClassification", TargetFile=outputFile,
                       V=False, Color=True, DrawProgressBar=True,
                       Transformations=["I", "D", "P", "G", "D"],
                       AnalysisType="Classification")
dataset = "tmva_class_example"
loader = TMVA.DataLoader(dataset)
loader.AddVariable( "myvar1 := var1+var2", 'F' )
loader.AddVariable( "myvar2 := var1-var2", "Expression 2", 'F' )
loader.AddVariable( "var3", "Variable 3", 'F' )
loader.AddVariable( "var4", "Variable 4", 'F' )
loader.AddSpectator( "spec1:=var1*2", "Spectator 1", 'F' )
loader.AddSpectator( "spec2:=var1*3", "Spectator 2", 'F' )
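In plain Python terms, the per-event quantities defined above amount to the following (hypothetical helper mirroring the same expressions):

```python
def make_event(var1, var2, var3, var4):
    """Mirror the DataLoader expressions above in plain Python:
    myvar1 = var1 + var2 and myvar2 = var1 - var2 are derived inputs,
    var3 and var4 pass through unchanged, and spec1/spec2 are
    spectators stored with each event but not used for training."""
    inputs = (var1 + var2, var1 - var2, var3, var4)
    spectators = (var1 * 2, var1 * 3)
    return inputs, spectators

inputs, spectators = make_event(1.0, 0.5, -0.2, 3.0)
# inputs == (1.5, 0.5, -0.2, 3.0), spectators == (2.0, 3.0)
```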
if ROOT.gSystem.AccessPathName( "./tmva_class_example.root" ) != 0:
    ROOT.gSystem.Exec( "wget https://root.cern.ch/files/tmva_class_example.root")
input = TFile.Open( "./tmva_class_example.root" )
# Get the signal and background trees for training
signal = input.Get( "TreeS" )
background = input.Get( "TreeB" )
# Global event weights (see below for setting event-wise weights)
signalWeight = 1.0
backgroundWeight = 1.0
mycuts = TCut("")
mycutb = TCut("")
loader.AddSignalTree(signal, signalWeight)
loader.AddBackgroundTree(background, backgroundWeight)
loader.fSignalWeight = signalWeight
loader.fBackgroundWeight = backgroundWeight
loader.fTreeS = signal
loader.fTreeB = background
loader.PrepareTrainingAndTestTree(SigCut=mycuts, BkgCut=mycutb,
                                  nTrain_Signal=1000, nTrain_Background=1000,
                                  nTest_Signal=2000, nTest_Background=2000,
                                  SplitMode="Random", NormMode="NumEvents", V=False)
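With `SplitMode="Random"`, training and test events are drawn randomly from each tree. A minimal sketch of such a split (simplified; not TMVA's actual sampler):

```python
import random

def random_split(n_events, n_train, n_test, seed=0):
    """Randomly assign event indices to a training and a test set,
    roughly what SplitMode="Random" does per class."""
    rng = random.Random(seed)
    indices = list(range(n_events))
    rng.shuffle(indices)
    return indices[:n_train], indices[n_train:n_train + n_test]

# 6000 events per tree; 1000 go to training, 2000 to testing, as requested above
train, test = random_split(6000, 1000, 2000)
```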
DataSetInfo : Add Tree TreeS of type Signal with 6000 events
DataSetInfo : Add Tree TreeB of type Background with 6000 events
factory.BookMethod( DataLoader=loader, Method=TMVA.Types.kMLP, MethodTitle="MLP",
                    H=False, V=False, NeuronType="tanh", VarTransform="N",
                    NCycles=600, HiddenLayers="N+5", TestRate=5, UseRegulator=False )
trainingStrategy = [{
"LearningRate": 1e-1,
"Momentum": 0.0,
"Repetitions": 1,
"ConvergenceSteps": 300,
"BatchSize": 20,
"TestRepetitions": 15,
"WeightDecay": 0.001,
"Regularization": "NONE",
"DropConfig": "0.0+0.5+0.5+0.5",
"DropRepetitions": 1,
"Multithreading": True
}, {
"LearningRate": 1e-2,
"Momentum": 0.5,
"Repetitions": 1,
"ConvergenceSteps": 300,
"BatchSize": 30,
"TestRepetitions": 7,
"WeightDecay": 0.001,
"Regularization": "L2",
"DropConfig": "0.0+0.1+0.1+0.1",
"DropRepetitions": 1,
"Multithreading": True
}, {
"LearningRate": 1e-2,
"Momentum": 0.3,
"Repetitions": 1,
"ConvergenceSteps": 300,
"BatchSize": 40,
"TestRepetitions": 7,
"WeightDecay": 0.001,
"Regularization": "L2",
"Multithreading": True
},{
"LearningRate": 1e-3,
"Momentum": 0.1,
"Repetitions": 1,
"ConvergenceSteps": 200,
"BatchSize": 70,
"TestRepetitions": 7,
"WeightDecay": 0.001,
"Regularization": "NONE",
"Multithreading": True
}]
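Under the hood, jsmva flattens this list of dictionaries into the pipe-separated `TrainingStrategy` option string that TMVA's DNN method parses. A rough sketch of that conversion (assumed and simplified; the real jsmva code may differ in detail):

```python
def to_training_strategy_string(strategies):
    """Join each training phase's settings with ',' and the phases
    with '|', producing a string like
    "LearningRate=0.1,Momentum=0.0|LearningRate=0.01,..."."""
    return "|".join(
        ",".join("{}={}".format(k, v) for k, v in phase.items())
        for phase in strategies
    )

opt = to_training_strategy_string([
    {"LearningRate": 0.1, "Momentum": 0.0},
    {"LearningRate": 0.01, "Momentum": 0.5},
])
# "LearningRate=0.1,Momentum=0.0|LearningRate=0.01,Momentum=0.5"
```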
factory.BookMethod(DataLoader=loader, Method=TMVA.Types.kDNN, MethodTitle="DNN",
                   H=False, V=False, VarTransform="Normalize",
                   ErrorStrategy="CROSSENTROPY",
                   Layout=["TANH|100", "TANH|50", "TANH|10", "LINEAR"],
                   TrainingStrategy=trainingStrategy, Architecture="CPU")
factory.BookMethod(DataLoader=loader, Method=TMVA.Types.kBDT, MethodTitle="BDT",
                   H=False, V=False, NTrees=850, MinNodeSize="2.5%", MaxDepth=3,
                   BoostType="AdaBoost", AdaBoostBeta=0.5, UseBaggedBoost=True,
                   BaggedSampleFraction=0.5, SeparationType="GiniIndex", nCuts=20)
<ROOT.TMVA::MethodBDT object ("BDT") at 0x39e6080>
Factory     : Booking method: MLP
MLP         : Transformation, Variable selection :
              Input : variable 'myvar1' <---> Output : variable 'myvar1'
              Input : variable 'myvar2' <---> Output : variable 'myvar2'
              Input : variable 'var3'   <---> Output : variable 'var3'
              Input : variable 'var4'   <---> Output : variable 'var4'
MLP         : Building Network.
              Initializing weights
Factory     : Booking method: DNN
DNN         : Transformation, Variable selection :
              Input : variable 'myvar1' <---> Output : variable 'myvar1'
              Input : variable 'myvar2' <---> Output : variable 'myvar2'
              Input : variable 'var3'   <---> Output : variable 'var3'
              Input : variable 'var4'   <---> Output : variable 'var4'
Factory     : Booking method: BDT
DataSetInfo : Correlation matrix (Signal)
DataSetInfo : Correlation matrix (Background)
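The BDT booked above uses AdaBoost with `AdaBoostBeta=0.5`: between trees, misclassified events are boosted by a factor alpha = ((1 - err) / err) ** beta, where err is the misclassification rate. A minimal sketch of one such reweighting step (illustrative; not TMVA's exact implementation):

```python
def adaboost_reweight(weights, misclassified, err, beta=0.5):
    """One conceptual AdaBoost step: multiply the weight of each
    misclassified event by alpha = ((1 - err) / err) ** beta,
    then renormalize so the weights sum to 1."""
    alpha = ((1.0 - err) / err) ** beta
    boosted = [w * alpha if bad else w
               for w, bad in zip(weights, misclassified)]
    total = sum(boosted)
    return [w / total for w in boosted]

# 4 equally weighted events, the first one misclassified (err = 0.25):
new_w = adaboost_reweight([0.25] * 4, [True, False, False, False], err=0.25)
```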
If we trained a neural network, its weights are saved to XML and C files. We can read back the XML file and visualize the network using the Factory.DrawNeuralNetwork function.
The arguments of this function:
Keyword | Can be used as positional argument | Default | Predefined values | Description |
---|---|---|---|---|
datasetName | yes, 1. | - | - | The name of dataset |
methodName | yes, 2. | - | - | The name of method |
This visualization is interactive. The synapses are drawn in two colors, one for positive and one for negative weights; the absolute value of each weight is scaled and mapped to the thickness of the line between the two nodes.
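The weight-to-style mapping described above can be sketched like this (hypothetical helper; not the actual jsmva drawing code):

```python
def synapse_style(weight, max_abs_weight, max_width=5.0):
    """Pick a color by the sign of the weight and a line width
    proportional to |weight| / max|weight|, as described above."""
    color = "positive" if weight >= 0 else "negative"
    width = max_width * abs(weight) / max_abs_weight
    return color, width

style = synapse_style(-0.5, 1.0)  # ("negative", 2.5)
```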
factory.DrawNeuralNetwork(dataset, "MLP")
The DrawNeuralNetwork function can also visualize deep neural networks; we just have to pass "DNN" as the method name. If the network is very big, with many thousands of neurons, drawing it will be slow and will need a lot of RAM, so be careful with this function.
This visualization is also interactive.
factory.DrawNeuralNetwork(dataset, "DNN")
The trained decision trees are also saved to XML, so we can read back the XML file and visualize the trees. This is the purpose of the Factory.DrawDecisionTree function.
The arguments of this function:
Keyword | Can be used as positional argument | Default | Predefined values | Description |
---|---|---|---|---|
datasetName | yes, 1. | - | - | The name of dataset |
methodName | yes, 2. | - | - | The name of method |
This function produces a small input box where you can enter the index of the tree you want to see (the total number of trees is shown next to this input box). After choosing a number, press the Draw button. The nodes of the tree are colored; the color reflects the node's signal efficiency.
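The node coloring can be thought of as a function of the node's signal purity S/(S+B). A toy version (assumed color scheme; not the one actually used by jsmva):

```python
def node_color(n_signal, n_background):
    """Interpolate from blue (background-like, purity 0) to
    red (signal-like, purity 1) based on S / (S + B)."""
    purity = n_signal / float(n_signal + n_background)
    return (int(255 * purity), 0, int(255 * (1.0 - purity)))

pure_signal = node_color(100, 0)   # (255, 0, 0)
mixed = node_color(50, 50)         # (127, 0, 127)
```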
The visualization of the tree is interactive, and you can do the following with it:

  * Mouseover (node, weight): shows the decision path
  * Zooming and grab-and-move are supported
  * Reset zoomed tree: double click
  * Expand all closed subtrees and turn off zoom: button at the bottom of the picture
  * Click on a node: hides or shows its subtree
factory.DrawDNNWeights(dataset, "DNN")
factory.DrawDecisionTree(dataset, "BDT")
outputFile.Close()