An introductory notebook to HEP analysis in Python


In this notebook you can find an easy set of commands that show some basic computing techniques commonly used in High Energy Physics (HEP) analyses.


It also shows how to create a histogram, fill it and draw it, and it doubles as an introduction to ROOT. The final output is a plot of the number of leptons.


Check the description of the variables inside the dataset at the end of this notebook.


All done with fewer than 15 lines of code!

An introduction to the ATLAS public datasets

This notebook uses ROOT through its Python interface. It is intended to show the internal content of the datasets released by the ATLAS experiment, and the way to call and interact with them, with a focus on education and training activities.


We use data recorded (simulated) by the ATLAS detector (experiment)


where physics objects can be represented as below

ATLAS azimuthal view with physics objects

Cell #1

First of all ROOT is imported to read the files in the _.root_ data format. A _.root_ file consists of a tree having branches and leaves.

At this point you could also import further modules that provide functions you use often. Here we import nothing else, to keep it simple.

# In[1]:
import ROOT

Cell #2

In order to activate the interactive visualisation of the histogram that is later created we can use the JSROOT magic:

# In[2]:
get_ipython().run_line_magic('jsroot', 'on')

Cell #3

Next we have to open the data that we want to analyze. It is stored in a _*.root_ file that consists of a tree having branches and leaves.

As you can see, we are reading the data directly from the source, but you can read the file locally too.

# In[3]:
f = ROOT.TFile.Open("http://opendata.atlas.cern/release/samples/MC/mc_147770.Zee.root")
# Alternative: open a local file instead
## f = ROOT.TFile.Open("mc_105986.ZZ.root")
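The choice between the remote file and a local copy can be sketched in plain Python. This is a minimal sketch, not part of the original notebook: the helper name `choose_source` is hypothetical, the local file name is assumed to be the basename of the URL, and it assumes that a local copy, if present, should take precedence.

```python
import os

# URL taken from the cell above; the local file name is an assumption
# (the basename of that URL), not a path the notebook prescribes.
REMOTE_URL = "http://opendata.atlas.cern/release/samples/MC/mc_147770.Zee.root"
LOCAL_PATH = "mc_147770.Zee.root"


def choose_source(local_path, remote_url):
    """Return the local path if that file exists, otherwise the remote URL."""
    if os.path.isfile(local_path):
        return local_path
    return remote_url


source = choose_source(LOCAL_PATH, REMOTE_URL)
print("Would open:", source)
# In the notebook this string would then be passed on, e.g.
# f = ROOT.TFile.Open(source)
```

Keeping the decision in one small function means the rest of the notebook does not need to change when a downloaded copy becomes available.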

Cell #4

The next step is to define a variable, which we call _tree_, to get the data out of the _*.root_ file, where it is stored in a tree called _mini_:

# In[4]:
tree = f.Get("mini")
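To make the tree, branch and entry vocabulary concrete, here is a toy model in plain Python. It is only an illustration under stated assumptions: `toy_tree` and `get_entry` are hypothetical names, the values are invented, and a real ROOT tree is read through ROOT rather than through a dictionary.

```python
# A toy stand-in for a tree: each key plays the role of a branch, and
# position i in every list belongs to entry (event) number i.
# The branch names come from the variable table; the values are invented.
toy_tree = {
    "runNumber": [1, 1, 1],
    "lep_n": [2, 1, 2],
    "lep_pt": [[40.0, 35.0], [28.0], [51.0, 22.0]],  # one list per event
}


def get_entry(tree, index):
    """Collect the value of every branch for one entry."""
    return {branch: values[index] for branch, values in tree.items()}


n_entries = len(toy_tree["lep_n"])
for i in range(n_entries):
    event = get_entry(toy_tree, i)
    # The per-lepton branch holds exactly lep_n values in this toy data.
    assert len(event["lep_pt"]) == event["lep_n"]
    print(i, event["lep_n"], event["lep_pt"])
```

The scalar branch `lep_n` has one value per event, while a vector branch such as `lep_pt` has one value per lepton in that event.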

Cell #5

After the data is opened we create a canvas on which we can draw a histogram. If we do not have a canvas we cannot see our histogram at the end. Its name is _Canvas_ and its header is _a first way to plot a variable_. The two following arguments define the width and the height of the canvas.

# In[5]:
canvas = ROOT.TCanvas("Canvas","a first way to plot a variable",800,600)

Cell #6

Now we define a histogram that will later be placed on this canvas. Its name is _variable_ and the header of the histogram is _Example plot: Number of leptons_. The three following arguments indicate that this histogram contains 4 so-called bins covering the range from 0 to 4.

# In[6]:
hist = ROOT.TH1F("variable","Example plot: Number of leptons",4,0,4)
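How values are assigned to the 4 bins can be sketched without ROOT. The function below is a hypothetical helper, not a ROOT call. It follows ROOT's numbering convention for fixed-width bins: bin 0 collects underflow, bins 1 to `nbins` cover the range with the lower edge included and the upper edge excluded, and bin `nbins + 1` collects overflow.

```python
def bin_index(x, nbins, low, high):
    """Return the bin number for x, following ROOT's numbering convention."""
    if x < low:
        return 0  # underflow
    if x >= high:
        return nbins + 1  # overflow
    width = (high - low) / float(nbins)
    # Clamp to nbins to protect against floating-point rounding near `high`.
    return min(int((x - low) / width) + 1, nbins)


# Same binning as the histogram above: 4 bins from 0 to 4.
for n_leptons in [0, 1, 2, 3, 4]:
    print(n_leptons, "->", bin_index(n_leptons, 4, 0, 4))
```

With this binning, an event with exactly 4 leptons falls into the overflow bin, because the upper edge of the last bin is excluded.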

Cell #7

The following lines are a loop that goes over the data stored in the tree and fills the histogram _hist_ that we already defined. In this first notebook we don't apply any cuts, to keep it simple. Accordingly, the loop fills the histogram once for each event stored in the tree. After the program has looped over all the data it prints the word __Done!__.

# In[7]:
for event in tree:
    # lep_n is the number of preselected leptons in the current event
    hist.Fill(event.lep_n)

print("Done!")
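The loop above can be mimicked in plain Python to see what filling does. This is a sketch with invented data: `toy_events` is a hypothetical stand-in for the tree, and a list of counters stands in for the histogram (values outside the range are simply skipped here instead of going to under- or overflow bins).

```python
# Invented lep_n values, one dictionary per event.
toy_events = [{"lep_n": 2}, {"lep_n": 1}, {"lep_n": 2}, {"lep_n": 3}, {"lep_n": 2}]

# One counter per bin: counts[k] holds the number of events with lep_n == k.
counts = [0, 0, 0, 0]

for event in toy_events:
    n = event["lep_n"]
    if 0 <= n < len(counts):  # skip values outside the histogram range
        counts[n] += 1

print(counts)
print("Done!")
```

Each pass through the loop adds one to exactly one counter, so the counters sum to the number of events that fall inside the range.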

Cell #8

After filling the histogram we want to see the results of the analysis. First we draw the histogram on the canvas and then the canvas on which the histogram lies:

# In[8]:
hist.Draw()
canvas.Draw()

Cell #9

...

# In[9]:
scale = hist.Integral()
hist.Scale(1/scale)
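Cell #9 divides every bin by the histogram's integral. The effect can be sketched in plain Python. The bin contents are invented and `normalise` is a hypothetical helper, not a ROOT function. The guard against an empty histogram is an addition of this sketch.

```python
def normalise(counts):
    """Scale bin contents so that they sum to 1."""
    total = float(sum(counts))  # plays the role of hist.Integral()
    if total == 0.0:
        return list(counts)  # nothing to scale in an empty histogram
    return [c / total for c in counts]  # plays the role of hist.Scale(1/scale)


fractions = normalise([0, 1, 3, 1])
print(fractions)
```

After this step each bin shows the fraction of events with that number of leptons rather than the raw count.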

Cell #10

...

# In[10]:
hist.Draw()
canvas.Draw()

Cell #11

Description of the Variables inside the _mini_ tree in the ATLAS Open Data samples

| # | branchname | type | description |
| :-----: | ------------- | :-------------: | :-----: |
| 01 | runNumber | int | runNumber |
| 02 | eventNumber | int | eventNumber |
| 03 | channelNumber | int | channelNumber |
| 04 | lbNumber | int | lbNumber |
| 05 | rndRunNumber | int | randomized run number mimicking run number distribution in data |
| 06 | mu | float | average interactions per bunch crossing |
| 07 | mcWeight | float | weight of an MC event |
| 08 | pvxp_n | int | number of primary vertices |
| 09 | isGoodEvent | int | summary of diverse quality flags like hfor |
| 10 | scaleFactor | float | overall scale factor for the preselected event |
| 11 | trigE | bool | boolean whether a standard trigger has fired in the egamma stream |
| 12 | trigM | bool | boolean whether a standard trigger has fired in the muon stream |
| 13 | passGRL | bool | signifies whether the event passes the GRL (may be put in isGoodEvent) |
| 14 | hasGoodVertex | bool | signifies whether the event has at least one good vertex |
| 15 | lep_n | int | number of preselected leptons |
| 16 | lep_truthMatched | vector | boolean indicating whether the lepton is matched to a truth lepton |
| 17 | lep_trigMatched | vector | boolean signifying whether the lepton is the one triggering the event |
| 18 | lep_pt | vector | transverse momentum of the lepton |
| 19 | lep_eta | vector | pseudorapidity of the lepton |
| 20 | lep_phi | vector | azimuthal angle of the lepton |
| 21 | lep_E | vector | energy of the lepton |
| 22 | lep_z0 | vector | z-coordinate of the track associated to the lepton with respect to the primary vertex |
| 23 | lep_charge | vector | charge of the lepton |
| 24 | lep_isTight | vector | boolean indicating whether the lepton is of tight quality |
| 25 | lep_flag | vector | bitmask implementing object cuts of the top group |
| 26 | lep_type | vector | number signifying the lepton type (e, mu, tau) of the lepton |
| 27 | lep_ptcone30 | vector | ptcone30 isolation for the lepton |
| 28 | lep_etcone20 | vector | etcone20 isolation for the lepton |
| 29 | lep_trackd0pvunbiased | vector | d0 of the track associated to the lepton at the point of closest approach (p.o.a.) |
| 30 | lep_tracksigd0pvunbiased | vector | d0 significance of the track associated to the lepton at the p.o.a. |
| 31 | met_et | float | transverse energy of the missing momentum vector |
| 32 | met_phi | float | azimuthal angle of the missing momentum vector |
| 33 | jet_n | int | number of selected jets |
| 34 | jet_pt | vector | transverse momentum of the jet |
| 35 | jet_eta | vector | pseudorapidity of the jet |
| 36 | jet_phi | vector | azimuthal angle of the jet |
| 37 | jet_E | vector | energy of the jet |
| 38 | jet_m | vector | invariant mass of the jet |
| 39 | jet_jvf | vector | JetVertexFraction of the jet |
| 40 | jet_flag | vector | bitmask implementing object cuts of the top group |
| 41 | jet_trueflav | vector | true flavor of the jet |
| 42 | jet_truthMatched | vector | information whether the jet matches a jet on truth level |
| 43 | jet_SV0 | vector | SV0 weight of the jet |
| 44 | jet_MV1 | vector | MV1 weight of the jet |
| 45 | scaleFactor_BTAG | float | scale factor for b-tagging |
| 46 | scaleFactor_ELE | float | scale factor for electron efficiency |
| 47 | scaleFactor_JVFSF | float | scale factor for jet vertex fraction |
| 48 | scaleFactor_MUON | float | scale factor for muon efficiency |
| 49 | scaleFactor_PILEUP | float | scale factor for pileup reweighting |
| 50 | scaleFactor_TRIGGER | float | scale factor for trigger |
| 51 | scaleFactor_ZVERTEX | float | scale factor for z-vertex reweighting |