! Read the instructions first !
In this exercise we will learn how to use Python to plot a histogram and analyse it. We use a dataset collected by the CMS detector in 2011. The data consists of events in which two muons were created in a collision. We want to plot a histogram of the invariant mass of the two muons created in the collision. Events that meet specific selection criteria have been collected in the CSV file Ymumu_Run2011A.csv, which we use here.
Explore the code cells below and run each one by clicking the cell to make it active and then pressing the Run button. In the code cells, lines that begin with the # character are comment lines, which tell you what the code does. The other lines are code lines. To gain a better understanding of the code, read the comment lines carefully.
First, run through the demonstration cells. After that you can try to write code yourself and plot a histogram from a different dataset.
Note that you can also modify text cells (Markdown cells) by double-clicking them. Text cells support markdown and HTML text. After you are done modifying a text cell, you can run it just like a code cell.
 CMS collaboration (2016). DoubleMu primary dataset in AOD format from RunA of 2011 (/DoubleMu/Run2011A-12Oct2013-v1/AOD). CERN Open Data Portal. DOI: 10.7483/OPENDATA.CMS.RZ34.QR6N.
 Thomas McCauley (2016). Ymumu. Jupyter Notebook file. https://github.com/tpmccauley/cmsopendata-jupyter/blob/hst-0.1/Ymumu.ipynb.
# First we need to import the modules needed to do data-analysis using Python.
# Pandas-module is needed to read the csv-file.
# Matplotlib.pylab-module is needed to make plots.
# Modules are named as pd and plt so that we don't have to write the whole name in the future.
import pandas as pd
import matplotlib.pylab as plt
# Read the file "Ymumu_Run2011A.csv" and save the content to a variable called "dataset"
dataset = pd.read_csv("https://raw.githubusercontent.com/cms-opendata-education/cms-jupyter-materials-english/master/Data/Ymumu_Run2011A.csv")

# Let's see what our datafile contains. You can read more lines by inserting a number in
# parenthesis. The default is 5 lines.
dataset.head()
# We want to make a histogram of the invariant masses, which can be found in the column "M".
# Let's save the column "M" from the "dataset" to the variable "invariant_mass".
invariant_mass = dataset['M']

# Furthermore let's see how many values of invariant masses we have in the variable "invariant_mass"
print(len(invariant_mass))
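For background, an invariant mass like the values stored in the column "M" can be computed from the energies and momenta of the two muons. The sketch below assumes natural units (c = 1, with energies and momenta in GeV) and uses the standard relation M² = (E1 + E2)² − |p1 + p2|². The function name `invariant_mass_of_pair` and its argument names are hypothetical. They are chosen for illustration only and are not taken from the dataset or the original notebook.

```python
import math


def invariant_mass_of_pair(e1, px1, py1, pz1, e2, px2, py2, pz2):
    """Invariant mass of two particles, assuming natural units (c = 1)."""
    # Add up the energies and the momentum components of the two particles.
    energy = e1 + e2
    px = px1 + px2
    py = py1 + py2
    pz = pz1 + pz2
    mass_squared = energy**2 - (px**2 + py**2 + pz**2)
    # Rounding errors can push mass_squared slightly below zero, so clamp it.
    return math.sqrt(max(mass_squared, 0.0))


# Two massless particles flying back to back, each with 5 GeV of energy.
print(invariant_mass_of_pair(5.0, 5.0, 0.0, 0.0, 5.0, -5.0, 0.0, 0.0))  # prints 10.0
```

In the back-to-back case the momenta cancel, so the invariant mass equals the total energy of 10 GeV.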
Now we can create and plot the histogram from the values of the invariant masses that we got. A histogram consists of multiple bins. Each bin shows how many events occurred in the corresponding range of invariant mass. Note that we will use a total of 500 bins in the histogram, so it is hard to see the separate bins because there are so many of them. You can also change the number of bins and see what the figure looks like.
# Plot the histogram with the function hist() of the matplotlib.pylab-module
# "bins" determines the number of the bins used in the histogram.
plt.hist(invariant_mass, bins=500)

# Name the axes and give the title.
plt.xlabel('Invariant mass [GeV]')
plt.ylabel('Number of events')
plt.title('The histogram of the invariant masses of two muons')

# Show the plot.
plt.show()
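To make the idea of bins more concrete, here is a minimal sketch of the counting that sits behind a histogram: the range from the smallest to the largest value is split into equal-width bins, and each value adds one to the count of its bin. The helper `count_in_bins` and the list `masses` are hypothetical and made up for illustration. This is not how matplotlib implements `hist()` internally, and the numbers are not CMS data.

```python
def count_in_bins(values, bins):
    """Count how many values fall into each of `bins` equal-width bins.

    Assumes the values are not all identical, so the bin width is non-zero.
    """
    low, high = min(values), max(values)
    width = (high - low) / bins
    counts = [0] * bins
    for value in values:
        # The largest value lies on the upper edge and is put in the last bin.
        index = min(int((value - low) / width), bins - 1)
        counts[index] += 1
    return counts


# Hypothetical invariant masses in GeV, invented for this illustration.
masses = [8.0, 9.4, 9.5, 9.5, 10.0, 10.1, 11.2, 12.0]

print(count_in_bins(masses, bins=2))  # prints [4, 4]
print(count_in_bins(masses, bins=4))  # prints [1, 3, 2, 2]
```

The same eight values give a different picture depending on the number of bins, which is why it is worth trying several values of `bins` when plotting.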
The following datafile contains data from events similar to the ones we already plotted, but it covers a wider range of invariant masses. Use this datafile to make a histogram similar to the one in the demonstration.
You can pass the file URL directly to the pd.read_csv("filepath") function.
plt.hist(variable_to_plot, bins=number, range=(low limit, high limit))
For example, you can write plt.hist(variable_to_plot, bins=number, range=(2.5, 3.5)) to zoom in on the values of invariant mass between 2.5 GeV and 3.5 GeV and see the $J/\psi$ particle.
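The effect of the `range` argument can be tried out without any datafile. The sketch below builds a synthetic list of masses (a flat background plus a narrow bump near 3.1 GeV) and zooms in on it. The data and the name `toy_masses` are assumptions made for illustration and are not real CMS measurements. The block imports `matplotlib.pyplot`, which provides the same `hist()` function as the `plt` alias used above.

```python
import random

import matplotlib.pyplot as plt

# Synthetic stand-in for invariant masses, NOT real CMS data:
# a flat background between 2 and 5 GeV plus a narrow bump near 3.1 GeV.
random.seed(0)
background = [random.uniform(2.0, 5.0) for _ in range(2000)]
bump = [random.gauss(3.1, 0.03) for _ in range(500)]
toy_masses = background + bump

# Only values between 2.5 GeV and 3.5 GeV are sorted into the 50 bins.
# Values outside that range are ignored.
counts, edges, _ = plt.hist(toy_masses, bins=50, range=(2.5, 3.5))

plt.xlabel('Invariant mass [GeV]')
plt.ylabel('Number of events')
plt.title('Toy data between 2.5 GeV and 3.5 GeV')
plt.show()
```

`plt.hist()` also returns the bin counts and the bin edges, so `counts` and `edges` can be inspected to check that the first edge is 2.5 and the last edge is 3.5.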