Df 0 0 6_Ranges

This tutorial shows how to express the concept of ranges when working with the RDataFrame.

Author: Danilo Piparo
This notebook tutorial was automatically generated with ROOTBOOK-izer from the macro found in the ROOT repository on Sunday, January 19, 2020 at 12:50 AM.

In [1]:
import ROOT

def fill_tree(treeName, fileName):
    tdf = ROOT.ROOT.RDataFrame(100)
    tdf.Define("b1", "(int) tdfentry_")\
       .Define("b2", "(float) tdfentry_ * tdfentry_").Snapshot(treeName, fileName)
Welcome to JupyROOT 6.19/01

We prepare an input tree to run on

In [2]:
fileName = "df006_ranges_py.root"
treeName = "myTree"

fill_tree(treeName, fileName)

We read the tree from the file and create a RDataFrame.

In [3]:
d = RDF(treeName, fileName)

Usage of ranges

Now we'll count some entries using ranges

In [4]:
c_all = d.Count()

This is how you can express a range of the first 30 entries

In [5]:
d_0_30 = d.Range(0, 30)
c_0_30 = d_0_30.Count()

This is how you pick all entries from 15 onwards

In [6]:
d_15_end = d.Range(15, 0)
c_15_end = d_15_end.Count()

We can use a stride too, in this case we pick an event every 3

In [7]:
d_15_end_3 = d.Range(15, 0, 3)
c_15_end_3 = d_15_end_3.Count()

The Range is a 1st class citizen in the RDataFrame graph: not only actions (like Count) but also filters and new columns can be added to it.

In [8]:
d_0_50 = d.Range(0, 50)
c_0_50_odd_b1 = d_0_50.Filter("1 == b1 % 2").Count()

An important thing to notice is that the counts of a filter are relative to the number of entries a filter "sees". Therefore, if a Range depends on a filter, the Range will act on the entries passing the filter only.

In [9]:
c_0_3_after_even_b1 = d.Filter("0 == b1 % 2").Range(0, 3).Count()

Ok, time to wrap up: let's print all counts!

In [10]:
print("Usage of ranges:")
print(" - All entries:", c_all.GetValue())
print(" - Entries from 0 to 30:", c_0_30.GetValue())
print(" - Entries from 15 onwards:", c_15_end.GetValue())
print(" - Entries from 15 onwards in steps of 3:", c_15_end_3.GetValue())
print(" - Entries from 0 to 50, odd only:", c_0_50_odd_b1.GetValue())
print(" - First three entries of all even entries:", c_0_3_after_even_b1.GetValue())
Usage of ranges:
(' - All entries:', 100L)
(' - Entries from 0 to 30:', 30L)
(' - Entries from 15 onwards:', 85L)
(' - Entries from 15 onwards in steps of 3:', 28L)
(' - Entries from 0 to 50, odd only:', 25L)
(' - First three entries of all even entries:', 3L)