df016_vecOps

Process collections in RDataFrame with the help of RVec.

This tutorial shows the potential of the VecOps approach for treating collections stored in datasets, a situation very common in HEP data analysis.

Author: Danilo Piparo (CERN)
This notebook tutorial was automatically generated with ROOTBOOK-izer from the macro found in the ROOT repository on Thursday, June 24, 2021 at 07:14 AM.

In [1]:
#include <ROOT/RDataFrame.hxx>
#include <ROOT/RVec.hxx>
using ROOT::RDataFrame;
using namespace ROOT::VecOps;

We re-create a set of points in a square. This is a technical detail, just to create a dataset to play with!

In [2]:
auto unifGen = [](double) { return gRandom->Uniform(-1.0, 1.0); };
auto vGen = [&](int len) {
   RVec<double> v(len);
   std::transform(v.begin(), v.end(), v.begin(), unifGen);
   return v;
};
RDataFrame d(1024);
auto d0 = d.Define("len", []() { return (int)gRandom->Uniform(0, 16); })
   .Define("x", vGen, {"len"})
   .Define("y", vGen, {"len"});

We now have d0, an RDataFrame with two columns, x and y, which hold collections of coordinates. The size of these collections varies from entry to entry. Let's now define radii out of x and y. We'll do it by treating the collections stored in the columns as whole objects, without looping over the individual elements.

In [3]:
auto d1 = d0.Define("r", "sqrt(x*x + y*y)");
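
To make the element-wise semantics concrete, the sketch below spells out in plain C++ what the expression `sqrt(x*x + y*y)` computes for a single entry. It uses `std::vector` instead of `RVec` so that it compiles without ROOT, and the function name `radii` is hypothetical, not part of ROOT.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not a ROOT function: the hand-written loop that the
// RVec expression "sqrt(x*x + y*y)" replaces for a single entry.
std::vector<double> radii(const std::vector<double> &x, const std::vector<double> &y)
{
   // Both collections must describe the same set of points.
   assert(x.size() == y.size());
   std::vector<double> r(x.size());
   for (std::size_t i = 0; i < x.size(); ++i)
      r[i] = std::sqrt(x[i] * x[i] + y[i] * y[i]);
   return r;
}
```

For example, `radii({3.0, 0.0}, {4.0, 1.0})` yields the two radii 5 and 1. With RVec, the loop and the temporary buffer are hidden behind the arithmetic operators.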

Now we want to plot two quarters of a ring with inner radius .4 and outer radius .8. Note how the cuts are performed directly on RVecs, comparing them with numbers and combining the resulting masks with each other.

In [4]:
auto ring_h = d1.Define("rInFig", "r > .4 && r < .8 && x*y < 0")
                 .Define("yFig", "y[rInFig]")
                 .Define("xFig", "x[rInFig]")
                 .Histo2D({"fig", "Two quarters of a ring", 64, -1, 1, 64, -1, 1}, "xFig", "yFig");

auto cring = new TCanvas();
ring_h->DrawCopy("Colz");

return 0;
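
As a rough illustration of what the masked selection does, the sketch below re-implements it with explicit loops over `std::vector`, so it compiles without ROOT. `ringMask` and `applyMask` are hypothetical names. In the tutorial the same two steps are expressed by the string `r > .4 && r < .8 && x*y < 0` and by the indexing expressions `x[rInFig]` and `y[rInFig]`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: element-wise evaluation of
// "r > .4 && r < .8 && x*y < 0", producing one flag per point.
std::vector<bool> ringMask(const std::vector<double> &x, const std::vector<double> &y)
{
   assert(x.size() == y.size());
   std::vector<bool> mask(x.size());
   for (std::size_t i = 0; i < x.size(); ++i) {
      const double r = std::sqrt(x[i] * x[i] + y[i] * y[i]);
      mask[i] = r > 0.4 && r < 0.8 && x[i] * y[i] < 0;
   }
   return mask;
}

// Hypothetical helper: the analogue of v[mask], keeping only flagged elements.
std::vector<double> applyMask(const std::vector<double> &v, const std::vector<bool> &mask)
{
   assert(v.size() == mask.size());
   std::vector<double> out;
   for (std::size_t i = 0; i < v.size(); ++i)
      if (mask[i])
         out.push_back(v[i]);
   return out;
}
```

The condition `x*y < 0` holds only for points in the second and fourth quadrants, which is why two quarters of the ring survive the selection.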

Draw all canvases

In [5]:
gROOT->GetListOfCanvases()->Draw()