This is the first part of a tutorial describing the Iteration/Expression Tree (IET), an intermediate representation based on Abstract Syntax Trees (AST) used in the Devito compiler. In this part we will show how to access and navigate the IET rooted in an Operator
.
The reader of this tutorial is expected to be familiar with the basic Devito API (i.e., Grid
, Function
/TimeFunction
, Operator
, ...). Otherwise, the CFD tutorials in examples/cfd
are a better place to start.
Let's start with a fairly trivial example.
from devito import Eq, Grid, TimeFunction, Operator
grid = Grid(shape=(3, 3))
u = TimeFunction(name='u', grid=grid)
u.data
Data([[[0., 0., 0.], [0., 0., 0.], [0., 0., 0.]], [[0., 0., 0.], [0., 0., 0.], [0., 0., 0.]]], dtype=float32)
Here, we just defined a time-varying grid with 3x3 points in each of the space Dimension
s x and y. u.data
gives us access to the actual data values. In particular, u.data[0, :, :]
holds the data values at timestep t=0
, whereas u.data[1, :, :]
holds the data values at timestep t=1
.
We now create an Operator
that increments by 1 all points in the computational domain.
Note: User should not change to configuration['openmp'] = 1
as the IET structure changes with OpenMP enabled.
from devito import configuration
configuration['openmp'] = 0 # We don't care about OpenMP in this example
eq = Eq(u.forward, u+1)
op = Operator(eq)
An Operator
is an IET node that can generate, JIT-compile, and run low-level code (e.g., C). Just like all other types of IET nodes, it's got a number of metadata attached. For example, we can query an Operator
to retrieve the user-provided SymPy expressions.
op.args['expressions']
Eq(u(t + dt, x, y), u(t, x, y) + 1)
If we print op
, we can see how the generated code looks like.
print(op)
#define _POSIX_C_SOURCE 200809L #include "stdlib.h" #include "math.h" #include "sys/time.h" #include "xmmintrin.h" #include "pmmintrin.h" struct profiler { double section0; } ; int Kernel(float *restrict u_vec, const int time_M, const int time_m, struct profiler* timers, const int x_M, const int x_m, const int x_size, const int y_M, const int y_m, const int y_size) { float (*restrict u)[x_size + 1 + 1][y_size + 1 + 1] __attribute__((aligned(64))) = (float (*)[x_size + 1 + 1][y_size + 1 + 1]) u_vec; /* Flush denormal numbers to zero in hardware */ _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON); _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON); for (int time = time_m, t0 = (time)%(2), t1 = (time + 1)%(2); time <= time_M; time += 1, t0 = (time)%(2), t1 = (time + 1)%(2)) { struct timeval start_section0, end_section0; gettimeofday(&start_section0, NULL); for (int x = x_m; x <= x_M; x += 1) { #pragma omp simd for (int y = y_m; y <= y_M; y += 1) { u[t1][x + 1][y + 1] = u[t0][x + 1][y + 1] + 1; } } gettimeofday(&end_section0, NULL); timers->section0 += (double)(end_section0.tv_sec-start_section0.tv_sec)+(double)(end_section0.tv_usec-start_section0.tv_usec)/1000000; } return 0; }
As shown in other tutorials, we can JIT-compile and run op
by executing:
op.apply(time=2)
u.data
Operator `Kernel` run in 0.00 s
Data([[[2., 2., 2.], [2., 2., 2.], [2., 2., 2.]], [[3., 3., 3.], [3., 3., 3.], [3., 3., 3.]]], dtype=float32)
This is the first time we are calling apply
on op
, so the Operator
code gets written in a .c
file and compiled into a library (a .so
shared object if on Linux).
op
runs with the user-provided time_M=2
and with the default arguments u=u, x_m=0, x_M=2, y_m=0, y_M=2,
and time_m=0
. The indices *_m
and *_M
represent the minimum and maximum iteration points along Dimension
*
.
op
can be used for a completely different run, for example providing a new TimeFunction
.
u2 = TimeFunction(name='u', grid=grid)
u2.data
Data([[[0., 0., 0.], [0., 0., 0.], [0., 0., 0.]], [[0., 0., 0.], [0., 0., 0.], [0., 0., 0.]]], dtype=float32)
Any of the Operator
default arguments may be replaced by passing suitable key-value parameters.
op.apply(u=u2, x_m=1, x_M=2, y_m=0, y_M=1, time_M=3)
u2.data
Operator `Kernel` run in 0.00 s
Data([[[0., 0., 0.], [4., 4., 0.], [4., 4., 0.]], [[0., 0., 0.], [3., 3., 0.], [3., 3., 0.]]], dtype=float32)
Note, however, that there is no need for recompilation. JIT compilation occurs only once, triggered by the first call to op.apply()
.
An Operator
is the root of an IET that typically consists of several nested Iteration
s and Expression
s – two other fundamental IET node types. The user-provided SymPy equations are wrapped within Expressions
. Loop nest embedding such expressions are constructed by suitably nesting Iterations
.
The Devito compiler constructs the IET from a collection of Cluster
s, which represent a higher-level intermediate representation (not covered in this tutorial).
The Devito compiler also analyses the IET to determine key computational properties, such as sequential, parallel, and vectorizable. These properties are attached directly to the relevant IET nodes.
We can print the IET structure of an Operator
, as well as the attached computational properties, using the utility function pprint
.
from devito import pprint
pprint(op)
<Callable Kernel> <List (0, 2, 0)> <ArrayCast> <List (0, 2, 0)> <DenormalsMacro> <Element /* Flush denormal numbers to zero in hardware */> <Element _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);> <Element _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);> <List (0, 1, 0)> <[affine,sequential,wrappable] Iteration time::time::(time_m, time_M, 1)::(0, 0)> <TimedList (2, 1, 2)> <C.Statement struct timeval start_section0, end_section0;> <C.Statement gettimeofday(&start_section0, NULL);> <Section (1)> <[affine,parallel] Iteration x::x::(x_m, x_M, 1)::(0, 0)> <[affine,parallel,vector-dim] Iteration y::y::(y_m, y_M, 1)::(0, 0)> <ExpressionBundle (1)> <Expression u[t1, x + 1, y + 1] = u[t0, x + 1, y + 1] + 1> <C.Statement gettimeofday(&end_section0, NULL);> <C.Statement timers->section0 += (double)(end_section0.tv_sec-start_section0.tv_sec)+(double)(end_section0.tv_usec-start_section0.tv_usec)/1000000;>
In this example, op
is represented as a <Callable Kernel>
. Attached to it are metadata, such as _headers
and _includes
, as well as a body
, which includes the children IET nodes. Here, the body is a List
object.
op._headers
['#define _POSIX_C_SOURCE 200809L']
op._includes
['stdlib.h', 'math.h', 'sys/time.h', 'xmmintrin.h', 'pmmintrin.h']
op.body
(<List (0, 2, 0)>,)
We can inspect the List
to observe that its immediate children are an <ArrayCast>
and another <List>
. We can then proceed with the IET traversal until we locate the user-provided SymPy
equations.
print(op.body[0].body[0]) # Printing the ArrayCast
float (*restrict u)[x_size + 1 + 1][y_size + 1 + 1] __attribute__((aligned(64))) = (float (*)[x_size + 1 + 1][y_size + 1 + 1]) u_vec;
print(op.body[0].body[1]) # Printing the List
/* Flush denormal numbers to zero in hardware */ _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON); _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON); for (int time = time_m, t0 = (time)%(2), t1 = (time + 1)%(2); time <= time_M; time += 1, t0 = (time)%(2), t1 = (time + 1)%(2)) { struct timeval start_section0, end_section0; gettimeofday(&start_section0, NULL); for (int x = x_m; x <= x_M; x += 1) { #pragma omp simd for (int y = y_m; y <= y_M; y += 1) { u[t1][x + 1][y + 1] = u[t0][x + 1][y + 1] + 1; } } gettimeofday(&end_section0, NULL); timers->section0 += (double)(end_section0.tv_sec-start_section0.tv_sec)+(double)(end_section0.tv_usec-start_section0.tv_usec)/1000000; }
Below we access the Iteration
representing the time loop.
t_iter = op.body[0].body[1].body[1].body[0]
t_iter
<WithProperties[affine,sequential,wrappable]::Iteration time[t0,t1]; (time_m, time_M, 1)>
print(t_iter)
for (int time = time_m, t0 = (time)%(2), t1 = (time + 1)%(2); time <= time_M; time += 1, t0 = (time)%(2), t1 = (time + 1)%(2)) { struct timeval start_section0, end_section0; gettimeofday(&start_section0, NULL); for (int x = x_m; x <= x_M; x += 1) { #pragma omp simd for (int y = y_m; y <= y_M; y += 1) { u[t1][x + 1][y + 1] = u[t0][x + 1][y + 1] + 1; } } gettimeofday(&end_section0, NULL); timers->section0 += (double)(end_section0.tv_sec-start_section0.tv_sec)+(double)(end_section0.tv_usec-start_section0.tv_usec)/1000000; }
We can inspect the Iteration
to discover what its iteration bounds are.
t_iter.limits
(time_m, time_M, 1)
And as we keep going down through the IET, we can eventually reach the Expression
wrapping the user-provided SymPy equation.
expr = t_iter.nodes[0].body[0].body[0].nodes[0].nodes[0].body[0]
expr.view
'<Expression u[t1, x + 1, y + 1] = u[t0, x + 1, y + 1] + 1>'
Of course, there are mechanisms in place to, for example, find all Expression
s in a given IET. The Devito compiler has a number of IET visitors, among which FindNodes
, usable to retrieve all nodes of a particular type. So we easily
can get all Expression
s within op
as follows
from devito.ir.iet import Expression, FindNodes
exprs = FindNodes(Expression).visit(op)
exprs[0].view
'<Expression u[t1, x + 1, y + 1] = u[t0, x + 1, y + 1] + 1>'