We will now revisit the fhe first example of this tutorial with an example that is better suited to the numerical scheme used in Devito. As a reminder, the governing equation is:
$$\frac{\partial u}{\partial t}+c\frac{\partial u}{\partial x} + c\frac{\partial u}{\partial y} = 0$$We then discretized this using forward differences in time and backward differences in space:
$$u_{i,j}^{n+1} = u_{i,j}^n-c \frac{\Delta t}{\Delta x}(u_{i,j}^n-u_{i-1,j}^n)-c \frac{\Delta t}{\Delta y}(u_{i,j}^n-u_{i,j-1}^n)$$In the previous example, the system was initialised with a hat function. As easy as this example seems, it actually highlights a few limitations of finite differences and related methods:
In the remainder of this example, we will reproduce the results from the previous example, only this time with a smooth initial condition. This lets us observe Devito in a setting for which it is better equipped.
from examples.cfd import plot_field, init_hat, init_smooth
import numpy as np
%matplotlib inline
# Some variable declarations
nx = 81
ny = 81
nt = 100
c = 1.
dx = 2. / (nx - 1)
dy = 2. / (ny - 1)
sigma = .2
dt = sigma * dx
Let us now initialise the field with an infinitely smooth bump, as given by [J.A. Krakos (2012): Unsteady Adjoint Analysis for Output Sensitivity and Mesh Adaptation, PhD thesis, p. 68] as $$ f(r)= \begin{cases} \frac{1}{A}e^{-1/(r-r^2)} &\text{ for } 0 < r < 1,\\ 0 &\text{ else.} \end{cases} $$ We use this with $A=100$, and define the initial condition in two dimensions as $$u^0(x,y)=1+f\left(\frac{2}{3}x\right)*f\left(\frac{2}{3}y\right).$$
#NBVAL_IGNORE_OUTPUT
# Create field and assign initial conditions
u = np.empty((nx, ny))
init_smooth(field=u, dx=dx, dy=dy)
# Plot initial condition
plot_field(u,zmax=4)
Solving this will again move the bump.
# Repeat initialisation, so we can re-run the cell
init_smooth(field=u, dx=dx, dy=dy)
for n in range(nt + 1):
# Copy previous result into a new buffer
un = u.copy()
# Update the new result with a 3-point stencil
u[1:, 1:] = (un[1:, 1:] - (c * dt / dx * (un[1:, 1:] - un[1:, :-1])) -
(c * dt / dy * (un[1:, 1:] - un[:-1, 1:])))
# Apply boundary conditions
u[0, :] = 1.
u[-1, :] = 1.
u[:, 0] = 1.
u[:, -1] = 1.
#NBVAL_IGNORE_OUTPUT
# A small sanity check for auto-testing
assert (u[45:55, 45:55] > 1.8).all()
u_ref = u.copy()
plot_field(u, zmax=4.)
Hooray, the wave moved! It looks like the solver works much better for this example: The wave has not noticeably changed its shape.
Again, we can re-create this via a Devito operator. Let's fill the initial buffer with smooth data and look at it:
#NBVAL_IGNORE_OUTPUT
from devito import Grid, TimeFunction
grid = Grid(shape=(nx, ny), extent=(2., 2.))
u = TimeFunction(name='u', grid=grid)
init_smooth(field=u.data[0], dx=dx, dy=dy)
plot_field(u.data[0])
Allocating memory for u ((2, 81, 81))
We again create the discretised equation as shown below. Note that the equation is still the same, only the initial condition has changed.
from sympy import Eq
eq = Eq(u.dt + c*u.dxl + c*u.dyl)
print(eq)
Eq(1.0*u(t, x, y)/h_y - 1.0*u(t, x, y - h_y)/h_y + 1.0*u(t, x, y)/h_x - 1.0*u(t, x - h_x, y)/h_x - u(t, x, y)/dt + u(t + dt, x, y)/dt, 0)
SymPy can re-organise this equation just like in the previous example.
from sympy import solve
stencil = solve(eq, u.forward, rational=False)[0]
print(stencil)
(1.0*dt*h_x*(-u(t, x, y) + u(t, x, y - h_y)) + 1.0*dt*h_y*(-u(t, x, y) + u(t, x - h_x, y)) + h_x*h_y*u(t, x, y))/(h_x*h_y)
We can now use this stencil expression to create an operator to apply to our data object:
#NBVAL_IGNORE_OUTPUT
from devito import Operator
# Reset our initial condition in both buffers.
# This is required to avoid 0s propagating into
# our solution, which has a background value of 1.
init_smooth(field=u.data[0], dx=dx, dy=dy)
init_smooth(field=u.data[1], dx=dx, dy=dy)
# Create an operator that updates the forward stencil point
op = Operator(Eq(u.forward, stencil))
# Apply the operator for a number of timesteps
op(time=nt+1, dt=dt)
plot_field(u.data[0, :, :], zmax=4)
# Some small sanity checks for the testing framework
assert (u.data[0, 45:55, 45:55] > 1.8).all()
assert np.allclose(u.data[0], u_ref, rtol=3.e-2)
DSE: extract_time_invariants [flops: 14, elapsed: 0.01] >> eliminate_inter_stencil_redundancies [flops: 14, elapsed: 0.01] >> eliminate_intra_stencil_redundancies [flops: 14, elapsed: 0.01] >> factorize [flops: 13, elapsed: 0.01] >> finalize [flops: 13, elapsed: 0.00] [Total elapsed: 0.05 s] DLE: avoid_denormals [elapsed: 0.00] >> loop_fission [elapsed: 0.01] >> loop_blocking [elapsed: 0.04] >> simdize [elapsed: 0.41] >> create_elemental_functions [elapsed: 0.01] >> minimize_remainders [elapsed: 0.01] [Total elapsed: 0.47 s]
[dt, h_x, h_y, u(t, x, y), t, time, x, y, timers] [u, Depends: [(gets_value_from:u(t, x, y))], t_size, Depends: [(gets_value_from:t, Depends: [(gets_value_from:u, Depends: [(gets_value_from:u(t, x, y))])])], t_s, Depends: [(gets_value_from:t, Depends: [(gets_value_from:u, Depends: [(gets_value_from:u(t, x, y))])])], t_e, Depends: [(gets_value_from:t, Depends: [(gets_value_from:u, Depends: [(gets_value_from:u(t, x, y))])])], x_size, Depends: [(gets_value_from:x, Depends: [(gets_value_from:u, Depends: [(gets_value_from:u(t, x, y))])])], x_s, Depends: [(gets_value_from:x, Depends: [(gets_value_from:u, Depends: [(gets_value_from:u(t, x, y))])])], x_e, Depends: [(gets_value_from:x, Depends: [(gets_value_from:u, Depends: [(gets_value_from:u(t, x, y))])])], y_size, Depends: [(gets_value_from:y, Depends: [(gets_value_from:u, Depends: [(gets_value_from:u(t, x, y))])])], y_s, Depends: [(gets_value_from:y, Depends: [(gets_value_from:u, Depends: [(gets_value_from:u(t, x, y))])])], y_e, Depends: [(gets_value_from:y, Depends: [(gets_value_from:u, Depends: [(gets_value_from:u(t, x, y))])])], time_size, Depends: [(gets_value_from:time, Depends: [(gets_value_from:t, Depends: [(gets_value_from:u, Depends: [(gets_value_from:u(t, x, y))])])])], time_s, Depends: [(gets_value_from:time, Depends: [(gets_value_from:t, Depends: [(gets_value_from:u, Depends: [(gets_value_from:u(t, x, y))])])])], time_e, Depends: [(gets_value_from:time, Depends: [(gets_value_from:t, Depends: [(gets_value_from:u, Depends: [(gets_value_from:u(t, x, y))])])])], dt, Depends: [(gets_value_from:dt)], h_x, Depends: [(gets_value_from:h_x)], h_y, Depends: [(gets_value_from:h_y)], timers, Depends: [(gets_value_from:timers)]]
GNUCompiler: compiled /var/folders/3l/2zfpyh3n5c357t10vpx38ygm0000gn/T/devito-o8vnu98k/8337b06d55d7f91588e5fc6a92c23ac017c530de.c [0.47 s] ========================================================================================= Section main<100,80,80> with OI=1.57 computed in 0.006 s [1.46 GFlops/s, 0.11 GPts/s] =========================================================================================
Again, this looks just like the result from NumPy. Since this example is just like the one before, the low-level treatment of boundaries is also unchanged.
#NBVAL_IGNORE_OUTPUT
# Reset our data field and ICs in both buffers
init_smooth(field=u.data[0], dx=dx, dy=dy)
init_smooth(field=u.data[1], dx=dx, dy=dy)
# For defining BCs, we want to explicitly set rows/columns in our field
# We can use Devito's "indexed" notation to do this:
x, y = grid.dimensions
t = grid.stepping_dim
bc_left = Eq(u.indexed[t + 1, 0, y], 1.)
bc_right = Eq(u.indexed[t + 1, nx-1, y], 1.)
bc_top = Eq(u.indexed[t + 1, x, ny-1], 1.)
bc_bottom = Eq(u.indexed[t + 1, x, 0], 1.)
# Now combine the BC expressions with the stencil to form operator
expressions = [Eq(u.forward, stencil)]
expressions += [bc_left, bc_right, bc_top, bc_bottom]
op = Operator(expressions=expressions, dle=None, dse=None) # <-- Turn off performance optimisations
op(time=nt+1, dt=dt)
plot_field(u.data[0, :, :])
# Some small sanity checks for the testing framework
assert (u.data[0, 45:55, 45:55] > 1.8).all()
assert np.allclose(u.data[0], u_ref, rtol=3.e-2)
[dt, h_x, h_y, u(t, x, y), t, time, x, y, timers] [u, Depends: [(gets_value_from:u(t, x, y))], t_size, Depends: [(gets_value_from:t, Depends: [(gets_value_from:u, Depends: [(gets_value_from:u(t, x, y))])])], t_s, Depends: [(gets_value_from:t, Depends: [(gets_value_from:u, Depends: [(gets_value_from:u(t, x, y))])])], t_e, Depends: [(gets_value_from:t, Depends: [(gets_value_from:u, Depends: [(gets_value_from:u(t, x, y))])])], x_size, Depends: [(gets_value_from:x, Depends: [(gets_value_from:u, Depends: [(gets_value_from:u(t, x, y))])])], x_s, Depends: [(gets_value_from:x, Depends: [(gets_value_from:u, Depends: [(gets_value_from:u(t, x, y))])])], x_e, Depends: [(gets_value_from:x, Depends: [(gets_value_from:u, Depends: [(gets_value_from:u(t, x, y))])])], y_size, Depends: [(gets_value_from:y, Depends: [(gets_value_from:u, Depends: [(gets_value_from:u(t, x, y))])])], y_s, Depends: [(gets_value_from:y, Depends: [(gets_value_from:u, Depends: [(gets_value_from:u(t, x, y))])])], y_e, Depends: [(gets_value_from:y, Depends: [(gets_value_from:u, Depends: [(gets_value_from:u(t, x, y))])])], time_size, Depends: [(gets_value_from:time, Depends: [(gets_value_from:t, Depends: [(gets_value_from:u, Depends: [(gets_value_from:u(t, x, y))])])])], time_s, Depends: [(gets_value_from:time, Depends: [(gets_value_from:t, Depends: [(gets_value_from:u, Depends: [(gets_value_from:u(t, x, y))])])])], time_e, Depends: [(gets_value_from:time, Depends: [(gets_value_from:t, Depends: [(gets_value_from:u, Depends: [(gets_value_from:u(t, x, y))])])])], dt, Depends: [(gets_value_from:dt)], h_x, Depends: [(gets_value_from:h_x)], h_y, Depends: [(gets_value_from:h_y)], timers, Depends: [(gets_value_from:timers)]]
GNUCompiler: compiled /var/folders/3l/2zfpyh3n5c357t10vpx38ygm0000gn/T/devito-o8vnu98k/30ffde093b85ff2a5e4eebb1b74d7369ad2e8289.c [0.56 s] ========================================================================================= Section section_1<100,80> with OI=0.00 computed in 0.000 s [0.00 GFlops/s] Section section_2<100,80> with OI=0.00 computed in 0.000 s [0.00 GFlops/s] Section main<100,80,80> with OI=1.69 computed in 0.007 s [1.28 GFlops/s, 0.09 GPts/s] =========================================================================================
The C code of the Kernel is also still the same.
print(op.ccode)