
Using the Python-NumPy Frontend in DaCe

In this tutorial, we will see how to write code using the Python-NumPy frontend. The frontend supports a subset of the Python language, together with array/matrix operations modeled on the NumPy module.

Let's start with a first example that showcases the basic elements of a DaCe program. First, we import the dace and numpy modules:

In [1]:

import dace
import numpy as np

Then, we declare the program parameters, which can be either symbols or constants:

In [2]:

M, N, K = 24, 24, 24

We proceed by writing the DaCe program as a regular Python function decorated with dace.program. The parameters of the function must have type annotations. For example, below we define a gemm function that implements the generalized matrix-matrix multiplication operation. The first three parameters are the 32-bit floating-point matrices A, B and C. The last two parameters are the 32-bit floating-point scalars alpha and beta. The body of the function is written the same way as in Python, using NumPy syntax.

In [3]:

@dace.program
def gemm(A: dace.float32[M, K], B: dace.float32[K, N], C: dace.float32[M, N],
         alpha: dace.float32, beta: dace.float32):
    C[:] = alpha * A @ B + beta * C

The [:] slice expression represents the whole range of the array/matrix. Note that DaCe does not allow data to be redefined within the same program: once the array C is defined, its name may not be assigned a different value. For example, if we changed the body of the gemm function to C = alpha * A @ B + beta * C, we would get an error.

The DaCe program may be parsed to an SDFG and/or compiled using the same methods as in the SDFG API:
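The difference between writing into an array and rebinding its name can be seen in plain NumPy, outside of DaCe. The sketch below is only an illustration under that assumption: the helper names gemm_inplace and gemm_rebind are hypothetical, and the code shows why the slice assignment is the form that updates the caller's C, not how DaCe detects or reports the error.

```python
import numpy as np

def gemm_inplace(A, B, C, alpha, beta):
    # Slice assignment writes the result into the buffer the caller passed in.
    C[:] = alpha * A @ B + beta * C

def gemm_rebind(A, B, C, alpha, beta):
    # Plain assignment only rebinds the local name C to a new array;
    # the caller's array is left untouched.
    C = alpha * A @ B + beta * C

rng = np.random.default_rng(0)
A = rng.random((4, 3), dtype=np.float32)
B = rng.random((3, 5), dtype=np.float32)
C0 = rng.random((4, 5), dtype=np.float32)
alpha, beta = np.float32(2.0), np.float32(0.5)
expected = alpha * (A @ B) + beta * C0

C_inplace = C0.copy()
gemm_inplace(A, B, C_inplace, alpha, beta)

C_rebind = C0.copy()
gemm_rebind(A, B, C_rebind, alpha, beta)
```

After both calls, C_inplace holds the GEMM result while C_rebind still holds its original contents.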

In [4]:

sdfg = gemm.to_sdfg()

In [5]:

sdfg

Out[5]:

Supported Python/NumPy operators

The frontend supports the unary operators {+, -, not, ~}. Note that the not operator works the same way as the numpy.logical_not function, that is, it is applied element-wise.

In [6]:

@dace.program
def uadd(A: dace.int64[5, 5], B: dace.int64[5, 5]):
    B[:] = +A

@dace.program
def usub(A: dace.int64[5, 5], B: dace.int64[5, 5]):
    B[:] = -A

@dace.program
def logicalnot(A: dace.bool[5, 5], B: dace.bool[5, 5]):
    B[:] = not A

@dace.program
def invert(A: dace.int64[5, 5], B: dace.int64[5, 5]):
    B[:] = ~A
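For comparison, the NumPy functions that these operators mirror can be tried directly. This is a NumPy-only sketch (no DaCe involved) with made-up input values; it contrasts logical negation with bitwise inversion and shows that Python's own not, outside a DaCe program, does not accept a multi-element array.

```python
import numpy as np

A_bool = np.array([[True, False], [False, True]])
A_int = np.array([[0, 1], [2, -1]], dtype=np.int64)

# Element-wise logical negation: the behavior the text attributes to `not`.
logical = np.logical_not(A_bool)

# Bitwise inversion: for signed integers, ~x equals -x - 1.
inverted = ~A_int

# Outside a DaCe program, Python's own `not` refuses multi-element arrays.
try:
    not A_bool
    plain_not_failed = False
except ValueError:
    plain_not_failed = True
```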

The frontend supports the binary operators {+, -, *, /, //, %, **, @, <<, >>, |, ^, &, and, or, ==, !=, <, <=, >, >=}. Note that the boolean operators {and, or} work the same way as the functions numpy.logical_and and numpy.logical_or. Apart from the matrix-multiplication operator @, all operators are point-wise. The return type of an operator is the one given by numpy.result_type. The exceptions are the boolean operators, which return boolean values.
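The promotion rule can be inspected with NumPy itself. The following is a small NumPy-only sketch with arbitrarily chosen dtypes and values; it queries numpy.result_type for a few type pairs and shows that the logical functions return booleans even for integer inputs.

```python
import numpy as np

# Type promotion for arithmetic operators, as computed by numpy.result_type.
int_float = np.result_type(np.int64, np.float32)
float_complex = np.result_type(np.float32, np.complex128)
same_type = np.result_type(np.float32, np.float32)

A = np.array([1, 0, 3], dtype=np.int64)
B = np.array([2, 0, 0], dtype=np.int64)

# The logical functions yield booleans regardless of the input types;
# NumPy comparisons do as well.
both = np.logical_and(A, B)
either = np.logical_or(A, B)
less = A < B
```

Here int_float is float64, float_complex is complex128, and same_type stays float32.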

Python augmented assignments are supported for all operators. Some examples follow:

In [7]:

@dace.program
def augfloordiv(A: dace.int64[5, 5], B: dace.int64[5, 5]):
    B //= A

@dace.program
def augmod(A: dace.int64[5, 5], B: dace.int64[5, 5]):
    B %= A

@dace.program
def augpow(A: dace.int64[5, 5], B: dace.int64[5, 5]):
    B **= A

@dace.program
def auglshift(A: dace.int64[5, 5], B: dace.int64[5, 5]):
    B <<= A
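As a point of reference, the same augmented assignments in plain NumPy update the left-hand array in place. The sketch below uses invented values and runs outside DaCe; the name buffer_before is a hypothetical second reference used to show that no new array is created.

```python
import numpy as np

A = np.array([2, 3, 4], dtype=np.int64)
B = np.array([17, 17, 17], dtype=np.int64)
buffer_before = B  # second reference to the same array object

B //= A   # [8, 5, 4]
B %= A    # [0, 2, 0]
B **= A   # [0, 8, 0]
B <<= A   # [0, 64, 0]
```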

Operations between arrays/matrices and scalars, or arrays of size 1, behave the same way as in NumPy:

In [8]:

@dace.program
def addscalar(A: dace.int64[5, 5], B: dace.int64, C: dace.int64[5, 5]):
    C[:] = A + B

@dace.program
def floordivnumber(A: dace.int64[5, 5], C: dace.int64[5, 5]):
    C[:] = A // 5
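The NumPy behavior referred to here is broadcasting: a scalar or a size-1 array is combined with every element of the larger operand. A minimal NumPy-only sketch, with illustrative values:

```python
import numpy as np

A = np.arange(25, dtype=np.int64).reshape(5, 5)
scalar = np.int64(10)
size_one = np.array([10], dtype=np.int64)

plus_scalar = A + scalar        # the scalar is added to every element
plus_size_one = A + size_one    # a size-1 array broadcasts the same way
floor_div = A // 5              # row i of the result contains only the value i
```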

Defining Data, Maps and Sequential Loops

Transient arrays can be defined with dace.define_local or just numpy.ndarray. Furthermore, transient scalars can be defined with dace.define_local_scalar.

In [9]:

@dace.program
def transient(A: dace.float32[M, N, K]):
    s = np.ndarray(shape=(M, N, K), dtype=np.int32)
    t = dace.define_local(A.shape, A.dtype)
    s[:] = A
    t[:] = A
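In plain NumPy, np.ndarray(shape=..., dtype=...) allocates storage without initializing it, so a local array defined this way should be written before it is read. The sketch below is NumPy-only and uses made-up values; it also shows what NumPy does when a float32 array is copied into an int32 array, as in the assignment to s above. Whether DaCe performs the identical conversion is an assumption not confirmed by this tutorial.

```python
import numpy as np

A = np.array([[1.5, -2.5], [3.0, 4.9]], dtype=np.float32)

# np.ndarray(shape=..., dtype=...) allocates without initializing, like np.empty.
s = np.ndarray(shape=A.shape, dtype=np.int32)
t = np.empty(A.shape, dtype=A.dtype)

s[:] = A  # float32 -> int32 conversion truncates toward zero in NumPy
t[:] = A  # same dtype: an exact copy
```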

Python for-loops are automatically converted to control-flow:

In [10]:

N, BS = (dace.symbol(name) for name in ['N', 'BS'])

@dace.program
def forloop(HD: dace.complex128[N, BS, BS], HE: dace.complex128[N, BS, BS],
             HF: dace.complex128[N, BS, BS],
             sigmaRSD: dace.complex128[N, BS, BS],
             sigmaRSE: dace.complex128[N, BS, BS],
             sigmaRSF: dace.complex128[N, BS, BS]):

    for n in range(N):
        if n < N - 1:
            HE[n] -= sigmaRSE[n]
        else:
            HE[n] = -sigmaRSE[n]
        if n > 0:
            HF[n] -= sigmaRSF[n]
        else:
            HF[n] = -sigmaRSF[n]
        HD[n] = HD[n] - sigmaRSD[n]
        
forloop.to_sdfg()

Out[10]:
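Because the body of forloop is ordinary Python, the same statements can be run by the interpreter on NumPy arrays to obtain reference results for small inputs. The following is a sketch under that assumption: forloop_reference and the sizes N_val and BS_val are hypothetical names chosen for illustration, and N is read from the array shape instead of a DaCe symbol.

```python
import numpy as np

def forloop_reference(HD, HE, HF, sigmaRSD, sigmaRSE, sigmaRSF):
    # Same statements as the DaCe program, executed by the Python interpreter.
    N = HD.shape[0]
    for n in range(N):
        if n < N - 1:
            HE[n] -= sigmaRSE[n]
        else:
            HE[n] = -sigmaRSE[n]
        if n > 0:
            HF[n] -= sigmaRSF[n]
        else:
            HF[n] = -sigmaRSF[n]
        HD[n] = HD[n] - sigmaRSD[n]

N_val, BS_val = 3, 2  # small illustrative sizes
shape = (N_val, BS_val, BS_val)
HD, HE, HF = (np.ones(shape, dtype=np.complex128) for _ in range(3))
sD, sE, sF = (np.full(shape, 0.25 + 0.5j) for _ in range(3))
forloop_reference(HD, HE, HF, sD, sE, sF)
```

With these inputs, only the last block of HE and the first block of HF take the negated value; every other block is decremented.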

Maps (parallel for-loops) can be created with dace.map:

In [11]:

Nkz, NE, Nqz, Nw, N3D, NA, NB, Norb = (
    dace.symbol(name)
    for name in ['Nkz', 'NE', 'Nqz', 'Nw',
                 'N3D', 'NA', 'NB', 'Norb'])

@dace.program
def maptest(neigh_idx: dace.int32[NA, NB],
            dH: dace.complex128[NA, NB, N3D, Norb, Norb],
            G: dace.complex128[Nkz, NE, NA, Norb, Norb],
            D: dace.complex128[Nqz, Nw, NA, NB, N3D, N3D],
            Sigma: dace.complex128[Nkz, NE, NA, Norb, Norb]):

    for k, E, q, w, i, j, a, b in dace.map[0:Nkz, 0:NE,
                                           0:Nqz, 0:Nw,
                                           0:N3D, 0:N3D,
                                           0:NA, 0:NB]:
        dHG = G[k-q, E-w, neigh_idx[a, b]] @ dH[a, b, i]
        dHD = dH[a, b, j] * D[q, w, a, b, i, j]
        Sigma[k, E, a] += dHG @ dHD
        
maptest.to_sdfg()

Out[11]:
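Conceptually, a map visits every index tuple in the Cartesian product of its ranges. The sketch below emulates that iteration space sequentially in plain Python; map_sequential is a hypothetical helper, not part of DaCe, and the fixed visiting order is an artifact of the emulation, since a parallel loop implies no particular order.

```python
import itertools
import numpy as np

def map_sequential(ranges, body):
    # Visit every index tuple of the iteration space once.
    # A parallel map gives no ordering guarantee; this sketch uses one fixed order.
    for idx in itertools.product(*(range(lo, hi) for lo, hi in ranges)):
        body(*idx)

A = np.arange(6, dtype=np.float64).reshape(2, 3)
row_sum = np.zeros(2)

def body(i, j):
    # Several (i, j) pairs update the same row_sum[i], like Sigma[k, E, a] above.
    row_sum[i] += 2.0 * A[i, j]

map_sequential([(0, 2), (0, 3)], body)
```

As in maptest, several iterations accumulate into the same output element, so the result must not depend on the order in which the iterations run. How DaCe combines such concurrent updates is not covered by this sketch.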

Combining explicit dataflow with NumPy

The Python-NumPy syntax can be used in combination with the explicit dataflow syntax:

In [12]:

N = dace.symbol('N')

@dace.program
def slicetest(A: dace.float64[N, N - 1], B: dace.float64[N - 1, N],
              C: dace.float64[N - 1, N - 1]):
    tmp = A[1:N] * B[:, 0:N - 1]
    for i, j in dace.map[0:4, 0:4]:
        with dace.tasklet:
            t << tmp[i, j]
            c >> C[i, j]
            c = t

In [13]:

@dace.program
def saoptest(A: dace.float64[5, 5], alpha: dace.float64,
             B: dace.float64[5, 5]):
    tmp = alpha * A * 5
    for i, j in dace.map[0:5, 0:5]:
        with dace.tasklet:
            t << tmp[i, j]
            c >> B[i, j]
            c = t
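One way to read the tasklet in saoptest is as a per-element read, compute, and write step. The following plain Python/NumPy emulation is a sketch of that reading, not of how DaCe executes it: saoptest_reference is a hypothetical name, and the two nested loops stand in for the map.

```python
import numpy as np

def saoptest_reference(A, alpha, B):
    tmp = alpha * A * 5            # NumPy-style part of the program
    for i in range(5):
        for j in range(5):
            t = tmp[i, j]          # t << tmp[i, j]: read one input element
            c = t                  # tasklet body
            B[i, j] = c            # c >> B[i, j]: write one output element

A = np.arange(25, dtype=np.float64).reshape(5, 5)
B = np.zeros((5, 5), dtype=np.float64)
saoptest_reference(A, 0.5, B)
```

After the call, B equals 0.5 * A * 5 element by element.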

Other operations and methods

Reductions can be defined with dace.reduce:

In [14]:

W = dace.symbol('W')
H = dace.symbol('H')

@dace.program(dace.float32[H, W], dace.float32[H, W], dace.float32[1])
def mapreduce_test(A, B, sum):
    tmp = dace.define_local([H, W], dace.float32)

    @dace.map(_[0:H, 0:W])
    def compute_tile(i, j):
        a << A[i, j]
        b >> B[i, j]
        t >> tmp[i, j]

        b = a * 5
        t = a * 5

    sum[:] = dace.reduce(lambda a, b: a + b, tmp)
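The reduction takes a binary function and folds it over the array. A plain Python analogue is functools.reduce; the sketch below assumes the reduction runs over every element of tmp, which the output shape [1] suggests but the text does not state. The input values and the name total are illustrative.

```python
from functools import reduce
import numpy as np

A = np.arange(12, dtype=np.float32).reshape(3, 4)
tmp = A * 5

total = np.zeros(1, dtype=np.float32)
# Fold the same lambda over all elements of tmp, in row-major order.
total[:] = reduce(lambda a, b: a + b, tmp.flat)
```

For an associative operation such as addition, this gives the same value as tmp.sum().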

The frontend also supports basic math functions such as {exp, sin, cos, sqrt, log, conj, real, imag}. See the documentation for details.
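For reference, the NumPy functions with these names behave as follows on small, made-up inputs. This sketch is NumPy-only; it assumes the frontend's functions follow the NumPy definitions, which should be checked against the DaCe documentation.

```python
import numpy as np

z = np.array([1 + 2j, 3 - 4j], dtype=np.complex128)
x = np.array([0.0, 1.0, 4.0])

conjugate = np.conj(z)                         # flips the sign of the imaginary part
real_part = np.real(z)                         # real components
imag_part = np.imag(z)                         # imaginary components
roots = np.sqrt(x)                             # element-wise square root
round_trip = np.log(np.exp(x))                 # log undoes exp
pythagoras = np.sin(x) ** 2 + np.cos(x) ** 2   # sin^2 + cos^2 = 1
```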