Numba 0.41.0 Release Demo

This notebook contains a demonstration of new features present in the 0.41.0 release of Numba. Whilst release notes are produced as part of the CHANGE_LOG, there's nothing like seeing code in action!

Included are demonstrations of:

  • Initial support for Python 3 Unicode strings
  • Diagnostics showing the optimizations performed by ParallelAccelerator
  • Newly supported NumPy functions
  • Literal values support (for developers of Numba/Numba extensions)
  • Tracebacks from exceptions

First, import the necessary functions from Numba and NumPy...

In [ ]:
from numba import njit, config, __version__
from numba.extending import overload
import numpy as np
assert tuple(int(x) for x in __version__.split('.')[:2]) >= (0, 41)

Unicode strings

Initial support for Unicode strings has been implemented for Python versions >= 3.4. Fundamental string operations are supported, as are strings as arguments and return values. The next release of Numba will contain performance updates and additional features to further enhance string support.

In [ ]:
if config.PYVERSION >= (3, 4): # Only supported in Python >= 3.4
    @njit
    def strings_demo(str1, str2, str3):
        # strings, ---^  ---^   ---^
        # as arguments are now supported!
        # defining strings in compiled code also works
        def1 = 'numba is '
        # as do unicode strings
        def2 = '🐍⚡'
        # also string concatenation 
        print(str1 + str2)
        # comparison operations
        print(str1 == str2)
        print(str1 < str2)
        print(str1 <= str2)
        print(str1 > str2)
        print(str1 >= str2)
        # {starts,ends}with
        print(str1.startswith(str3), str2.endswith(str3))
        # len()
        print(len(str1), len(def2), len(str3))
        # str.find()
        print(str2.find(str3))
        # in
        print(str3 in str2)
        # slicing
        print(str2[1:], str1[:1])
        # and finally, strings can also be returned
        return '\nnum' + str1[1::-1] + def1[5:] + def2
    # run the demo
    print(strings_demo('abc', 'zba', 'a'))
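The value returned by strings_demo can be checked by hand in the interpreter. The sketch below is plain Python with no Numba involved, and the helper name expected_return is illustrative only. It shows that the negative-step slice reverses the first two characters, and that len() counts code points, so the two-emoji string has length 2.

```python
def expected_return(str1):
    """Rebuild the value returned by `strings_demo` using ordinary Python strings."""
    def1 = 'numba is '
    def2 = '\U0001F40D\u26A1'  # snake and high-voltage emoji, written as escapes
    # str1[1::-1] walks backwards from index 1, so 'abc' gives 'ba'
    # def1[5:] keeps everything from index 5 onwards, which is ' is '
    return '\nnum' + str1[1::-1] + def1[5:] + def2


print(expected_return('abc'))
print(len('\U0001F40D\u26A1'))  # length is measured in code points
```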

ParallelAccelerator Optimization Diagnostics

The ParallelAccelerator technology is used when the parallel=True kwarg is supplied to @jit. This technology is what transforms the decorated function into one that can run on multiple CPUs. Whilst parallel=True has been implemented for some time, the optimizations taking place have not been exposed in a manner that is easy to understand. Numba 0.41.0 contains the first cut of a new diagnostics tool that aims to help demystify what ParallelAccelerator does internally as it transforms the function to run in parallel!

Documentation for this feature, including an explanation of how to interpret the parallel diagnostics output, is available in the Numba documentation.

In [ ]:
from numba import prange # import parallel range

# decorate a function with `parallel=True` as usual
@njit(parallel=True)
def test(x):
    n = x.shape[0]
    a = np.sin(x)                      # parallel array expression
    b = np.cos(a * a)                  # parallel array expression
    acc = 0
    for i in prange(n - 2):            # user defined parallel loop
        for j in prange(n - 1):        # user defined parallel loop
            acc += b[i] + b[j + 1]     # parallel reduction
    return acc

# run the function
test(np.arange(10.))

# access the diagnostic output via the new `parallel_diagnostics` method on the dispatcher
test.parallel_diagnostics(level=4)
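As a sanity check on the reduction, the double loop can be collapsed by hand: each b[i] is added once per inner iteration, and each b[j + 1] once per outer iteration. The sketch below uses only the standard library; the helper names are illustrative and not part of Numba. Because a parallel reduction may reorder floating-point additions, it compares results with a tolerance rather than for exact equality.

```python
import math


def reduction_loops(x):
    """Serial, pure-Python version of the double loop in `test`."""
    n = len(x)
    b = [math.cos(math.sin(v) ** 2) for v in x]  # b = cos(sin(x) * sin(x))
    acc = 0.0
    for i in range(n - 2):
        for j in range(n - 1):
            acc += b[i] + b[j + 1]
    return acc


def reduction_closed_form(x):
    """Same sum with the loops collapsed algebraically."""
    n = len(x)
    b = [math.cos(math.sin(v) ** 2) for v in x]
    # b[i] appears (n - 1) times for i in [0, n - 2)
    # b[j + 1] appears (n - 2) times for j in [0, n - 1)
    return (n - 1) * sum(b[:n - 2]) + (n - 2) * sum(b[1:n])


x = [float(v) for v in range(10)]
print(math.isclose(reduction_loops(x), reduction_closed_form(x)))
```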

Newly supported NumPy functions

This release contains a number of newly supported NumPy functions:

  • Triangular matrix creation/manipulation: tri, tril, triu
  • Partitioning and element-wise difference computation: partition, ediff1d
  • Covariance: cov
  • NaN based reductions: nancumsum, nancumprod
  • Conjugation: conj, conjugate

In [ ]:
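@njit
def numpy_new():
    # create some simple array data for use in np.tril and np.triu
    a = np.arange(12.).reshape(3, 4)
    print('Input array:')
    print(a)
    # try out np.tri, np.triu, np.tril
    print('np.tri(3, 4)')
    print(np.tri(3, 4))
    print('np.triu(a, k=1)')
    print(np.triu(a, k=1))
    print('np.tril(a, k=-1)')
    print(np.tril(a, k=-1))
    # copy and shuffle the simple array data for use with np.partition, np.ediff1d and np.cov
    a_unordered = a.copy()
    np.random.shuffle(a_unordered.ravel())
    print('\nInput array:')
    print(a_unordered)
    # try out np.partition, np.ediff1d and np.cov
    print('np.partition(a_unordered, 0)')
    print(np.partition(a_unordered, 0))
    print('np.ediff1d(a_unordered)')
    print(np.ediff1d(a_unordered))
    print('np.cov(a_unordered)')
    print(np.cov(a_unordered))
    # create some data with NaN present to try out np.nancumsum and np.nancumprod
    a_w_nan = a.copy()
    a_w_nan.ravel()[::2] = np.nan
    print('\nInput array:')
    print(a_w_nan)
    # try out np.nancumsum and np.nancumprod
    print('np.nancumsum(a_w_nan)')
    print(np.nancumsum(a_w_nan))
    print('np.nancumprod(a_w_nan)')
    print(np.nancumprod(a_w_nan))
    # finally, create some data in the complex domain to try out np.conj and np.conjugate
    a_cmplx = a.copy() + a_unordered.copy() * 1j
    print('\nInput array:')
    print(a_cmplx)
    # try out np.conj and np.conjugate
    print('np.conj(a_cmplx)')
    print(np.conj(a_cmplx))
    print('np.conjugate(a_cmplx)')
    print(np.conjugate(a_cmplx))

numpy_new()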
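For readers unfamiliar with the NaN-aware reductions: np.nancumsum treats NaN as zero and np.nancumprod treats NaN as one. The pure-Python reference below is a sketch of that behaviour on a flat sequence. The function names are illustrative, and the code does not reflect how NumPy or Numba implement these functions.

```python
import math


def nancumsum_ref(values):
    """Cumulative sum over a flat sequence, treating NaN as 0."""
    total, out = 0.0, []
    for v in values:
        if not math.isnan(v):
            total += v
        out.append(total)
    return out


def nancumprod_ref(values):
    """Cumulative product over a flat sequence, treating NaN as 1."""
    total, out = 1.0, []
    for v in values:
        if not math.isnan(v):
            total *= v
        out.append(total)
    return out


data = [float('nan'), 1.0, float('nan'), 3.0]
print(nancumsum_ref(data))
print(nancumprod_ref(data))
```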

Literal value support

Numba 0.41.0 introduces a significant change to the typing system that aims to clean up the use of constants. The change takes the form of support for type-specific literal values in the type inference mechanism. Typing now makes two passes: the first treats anything that is a constant and can be expressed as a literal as such (integers, strings, slices and make_function are implemented as literals at present); the second uses the standard types for those constants. This permits, for example, value-based dispatch as demonstrated below, and also opens up many future possibilities in typing that were inaccessible before this change.

In [ ]:
from numba import generated_jit

@generated_jit(nopython=True)
def myoverload(arg):
    # `arg` is a Numba type here; literal types carry a `literal_value` attribute
    literal_val = getattr(arg, 'literal_value', None)
    if literal_val is not None:
        if literal_val == 100:
            def impl_1(arg):
                return 'dispatched: impl_1(literal, value 100)'
            return impl_1
        else:
            def impl_2(arg):
                return 'dispatched: impl_2(literal, value not 100)'
            return impl_2
    else:
        def impl_3(arg):
            return 'dispatched: impl_3(non-literal type)'
        return impl_3

@njit
def example(x):
    print(myoverload(100))         # literal value 100, dispatches impl_1
    print(myoverload(99))          # literal value 99, dispatches impl_2
    a = 50 + 25 + 2 * 10 + 15 // 3 # `a` is const expr value 100
    print(myoverload(a))           # `a` has literal value 100, dispatches impl_1
    b = 50 * x                     # `b` non-literal, it's an intp type
    print(myoverload(b))           # `b` non-literal intp, has no value, dispatches impl_3

example(2)
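The two-pass idea can be illustrated with a toy model in plain Python. Everything in this sketch (IntType, IntLiteral, resolve and the two typer functions) is hypothetical and exists only for illustration; it is not Numba's type system or its internal API. It shows why a second pass matters: code that only understands plain types still resolves after the literal attempt fails.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntType:
    """Hypothetical plain integer type with no value attached."""
    name: str = "intp"


@dataclass(frozen=True)
class IntLiteral:
    """Hypothetical literal type: an integer type that also carries its value."""
    literal_value: int

    def unliteral(self):
        return IntType()


def typer_value_aware(arg):
    """Accepts literal and plain types; uses the value when it is known."""
    value = getattr(arg, "literal_value", None)
    if value is None:
        return "non-literal"
    return "literal 100" if value == 100 else "literal, not 100"


def typer_plain_only(arg):
    """Only understands plain types, so it rejects literal types."""
    if not isinstance(arg, IntType):
        raise TypeError("literal types not understood")
    return "plain int"


def resolve(typer, arg):
    """First pass with the type as given, second pass with the plain type."""
    try:
        return typer(arg)
    except TypeError:
        return typer(arg.unliteral())


print(resolve(typer_value_aware, IntLiteral(100)))
print(resolve(typer_value_aware, IntLiteral(99)))
print(resolve(typer_value_aware, IntType()))
print(resolve(typer_plain_only, IntLiteral(100)))
```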


Exceptions with tracebacks relating to Python source

Finally (and left to last because an exception is raised!), tracebacks from exceptions raised in jitted code now contain a synthesized stack frame giving the location where the exception was raised. The stack frame is based on the Python source from which the function was compiled: it looks like a CPython traceback, but comes from compiled code! This makes exceptions easier to use in nopython mode, as it is now possible to find out where they were raised. Try commenting/uncommenting the @njit decorator and rerunning!

In [ ]:
@njit
def raise_exception(x):
    if x == 0:
        raise Exception('raised x==0. Also, exception arguments are correctly handled', 123, 4j)

raise_exception(0)
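To inspect an exception without stopping execution, it can be caught on the Python side and examined with the standard traceback module. The sketch below uses a hypothetical function, checked_reciprocal, and assumes a no-op fallback decorator when Numba is not installed, so it runs either way. The frames listed will differ between the compiled and interpreted cases.

```python
import traceback

try:
    from numba import njit
except ImportError:
    # Assumption for this sketch: without Numba, use a no-op decorator
    def njit(func):
        return func


@njit
def checked_reciprocal(x):
    # hypothetical example function, not part of the original demo
    if x == 0:
        raise ValueError('x must be non-zero', 123)
    return 1.0 / x


try:
    checked_reciprocal(0)
except ValueError as exc:
    caught = exc
    frames = traceback.extract_tb(exc.__traceback__)

print(caught.args)  # the arguments passed to the exception constructor
for frame in frames:
    print(frame.filename, frame.lineno, frame.name)
```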