This notebook demonstrates the new features in the 0.45.0 release of Numba. Whilst release notes are produced as part of the CHANGE_LOG, there's nothing like seeing code in action! Note that this release does not contain as much new user-facing functionality as usual, because a lot of necessary work was done on Numba internals instead.
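Included are demonstrations of the typed-list container, caching of @jit(parallel=True) functions, newly supported NumPy functions and a few miscellaneous features.
First, import the necessary items from Numba and NumPy...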
from numba import jit, njit, config, __version__, errors, types
from numba.extending import overload
import numpy as np
assert tuple(int(x) for x in __version__.split('.')[:2]) >= (0, 45)
As noted in the previous release notebook, Numba Version 0.44 deprecated a number of features and issued pending-deprecation notices for others. One of the deprecations with the highest impact was the pending-deprecation of reflection of List and Set types. The "typed-list" demonstrated herein is the replacement for the reflected list.
The first important thing to note about the typed-list is that it is instantiated (manually or through type inference) with a single fixed type; as a result, its items must be homogeneous and of that type. This is similar to the typed dictionary added in Numba Version 0.43. The typed-list documentation contains further notes and examples.
Demonstration of this feature starts with seeing how to change some code that would be impacted by the deprecation of the "reflected list":
@njit
def foo(x):
    x.append(10) # changes made here need "reflecting" back to `a` in the outer scope
a = [1, 2, 3]
foo(a)
This is the same functionality but using the new typed-list:
from numba.typed import List
@njit
def foo(x):
    x.append(10)
a = List() # Create a new typed-list
# Add the content to the typed-list, the list type is inferred from the items added
for x in [1, 2, 3]:
    a.append(x)
foo(a) # make the call
Taking a look at the output...
from numba import typeof
print(a) # The list looks like a "normal" python list
print(type(a)) # but is actually a Numba typed-list container
print(typeof(a)) # and it is type inferred as a `ListType[int64]` (a list of int64 items)
The typed-list behaves the same way both inside and outside of jitted functions; the usual list operations "just work"...
def list_demo(jitted, a):
    print("jitted: ", jitted)
    print("input :", a)
    a.pop()
    print("a.pop() :", a)
    a.extend(a)
    print("a.extend(a) :", a)
    a.reverse()
    print("a.reverse() :", a)
    print("slice a[::-2] :", a[::-2])
list_demo(False, a.copy()) # run the demo on a copy of 'a' in a pure python function
print("-" * 20)
njit(list_demo)(True, a.copy()) # run the demo on a copy of 'a' in a jit compiled function
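As a reference point for reading the output of the demo above, the same sequence of operations can be traced on a built-in Python list. This is only a sketch: it assumes the list starts as [1, 2, 3, 10] (the contents of `a` after the call to foo), and the `steps` variable is introduced here purely for illustration. The typed-list is expected to report the same values at each step.

```python
# Reference trace on a built-in list. The starting contents are an assumption:
# they mirror what `a` holds after foo(a) has appended 10.
a = [1, 2, 3, 10]
steps = []  # hypothetical helper variable: (label, snapshot) pairs

a.pop()                      # removes the trailing 10
steps.append(("a.pop()", list(a)))

a.extend(a)                  # appends a copy of the current contents
steps.append(("a.extend(a)", list(a)))

a.reverse()                  # reverses in place
steps.append(("a.reverse()", list(a)))

# every second element, walking backwards from the last one
steps.append(("slice a[::-2]", a[::-2]))

for label, value in steps:
    print("%-14s: %s" % (label, value))
```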
Further, typed lists can contain considerably more involved structures than those supported in the reflected list implementation. For example, this is a list-of-list-of-typed-dict being returned from a jitted function:
@njit
def complicated_list_structure():
    a = List()
    for x in range(4):
        tmp = List()
        for y in range(3):
            d = dict()
            d[x] = y
            tmp.append(d)
        a.append(tmp)
    return a
print(complicated_list_structure())
In the same manner as with numba.typed.Dict, it is also possible to instantiate a numba.typed.List instance with a specific type. This is useful when type inference cannot automatically infer the type of the list, for example, if type inference would need to cross a function call boundary. The following demonstrates:
@njit
def callee(a):
    a.append(1j) # the list is a complex128 type
@njit
def untyped_caller():
    x = List() # type of `x` cannot be inferred
    callee(x)
    return x
@njit
def typed_caller():
    x = List.empty_list(types.complex128) # type of `x` is specified
    callee(x)
    return x
# This fails...
try:
    untyped_caller()
except errors.TypingError as e:
    print("Caught error: %s" % e.msg)
# This works as expected...
print("Works fine: %s" % typed_caller())
Fortunately, thanks to some side effects of the typed-list implementation, performance is generally good and, in a number of use cases, excellent in comparison to the CPython interpreter. For example, racing a list append of all elements of a large array:
def interpreted_append(arr):
    a = []
    for x in arr:
        a.append(x)
    return a
@njit
def compiled_append(arr):
    a = List()
    for x in arr:
        a.append(x)
    return a
arr = np.random.random(int(1e6)) # array of 1e6 elements
assert interpreted_append(arr) == list(compiled_append(arr))
# Interpreter performance
interpreter = %timeit -o interpreted_append(arr)
# JIT compiled performance
jitted = %timeit -o compiled_append(arr)
print("Speed up: %sx" % np.round(interpreter.best/jitted.best, 1))
This races walking lists and accessing each element...
@njit
def walk(x):
    count = 0
    for v in x:
        if v == True:
            count += 1
    return count
arr = np.random.random(int(1e6)) < 0.5 # array of 1e6 True/False elements
typed_list = List()
for v in arr:
    typed_list.append(v)
builtin_list = list(arr)
# check the results
assert walk(typed_list) == walk.py_func(builtin_list) == walk.py_func(typed_list)
interpreter = %timeit -o walk.py_func(builtin_list)
jitted = %timeit -o walk(typed_list)
print("Speed up: %sx" % np.round(interpreter.best/jitted.best, 1))
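The %timeit magic used in the races above is only available in IPython/Jupyter. Outside of a notebook, a comparable best-of-N measurement can be sketched with the standard library's timeit module. In the sketch below `best_time` is a hypothetical helper, and `builtin_copy` is merely a stand-in for the faster contender so that the example runs without compiling anything; in practice the jitted function would take its place (after one warm-up call so that compilation time is not measured).

```python
import timeit


def best_time(func, *args, repeat=3, number=1):
    """Hypothetical helper: best time in seconds for a single call of func(*args)."""
    timings = timeit.repeat(lambda: func(*args), repeat=repeat, number=number)
    return min(timings) / number


def interpreted_append(values):
    # element-by-element append, as in the race above
    out = []
    for v in values:
        out.append(v)
    return out


def builtin_copy(values):
    # stand-in for the faster contender (an assumption for this sketch)
    return list(values)


data = list(range(100_000))
slow = best_time(interpreted_append, data)
fast = best_time(builtin_copy, data)
print("Speed up: %.1fx" % (slow / fast))
```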
Caching of @jit(parallel=True) functions
Whilst a small addition on the face of it, the ability to cache functions that are decorated with @jit(parallel=True) is a huge improvement for users of Numba's automatic parallelisation. The parallelisation compilation path is the most involved of all those in Numba, and being able to cache the compilation results for reuse should drastically improve start-up performance in certain applications. A quick demonstration:
@njit(parallel=True, cache=True)
def parallel():
    n = int(1e4)
    x = np.zeros((n, n))
    y = np.ones((n, n))
    a = x + y
    b = a * 2
    c = a - b
    d = c / y + np.sqrt(x)
    e = np.sin(d) ** 2 + np.cos(d) ** 2
    return e
parallel()
parallel.stats
This release contains a number of newly supported NumPy functions:
np.select
np.flatnonzero
np.bartlett, np.hamming, np.blackman, np.hanning, np.kaiser
@njit
def numpy_new():
    arr = np.array([[0, 2], [3, 0]])
    # np.select
    condlist = [arr == 0, arr != 0]
    choicelist = [arr ** 2, arr ** 3]
    print("np.select:\n", np.select(condlist, choicelist, 1))
    # np.flatnonzero
    print("np.flatnonzero:\n", np.flatnonzero(arr))
    # windowing functions...
    print("np.bartlett:\n", np.bartlett(5))
    print("np.blackman:\n", np.blackman(5))
    print("np.hamming:\n", np.hamming(5))
    print("np.hanning:\n", np.hanning(5))
    print("np.kaiser:\n", np.kaiser(5, 5))
numpy_new()
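For reference, the values that plain NumPy produces for the same inputs can be worked out by hand; the jitted versions are expected to agree with them. The sketch below uses NumPy only (no compilation) and the variable names `selected`, `nonzero` and `window` are introduced here for illustration.

```python
import numpy as np

arr = np.array([[0, 2], [3, 0]])

# np.select picks, per element, from the first choice whose condition holds.
# Here every element matches one of the two conditions, so the default (1)
# is never used: zeros take arr ** 2, non-zeros take arr ** 3.
condlist = [arr == 0, arr != 0]
choicelist = [arr ** 2, arr ** 3]
selected = np.select(condlist, choicelist, 1)
print(selected)   # [[ 0  8]
                  #  [27  0]]

# np.flatnonzero returns indices into the flattened array [0, 2, 3, 0].
nonzero = np.flatnonzero(arr)
print(nonzero)    # [1 2]

# A 5-point Hann window is symmetric, peaks at 1 and is 0 at both ends.
window = np.hanning(5)
print(window)
```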
Some new features were added that don't fit anywhere in particular but are still very useful. The range object now has accessible start, stop and step attributes and, as a follow-on piece of functionality, operator.contains (the in operator) now works with range. A quick demonstration:
@njit
def demo_range():
    myrange = range(5, 500, 27)
    print("start:", myrange.start)
    print("stop :", myrange.stop)
    print("step :", myrange.step)
    print(32 in myrange)
    print(7 in myrange)
demo_range()
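The membership results in the demo above can be checked by hand: a value belongs to a range with a positive step when it lies in [start, stop) and is a whole number of steps away from start. The sketch below runs in the plain CPython interpreter; `in_range` is a hypothetical helper written for illustration, covers positive steps only, and says nothing about how Numba implements the check internally.

```python
def in_range(value, start, stop, step):
    """Hypothetical helper: membership test for a range with a positive step."""
    return start <= value < stop and (value - start) % step == 0


r = range(5, 500, 27)
print(r.start, r.stop, r.step)      # 5 500 27

# 32 == 5 + 1 * 27, so it is a member; 7 is not a whole number of steps from 5.
print(32 in r, in_range(32, r.start, r.stop, r.step))   # True True
print(7 in r, in_range(7, r.start, r.stop, r.step))     # False False
```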
Also, the inspect_types method on the dispatcher now supports the signature kwarg, making it symmetric with the other inspect_* methods. As an example:
@njit
def add_one(x):
    return x + 1
add_one(1)
add_one(1.)
add_one(1j)
print("Known signatures:", add_one.signatures)
# show the types with respect to the zeroth signature
add_one.inspect_types(signature=add_one.signatures[0], pretty=True)