This is one of the 100 recipes of the IPython Cookbook, the definitive guide to high-performance scientific computing and data science in Python.
Every array has a number of dimensions, a shape, a data type, and strides. Strides are integers giving, for each dimension, the number of bytes to step through the contiguous block of memory to reach the next item along that dimension. The address of an item is the base address of the buffer plus a linear combination of its indices: the coefficients are the strides.
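As a minimal sketch of this address formula, the byte offset of an item can be computed from the strides and checked against NumPy's own indexing. The array arr and the index (2, 3) are illustrative choices, not part of the original recipe.

```python
import numpy as np

# Illustrative array; the dtype is fixed so that each item takes 8 bytes.
arr = np.arange(12, dtype=np.float64).reshape(3, 4)
i, j = 2, 3

# Byte offset of arr[i, j] from the start of the buffer:
# a linear combination of the indices, with the strides as coefficients.
offset = i * arr.strides[0] + j * arr.strides[1]

# Read one float64 at that offset directly from a copy of the raw bytes.
raw = arr.tobytes()
item = np.frombuffer(raw, dtype=np.float64, count=1, offset=offset)[0]

print(arr.strides)         # (32, 8)
print(offset)              # 88
print(item == arr[i, j])   # True
```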
import numpy as np
id = lambda x: x.__array_interface__['data'][0]
x = np.zeros(10); x.strides
This vector contains float64 (8 bytes) items: one needs to go 8 bytes forward to go from one item to the next.
y = np.zeros((10, 10)); y.strides
In the first dimension (vertical), one needs to go 80 bytes (10 float64 items) forward to go from one item to the next, because the items are internally stored in row-major order. In the second dimension (horizontal), one needs to go 8 bytes forward to go from one item to the next.
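To make the role of the storage order concrete, here is a small sketch comparing the strides of a row-major array, a column-major array, and a transposed view. The names y_c and y_f are introduced only for this illustration.

```python
import numpy as np

y_c = np.zeros((10, 10))              # row-major (C order), NumPy's default
y_f = np.zeros((10, 10), order='F')   # column-major (Fortran order)

print(y_c.strides)     # (80, 8)
print(y_f.strides)     # (8, 80)

# Transposing only swaps the strides; the data buffer is not touched.
print(y_c.T.strides)   # (8, 80)
```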
We create a new array pointing to the same memory block as the array a, but with a different shape. The strides are such that this array looks like it is a vertically tiled version of a. NumPy is tricked: it thinks b is a 2D n * n array with n^2 elements, whereas the data buffer really contains only n elements.
n = 1000; a = np.arange(n)
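# Use a.itemsize rather than a hard-coded 4: np.arange yields 8-byte integers on most 64-bit platforms.
b = np.lib.stride_tricks.as_strided(a, (n, n), (0, a.itemsize))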
b
b.size, b.shape, b.nbytes
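The following sketch checks, on a deliberately small example, that such a view really is a tiled version of the original vector while sharing its buffer. The names m, v, and tiled are hypothetical, and the writeable=False argument is an added precaution that assumes a NumPy version whose as_strided accepts it (1.12 or later).

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

m = 5                  # small size, chosen only for illustration
v = np.arange(m)

# Stride 0 along the first axis: every row re-reads the same m items.
# writeable=False guards against accidental writes through the view.
tiled = as_strided(v, shape=(m, m), strides=(0, v.itemsize), writeable=False)

print(np.shares_memory(v, tiled))                   # True
print(np.array_equal(tiled, np.tile(v, (m, 1))))    # True

# nbytes is computed from the shape, so it reports m times the real buffer size.
print(tiled.nbytes == m * v.nbytes)                 # True
```

Because many elements of such a view alias the same memory location, writing through it can give surprising results, which is why a read-only view is used here.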
%timeit b * b.T
This first version does not copy the input data, as b and b.T are arrays pointing to the same data buffer in memory, but with different strides. Only the n * n result is allocated.
%timeit np.tile(a, (n, 1)) * np.tile(a[:, np.newaxis], (1, n))
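As a sanity check, the sketch below compares the strided version and the tiled version on a small input and confirms that both produce the same products. The names m, v, and view are hypothetical, and np.outer is used here only as an independent reference, not as part of the original recipe.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

m = 4                  # small size, chosen only for illustration
v = np.arange(m)
view = as_strided(v, shape=(m, m), strides=(0, v.itemsize), writeable=False)

# Version 1: no copies of the input, only different strides.
strided = view * view.T

# Version 2: np.tile materializes two full m x m arrays first.
tiled = np.tile(v, (m, 1)) * np.tile(v[:, np.newaxis], (1, m))

print(np.array_equal(strided, tiled))            # True
print(np.array_equal(strided, np.outer(v, v)))   # True
```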
IPython Cookbook, by Cyrille Rossant, Packt Publishing, 2014 (500 pages).