import numpy as np
NumPy presents an n-dimensional abstraction that has to be fitted into one-dimensional computer memory.
Even in two dimensions (matrices), this leads to confusion: should the entries be stored row by row (row-major) or column by column (column-major)?
A = np.arange(9).reshape(3, 3)
print(A)
[[0 1 2]
 [3 4 5]
 [6 7 8]]
How is this represented in memory?
A.strides
(24, 8)
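strides stores, for each axis, how many bytes one needs to jump to get from one entry to the next along that axis. Here each entry occupies 8 bytes, so moving to the next entry within a row is a jump of 8 bytes, and moving to the next row is a jump of 3 * 8 = 24 bytes.
We can also ask for Fortran (column-major) order: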
A2 = np.arange(9).reshape(3, 3, order="F")
A2
array([[0, 3, 6],
       [1, 4, 7],
       [2, 5, 8]])
numpy defaults to row-major order.
A2.strides
(8, 24)
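Both stride tuples can be checked by hand. The following is a minimal sketch, assuming 8-byte integers (the dtype is set explicitly, since the default integer size depends on the platform): the byte offset of entry (i, j) is i * strides[0] + j * strides[1]. The helper name byte_offset is made up for this illustration and is not part of numpy.

```python
import numpy as np


def byte_offset(arr, index):
    """Byte offset of arr[index] from the start of arr's data (hypothetical helper)."""
    return sum(i * s for i, s in zip(index, arr.strides))


A = np.arange(9, dtype=np.int64).reshape(3, 3)              # row-major (C order)
A2 = np.arange(9, dtype=np.int64).reshape(3, 3, order="F")  # column-major (Fortran order)

# Offset of entry (1, 2) under each layout.
off_c = byte_offset(A, (1, 2))   # 1*24 + 2*8
off_f = byte_offset(A2, (1, 2))  # 1*8 + 2*24
print(off_c, off_f)

# order="K" returns the elements in the order they occur in memory,
# so dividing the byte offset by the item size indexes into that sequence.
mem_c = A.ravel(order="K")
mem_f = A2.ravel(order="K")
print(mem_c[off_c // A.itemsize] == A[1, 2])
print(mem_f[off_f // A2.itemsize] == A2[1, 2])
```

Both layouts hold the same nine numbers in the same memory order; only the strides, and therefore the mapping from (i, j) to a memory position, differ.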
How is the stride model more general than just saying "row major" or "column major"?
A = np.arange(16).reshape(4, 4)
A
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])
A.strides
(32, 8)
Asub = A[:3, :3]
Asub
array([[ 0,  1,  2],
       [ 4,  5,  6],
       [ 8,  9, 10]])
Recall that Asub constitutes a view of the original data in A.
Asub.strides
(32, 8)
Now Asub is no longer a contiguous array!
In the linear-memory representation (as shown by the increasing numbers in A), the entries 3, 7, and 11 are missing.
This is easy to check by looking at the array's flags:
Asub.flags
  C_CONTIGUOUS : False
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
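The gap in memory and the flag values can be reproduced with a short sketch. It assumes 8-byte integers (set explicitly through dtype) and uses np.shares_memory and np.ascontiguousarray; the variable names touched and Acopy are only illustrative.

```python
import numpy as np

A = np.arange(16, dtype=np.int64).reshape(4, 4)
Asub = A[:3, :3]  # a view: same buffer and strides as A, smaller shape

# Positions (counted in elements) that Asub touches in A's linear memory.
touched = sorted(
    (i * Asub.strides[0] + j * Asub.strides[1]) // Asub.itemsize
    for i in range(Asub.shape[0])
    for j in range(Asub.shape[1])
)
print(touched)  # [0, 1, 2, 4, 5, 6, 8, 9, 10]: positions 3, 7, 11 are skipped

print(Asub.flags["C_CONTIGUOUS"])  # False
print(np.shares_memory(A, Asub))   # True: no data was copied

# If a contiguous block is needed, request an explicit copy.
Acopy = np.ascontiguousarray(Asub)
print(Acopy.strides)                # (24, 8)
print(Acopy.flags["C_CONTIGUOUS"])  # True
print(np.shares_memory(A, Acopy))   # False
```

The copy has the compact strides of a standalone 3 x 3 array, at the cost of no longer being tied to the data in A.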