Numpy is probably the most fundamental package for scientific computing in Python.
It provides a highly efficient interface to create and interact with multidimensional arrays.
Nearly every other scientific package uses or integrates with NumPy in some way.
The performance-sensitive parts of NumPy are all written in the C language, so they are very fast.
The fundamental object of NumPy is its numpy.array
, an $n$-dimensional array that is also present in some form in array-oriented languages such as MATLAB.
import numpy as np # we import numpy as "np" to avoid typing "numpy" each time
%load_ext lab_black
vector = np.array([10, 11, 12])
print(type(vector))
<class 'numpy.ndarray'>
vector.shape
(3,)
Let’s start things off by forming a 3-dimensional array with 36 consecutive numbers
matrix = np.arange(36).reshape(3, 4, 3)
matrix
array([[[ 0, 1, 2], [ 3, 4, 5], [ 6, 7, 8], [ 9, 10, 11]], [[12, 13, 14], [15, 16, 17], [18, 19, 20], [21, 22, 23]], [[24, 25, 26], [27, 28, 29], [30, 31, 32], [33, 34, 35]]])
Picturing high-dimensional arrays in two dimensions can be difficult.
One intuitive way to think about an array’s shape is to simply say:
matrix is a 3 by 4 by 3 array
matrix.shape
(3, 4, 3)
print(f"{matrix.ndim=}")
print(f"{matrix.shape=}")
print(f"{matrix.size=}")
print(f"{matrix.dtype=}")
matrix.ndim=3 matrix.shape=(3, 4, 3) matrix.size=36 matrix.dtype=dtype('int64')
Visually, matrix
could be thought of as a container of three 4x3 grids (or a rectangular prism) and would look like this:
Data with greater than two dimensions are tough to picture, but they are everywhere in biomechanics, so better get used to it!
np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]]) # from a list
array([[ 0, 1, 2], [ 3, 4, 5], [ 6, 7, 8], [ 9, 10, 11]])
np.arange(12).reshape(4, 3) # with consecutive numbers
array([[ 0, 1, 2], [ 3, 4, 5], [ 6, 7, 8], [ 9, 10, 11]])
np.zeros((4, 3)) # with zeros
array([[0., 0., 0.], [0., 0., 0.], [0., 0., 0.], [0., 0., 0.]])
np.ones((4, 3)) # with ones
array([[1., 1., 1.], [1., 1., 1.], [1., 1., 1.], [1., 1., 1.]])
np.empty((4, 3)) # with last value used
array([[1., 1., 1.], [1., 1., 1.], [1., 1., 1.], [1., 1., 1.]])
np.random.rand(4, 3) # with random numbers
array([[0.55456578, 0.29574095, 0.00245861], [0.40599595, 0.56963864, 0.10437213], [0.83959092, 0.32007413, 0.0839273 ], [0.46797033, 0.07439805, 0.56752482]])
array = np.arange(12).reshape(4, 3)
array
array([[ 0, 1, 2], [ 3, 4, 5], [ 6, 7, 8], [ 9, 10, 11]])
# first row second column & second row third column
array[[0, 1], [1, 2]]
array([1, 5])
# first two rows
array[:2, :]
array([[0, 1, 2], [3, 4, 5]])
# last row and columns 1 & 2
array[-1, 1:3]
array([10, 11])
Warning: a slice of an array is a view into the same data, so modifying it will modify the original array.
array > 5
array([[False, False, False], [False, False, False], [ True, True, True], [ True, True, True]])
array[array > 5]
array([ 6, 7, 8, 9, 10, 11])
More details in the mathematical functions section of the official documentation.
x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])
x + y
# or np.add(x, y)
array([[ 6, 8], [10, 12]])
x - y
# or np.substract(x, y)
array([[-4, -4], [-4, -4]])
x * y
# or np.multiply(x, y)
array([[ 5, 12], [21, 32]])
x / y
# or np.divide(x, y)
array([[0.2 , 0.33333333], [0.42857143, 0.5 ]])
np.sqrt(x)
array([[1. , 1.41421356], [1.73205081, 2. ]])
Warning: unlike MATLAB, *
is elementwise multiplication, not matrix multiplication.
We instead use the dot function to compute inner products of vectors, to multiply a vector by a matrix, and to multiply matrices.
x @ y
# or np.dot(x, y)
array([[19, 22], [43, 50]])
np.sum(x, axis=0)
# or x.sum(axis=0)
array([4, 6])
np.sum(x, axis=1)
# or x.sum(axis=1)
array([3, 7])
x
array([[1, 2], [3, 4]])
x.T
array([[1, 3], [2, 4]])
x.reshape(-1)
array([1, 2, 3, 4])
Here are some EMG data from the anterior deltoid during a box lifting task:
import pandas as pd
import altair as alt
emg_dataframe = pd.read_csv("../data/emgs.csv").query('filename == "12kg_H2_3"')
base = alt.Chart(emg_dataframe)
line = base.mark_line().encode(x="index", y="delt_ant")
rule = base.mark_rule(color="firebrick", size=2).encode(y="mean(delt_ant)")
above_base = alt.Chart(emg_dataframe.query("delt_ant > delt_ant.mean()"))
points = above_base.mark_point(color="red").encode(x="index", y="delt_ant")
rect = above_base.mark_errorband(extent="stdev").encode(y="delt_ant")
line + points + rule + rect
emg = emg_dataframe["delt_ant"].values
From the emg
Numpy array: