import torch
Scalars are implemented as tensors that contain only one element
x = torch.tensor(3.0)
y = torch.tensor(2.0)
x + y, x * y, x / y, x**y
(tensor(5.), tensor(6.), tensor(1.5000), tensor(9.))
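Should you need an ordinary Python scalar, a one-element tensor converts via item() or the built-in float and int
x.item(), float(x), int(x)
(3.0, 3.0, 3)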
You can think of a vector as a fixed-length array of scalars
x = torch.arange(3)
x
tensor([0, 1, 2])
We access a tensor's elements via indexing
x[2]
tensor(2)
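Negative indices count from the end, just as with Python lists
x[-1]
tensor(2)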
In code, a vector's dimensionality, i.e., its number of elements, corresponds to the tensor's length
len(x)
3
Tensors with just one axis have shapes with just one element
x.shape
torch.Size([3])
We can convert any tensor with $mn$ elements into an $m \times n$ matrix
A = torch.arange(6).reshape(3, 2)
A
tensor([[0, 1],
        [2, 3],
        [4, 5]])
A matrix's transpose swaps its rows and columns
A.T
tensor([[0, 2, 4],
        [1, 3, 5]])
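As a quick check, transposing twice recovers the original matrix
(A.T).T == A
tensor([[True, True],
        [True, True],
        [True, True]])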
Symmetric matrices are the subset of square matrices that are equal to their own transposes: $\mathbf{A} = \mathbf{A}^\top$
A = torch.tensor([[1, 2, 3], [2, 0, 4], [3, 4, 5]])
A == A.T
tensor([[True, True, True],
        [True, True, True],
        [True, True, True]])
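A small sketch: averaging an arbitrary square matrix (here called M) with its own transpose always yields a symmetric matrix
M = torch.arange(4.0).reshape(2, 2)
S = (M + M.T) / 2
S == S.T
tensor([[True, True],
        [True, True]])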
Tensors give us a generic way of describing $n^{\textrm{th}}$-order arrays with an arbitrary number of axes
torch.arange(24).reshape(2, 3, 4)
tensor([[[ 0,  1,  2,  3],
         [ 4,  5,  6,  7],
         [ 8,  9, 10, 11]],

        [[12, 13, 14, 15],
         [16, 17, 18, 19],
         [20, 21, 22, 23]]])
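Indexing along the first axis of a 3rd-order tensor yields a matrix, for instance
X = torch.arange(24).reshape(2, 3, 4)
X[0].shape, X[0, 1, 2]
(torch.Size([3, 4]), tensor(6))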
A = torch.arange(6, dtype=torch.float32).reshape(2, 3)
B = A.clone()
A, A + B
(tensor([[0., 1., 2.],
         [3., 4., 5.]]),
 tensor([[ 0.,  2.,  4.],
         [ 6.,  8., 10.]]))
The elementwise product of two matrices is called their Hadamard product
A * B
tensor([[ 0.,  1.,  4.],
        [ 9., 16., 25.]])
Adding or multiplying a scalar and a tensor applies the operation elementwise, leaving the tensor's shape unchanged
a = 2
X = torch.arange(24).reshape(2, 3, 4)
a + X, (a * X).shape
(tensor([[[ 2,  3,  4,  5],
          [ 6,  7,  8,  9],
          [10, 11, 12, 13]],

         [[14, 15, 16, 17],
          [18, 19, 20, 21],
          [22, 23, 24, 25]]]),
 torch.Size([2, 3, 4]))
The sum of a tensor's elements
x = torch.arange(3, dtype=torch.float32)
x, x.sum()
(tensor([0., 1., 2.]), tensor(3.))
Sums over the elements of tensors of arbitrary shape
A.shape, A.sum()
(torch.Size([2, 3]), tensor(15.))
Specify the axes along which the tensor should be reduced
A.shape, A.sum(axis=0).shape
(torch.Size([2, 3]), torch.Size([3]))
A.shape, A.sum(axis=1).shape
(torch.Size([2, 3]), torch.Size([2]))
A.sum(axis=[0, 1]) == A.sum()
tensor(True)
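The values of the axis-wise sums make the reductions concrete
A.sum(axis=0), A.sum(axis=1)
(tensor([3., 5., 7.]), tensor([ 3., 12.]))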
A related quantity is the mean, also called the average
A.mean(), A.sum() / A.numel()
(tensor(2.5000), tensor(2.5000))
A.mean(axis=0), A.sum(axis=0) / A.shape[0]
(tensor([1.5000, 2.5000, 3.5000]), tensor([1.5000, 2.5000, 3.5000]))
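The same pattern works along axis 1, dividing by the number of columns instead
A.mean(axis=1), A.sum(axis=1) / A.shape[1]
(tensor([1., 4.]), tensor([1., 4.]))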
Keep the number of axes unchanged
sum_A = A.sum(axis=1, keepdims=True)
sum_A, sum_A.shape
(tensor([[ 3.],
         [12.]]),
 torch.Size([2, 1]))
Divide A by sum_A with broadcasting
A / sum_A
tensor([[0.0000, 0.3333, 0.6667],
        [0.2500, 0.3333, 0.4167]])
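As a sanity check, each row of the normalized matrix should sum to 1
torch.allclose((A / sum_A).sum(axis=1), torch.ones(2))
True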
The cumulative sum of elements of A along some axis
A.cumsum(axis=0)
tensor([[0., 1., 2.],
        [3., 5., 7.]])
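Along axis 1, each entry instead accumulates the row elements to its left
A.cumsum(axis=1)
tensor([[ 0.,  1.,  3.],
        [ 3.,  7., 12.]])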
The dot product of two vectors is a sum over the products of the elements at the same position
y = torch.ones(3, dtype=torch.float32)
x, y, torch.dot(x, y)
(tensor([0., 1., 2.]), tensor([1., 1., 1.]), tensor(3.))
We can calculate the dot product of two vectors by performing an elementwise multiplication followed by a sum
torch.sum(x * y)
tensor(3.)
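Dot products express weighted sums: with illustrative weights w that sum to 1, the dot product gives a weighted average of x
w = torch.tensor([0.2, 0.3, 0.5])
torch.dot(x, w)
tensor(1.3000)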
For an $m \times n$ matrix $\mathbf{A}$, the matrix--vector product $\mathbf{A}\mathbf{x}$ is simply a column vector of length $m$, whose $i^\textrm{th}$ element is the dot product $\mathbf{a}^\top_i \mathbf{x}$ of the $i^\textrm{th}$ row of $\mathbf{A}$ with $\mathbf{x}$
A.shape, x.shape, torch.mv(A, x), A @ x
(torch.Size([2, 3]), torch.Size([3]), tensor([ 5., 14.]), tensor([ 5., 14.]))
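Each output element is indeed the dot product of a row of A with x
torch.dot(A[0], x), torch.dot(A[1], x)
(tensor(5.), tensor(14.))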
We can think of the matrix--matrix multiplication $\mathbf{AB}$, with $\mathbf{A}$ of shape $n \times k$ and $\mathbf{B}$ of shape $k \times m$, as performing $m$ matrix--vector products, or equivalently $n \times m$ dot products, and stitching the results together to form an $n \times m$ matrix
B = torch.ones(3, 4)
torch.mm(A, B), A @ B
(tensor([[ 3.,  3.,  3.,  3.],
         [12., 12., 12., 12.]]),
 tensor([[ 3.,  3.,  3.,  3.],
         [12., 12., 12., 12.]]))
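Note the shapes: a $2 \times 3$ matrix times a $3 \times 4$ matrix yields a $2 \times 4$ matrix
(A @ B).shape
torch.Size([2, 4])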
The $\ell_2$ norm $$\|\mathbf{x}\|_2 = \sqrt{\sum_{i=1}^n x_i^2}$$
u = torch.tensor([3.0, -4.0])
torch.norm(u)
tensor(5.)
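Equivalently, we can compute the $\ell_2$ norm straight from its definition
torch.sqrt(torch.sum(u**2))
tensor(5.)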
The $\ell_1$ norm $$\|\mathbf{x}\|_1 = \sum_{i=1}^n \left|x_i \right|$$
torch.abs(u).sum()
tensor(7.)
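In recent PyTorch versions, torch.linalg.norm computes it directly via the ord argument
torch.linalg.norm(u, ord=1)
tensor(7.)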
The Frobenius norm of a matrix, which is much easier to compute than many other matrix norms, behaves like the $\ell_2$ norm of a matrix-shaped vector $$\|\mathbf{X}\|_\textrm{F} = \sqrt{\sum_{i=1}^m \sum_{j=1}^n x_{ij}^2}$$
torch.norm(torch.ones((4, 9)))
tensor(6.)
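Again this matches the definition: a $4 \times 9$ matrix of ones has squared entries summing to 36, so the norm is 6
X = torch.ones((4, 9))
torch.sqrt((X**2).sum())
tensor(6.)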