Matrix operations¶

Arithmetic, element-by-element operations¶

There is a number of operations with matrices, we start with arithmetic operations. Imagine, we have two matrices of the size $2\times3$:

In [1]:

a = [10 10 10; 20 20 20]
b = [1 2 3; 4 5 6]

a =

   10   10   10
   20   20   20

b =

   1   2   3
   4   5   6

We start with operations that work element-by-element, that is, each element of the first matrix is combined with the corresponding element of the second matrix

add + (the same as .+)
subtract (the same as .-)
multiply .*
divide ./
left divide .\
to the power.^ (the same as .**)

here is a short reminder about, for example, adding matrices https://www.mathsisfun.com/algebra/matrix-introduction.html

In [2]:

a_add_b = a + b  # symbol _ is just a part of a variable name
a_sub_b = a - b
a_mul_b = a .* b
a_div_b = a ./ b
b_div_a = a .\ b  # this is the same as b ./ a
a_pow_b = a .^ b

a_add_b =

   11   12   13
   24   25   26

a_sub_b =

    9    8    7
   16   15   14

a_mul_b =

    10    20    30
    80   100   120

a_div_b =

   10.0000    5.0000    3.3333
    5.0000    4.0000    3.3333

b_div_a =

   0.10000   0.20000   0.30000
   0.20000   0.25000   0.30000

a_pow_b =

         10        100       1000
     160000    3200000   64000000

In [3]:

eye(3) + ones(3)  # just another example, here we add an identity matrix to a matrix of ones

This operations, of cause, also work for numbers. That is because numbers are just matrices of size $1\times1$:

In [4]:

2 + 2

ans =  4

Matrix arithmetic operations¶

Matrix multiplication¶

The next operations treat their arguments as matrices. For example, matrix multiplication is very different from element-by-element multiplication, it is defined in the Linear Algebra. Look here to remember https://www.mathsisfun.com/algebra/matrix-multiplying.html

In [5]:

id = [1 0 0; 0 1 0; 0 0 1] # an identity matrix, the same as eye(3)
a = [1 2 3; 3 2 1; -1 -2 -3]
b = [1 -1; 2 -2; 0 1]
c = [1 1 2] # a row-matrix
d = [1; 1; 2] # a column-matrix
id_mul_a = id * a  # multiplication by id has no effect, the result is a
id_mul_b = id * b  # multiplication by id has no effect, the result is b
a_mul_b = a * b
b * a # you can not do this, because sizes of b and a are do not agree 
c_mul_d = c * d # when you multiply row by column you get only one number: 1*1 + 1*1 + 2 * 2 = 6
d_mul_c = d * c # when you multiply column by row, you get a big matrix with all possible multiplications of matrices' elements

id =

   1   0   0
   0   1   0
   0   0   1

a =

   1   2   3
   3   2   1
  -1  -2  -3

b =

   1  -1
   2  -2
   0   1

c =

   1   1   2

d =

   1
   1
   2

id_mul_a =

   1   2   3
   3   2   1
  -1  -2  -3

id_mul_b =

   1  -1
   2  -2
   0   1

a_mul_b =

   5  -2
   7  -6
  -5   2

error: operator *: nonconformant arguments (op1 is 3x2, op2 is 3x3)
c_mul_d =  6
d_mul_c =

   1   1   2
   1   1   2
   2   2   4

Note again that A * B and A .* B are very different operations:

In [6]:

[1 2 3] .* [1 2 3] # this works
[1 2 3] * [1 2 3] # this raises an error, beacuse the sizes of matrices do not agree

ans =

   1   4   9

error: operator *: nonconformant arguments (op1 is 1x3, op2 is 1x3)

Matrix division¶

You can safely skip everything about division, beacuse this is not used in our assignments

The division is the operation, that is inverse to the multiplication. If A = B * C, then B = A / C, and C = B \ A. This operations are defined in such a way that (X / Y) * Y = X and X * (X \ Y) = Y.

This can also be defined as X / Y = X * inv(Y) and X \ Y = inv(X) * Y, where inv(X) is the inversion of the matrix X. https://www.mathsisfun.com/algebra/matrix-inverse.html. But the last definition works only if there exists an inverse of X.

In [7]:

a = [1 2; 3 4]
b = [1 2; 1 3]
a_mul_b = a * b
should_be_a = a_mul_b / b
should_be_b = a \ a_mul_b

a =

   1   2
   3   4

b =

   1   2
   1   3

a_mul_b =

    3    8
    7   18

should_be_a =

   1   2
   3   4

should_be_b =

   1.00000   2.00000
   1.00000   3.00000

we can also multiply and divide non-square matrices, but division will most probably give a different result

In [8]:

a = [1 2; 3 4; 5 6]
b = [2; 3]
a_mul_b = a * b # let's multiply two matrices of different sizes
not_exactly_a = a_mul_b / b # the result is different from a
the_same_as_a_mul_b = not_exactly_a * b # but the result after multiplication is still as needed

a =

   1   2
   3   4
   5   6

b =

   2
   3

a_mul_b =

    8
   18
   28

not_exactly_a =

   1.2308   1.8462
   2.7692   4.1538
   4.3077   6.4615

the_same_as_a_mul_b =

    8.0000
   18.0000
   28.0000

Matrix power¶

The powering operation works only for numbers and square matrices:

In [9]:

2 ^ 3 # the same as 2 ** 3
pow3 = [1 1; 0 1] ^ 3
pow3 = [1 1; 0 1] * [1 1; 0 1] * [1 1; 0 1] # this is the same

2 ^ [1 2; 3 4] # raising to the power of a matrix is an advanced operation, we are not going to use it 
2 .^ [1 2; 3 4] # but raising to the power element-by-element still gives the simple and expected result

ans =  8
pow3 =

   1   3
   0   1

pow3 =

   1   3
   0   1

ans =

   10.483   14.152
   21.228   31.711

ans =

    2    4
    8   16

Element-by-element operations with operands of different size (broadcasting)¶

The feature that we are going to discuss is very usefull in practice, and we have some assignments that are easily solved with it.

Imaging that we sum up matrices of different sizes:

In [10]:

a = [1 2 3; 2 3 4; 3 4 5]
b = [100; 200; 300]
a_add_b = a + b

a =

   1   2   3
   2   3   4
   3   4   5

b =

   100
   200
   300

a_add_b =

   101   102   103
   202   203   204
   303   304   305

Here the matrix b has only one column instead of 3, as in a. In this situation, the column is repeated three times before addition:

In [11]:

b_expanded = [b b b] # this is done automatically before the addition
a_add_b = a + b_expanded # the same as a + b

b_expanded =

   100   100   100
   200   200   200
   300   300   300

a_add_b =

   101   102   103
   202   203   204
   303   304   305

In the next example, the first operand is expanded to the size of the second operand:

In [12]:

1 + [100 200 300; 400 500 600]
ones(2, 3) + [100 200 300; 400 500 600] # this is the same

ans =

   101   201   301
   401   501   601

ans =

   101   201   301
   401   501   601

So, if we add a number to a matrix, the number is added to each element of the matrix.

In the next example, both operands are expanded:

In [13]:

a = [1; 2; 3]
b = [10 20 30]
a_add_b = a + b
a_expanded = [a a a]
b_expanded = [b ; b ; b]
a_add_b = a_expanded + b_expanded  # this is the same as a + b

a =

   1
   2
   3

b =

   10   20   30

a_add_b =

   11   21   31
   12   22   32
   13   23   33

a_expanded =

   1   1   1
   2   2   2
   3   3   3

b_expanded =

   10   20   30
   10   20   30
   10   20   30

a_add_b =

   11   21   31
   12   22   32
   13   23   33

In other words, we can understand the sum of a column $a_i$ and a row $b_j$ as a matrix $c_{i,j} = a_i + b_j$, where $i$ is an index of a row, and $j$ is an index of a column.

Untill now all our examples were about the sum operation. Consider now examples for different operations:

In [14]:

[1 2 3] - 1
[1 2; 3 4] .* 2
[1 2 3] .* [1; 2; 3]
1 ./ (1:10)
eye(3) ./ [1 10 100]

ans =

   0   1   2

ans =

   2   4
   6   8

ans =

   1   2   3
   2   4   6
   3   6   9

ans =

 Columns 1 through 8:

   1.00000   0.50000   0.33333   0.25000   0.20000   0.16667   0.14286   0.12500

 Columns 9 and 10:

   0.11111   0.10000

ans =

   1.00000   0.00000   0.00000
   0.00000   0.10000   0.00000
   0.00000   0.00000   0.01000

The matrix multiplication * and division /, \ operations don't expand their operands, but there is a very usefull special case: they may be used with numbers:

In [15]:

2 * ones(5) # multiply all elements by 2 (the same as .*)
[1 2 3] / 3 # divide each element by 3 (the same as ./)

ans =

   2   2   2   2   2
   2   2   2   2   2
   2   2   2   2   2
   2   2   2   2   2
   2   2   2   2   2

ans =

   0.33333   0.66667   1.00000

Vectorization¶

This is an important concept, because it is what differs octave programming from programming in another general purpose languages such as C/C++, Java, Python.

Consider, for example, the addition operation +. You just write a + b and all elements are added similtaneously. If you worked in C/C++ with two dimensionals arrays, you should have wrote a nested loop that iterated through rows and columns of the matrices to add elements. There are loops in Octave also (we will look at them later), but you can usually avoid them. This makes code easier to understand and even faster.

Let's look at the example, how one can avoid loops, using operations that we already know. Imagine, we have a row vector a and we want to find differences between each consequative elements:

In [16]:

a = [1 3 10 13 14] # this is what we have
b = [2 7 3 1] # this is what we want to get

a =

    1    3   10   13   14

b =

   2   7   3   1

$2 = 3 - 1$, $7 = 10 - 3$, $3 = 13 - 10$, $1 = 14 - 13$. Note that size of b is smaller than the size of a by 1. Look at the equalities in this paragraph. The first operands, from which we subtract are [3, 10, 13, 1], and the second operands, that we subtract are: [1, 3, 10, 13].

But that means, that we subtract from a without the last element. And what we subtract is a without the first element. That is, we can just write:

In [17]:

a(2:end) - a(1:end-1)

ans =

   2   7   3   1

In fact, there is a built-in function for the same:

In [18]:

diff(a)

ans =

   2   7   3   1

So, when you write a program and want to use a loop, that is to repeat some action several times, think at first, whether it is possible to avoid looping by means of broadcasting, indexing, etc. This is usually possible for loops with simple bodies.

Other operations¶

Transposition¶

If you want to recall a definition: https://www.mathsisfun.com/algebra/matrix-introduction.html.

In [19]:

a = [1 2; 3 4; 5 6]
a'
b = 1:5
b'
(1:5)' # the same as previuos, but without an extra variable

a =

   1   2
   3   4
   5   6

ans =

   1   3   5
   2   4   6

b =

   1   2   3   4   5

ans =

   1
   2
   3
   4
   5

ans =

   1
   2
   3
   4
   5

Inverse and determinant¶

We do not use this in assignments, but it is a kind of obligatory to introduce these functions when learning Octave:

In [20]:

a = [1 2; 1 3]
a_inv = inv(a)
a * a_inv # the identity matrix eye(2)
det(a) # the determinant

a =

   1   2
   1   3

a_inv =

   3  -2
  -1   1

ans =

   1   0
   0   1

ans =  1

There are many more basic matrix operations: https://octave.org/doc/v4.4.0/Basic-Matrix-Functions.html

Mathematical functions¶

There is a number of mathematical functions in octave, such as sin, log, exp (sine, logarithm, exponent), they work with each element separately:

In [21]:

x = [0 pi; 1 pi/2]
sin_x = sin(x)
exp(0:4)

x =

   0.00000   3.14159
   1.00000   1.57080

sin_x =

   0.00000   0.00000
   0.84147   1.00000

ans =

    1.0000    2.7183    7.3891   20.0855   54.5982

Sum, min, max of matrix elements¶

The next operations return

the sum of elements in each matrix's column
the minimal element in each matrix's column
the maximal element in each matrix's column

In [22]:

a = [1 6 8; 5 4 9; 2 7 3]
sum(a)
min(a)
max(a)

a =

   1   6   8
   5   4   9
   2   7   3

ans =

    8   17   20

ans =

   1   4   3

ans =

   5   7   9

Remember, that matrices are stored in memory by columns, that is why the default operation of these functions is by columns.

But if a matrix is one dimentional, either a row or a column, these operations work with all elements of a matrix:

In [23]:

sum([2 1 3]) # although, we have 3 columns, the function returns only one result
min([2 1 3])
max([2 1 3])

ans =  6
ans =  1
ans =  3

How to get sums by rows instead of columns? (use transposition), and how to find the sum of all elements of a two dimentional matrix? (call sum two times).

But you can also type help sum to find out, how to get sums by rows instead of columns, and how to find the sum of all elements of a two dimentional matrix by just one call to sum with special arguments.

Answer the same questions about min and max.

It is also possible to not only find minimums, but also find an exact position of minimal elements in each column. We will call a function in such a way, that it returns several results, we use square brackets for this before an assignment operator. Some functions may return several results.

In [24]:

a = [1 6 8; 5 4 9; 2 7 3]
[m, i] = min(a)

m are the minimal elements, and i gives indices of these elements. The minimal element of the first column is in the row 1, of the second column in the row 2, of the third column in the row 3.

Comparison and Logical operations¶

Consider the operation ==. It compares values:

In [25]:

10 == 10
20 == 30
[10 20 30] == [10 5 30]

ans = 1
ans = 0
ans =

  1  0  1

1 means true, that is, the statement about equality is true. And 0 means false. You see, that for matrices elements are compared element-by-element. Broadcasting also works here:

In [26]:

a = [1 2 3; 2 3 4; 3 4 5]
a_eq_3 = a == 3

There are other comparison operations: ~=, >, <, >=, <=, that mean correspondingly: not equal, greater, less, greater or equal, less or equal:

In [27]:

a_not_eq_3 = a ~= 3
a_gr_3 = a > 3
a_gr_eq_3 = a >= 3
a_less_3 = a < 3
a_less_eq_3 = a <= 3

a_not_eq_3 =

  1  1  0
  1  0  1
  0  1  1

a_gr_3 =

  0  0  0
  0  0  1
  0  1  1

a_gr_eq_3 =

  0  0  1
  0  1  1
  1  1  1

a_less_3 =

  1  1  0
  1  0  0
  0  0  0

a_less_eq_3 =

  1  1  1
  1  1  0
  1  0  0

There are logical operations and (&), or (|), not (~). If you write x & y, both operands must be true for the result to also be true:

In [28]:

& 0
& 0
& 1
& 1

ans = 0
ans = 0
ans = 0
ans = 1

If you write x | y, at least one operand must be true for the result to also be true:

In [29]:

| 0
| 0
| 1
| 1

ans = 0
ans = 1
ans = 1
ans = 1

Not operation inverts its operand:

In [30]:

~0
~1

ans = 1
ans = 0

It is usual to combine comparison and logical operations. Note that for matrices boolean operations work element-by-element:

In [31]:

a = [1 2 3; 2 3 4; 3 4 5]
a_from_2_to_4 = a >= 2 & a <= 4
a_not_in_2_3 = a < 2 | a > 3

a =

   1   2   3
   2   3   4
   3   4   5

a_from_2_to_4 =

  0  1  1
  1  1  1
  1  1  0

a_not_in_2_3 =

  1  0  0
  0  0  1
  0  1  1

More on indexing (logical indexing)¶

Imagine, we want to get all elements of some vector, that are greater than 10. Let's start with finding which elements are greater, and which are not:

In [32]:

x = [1 10 2 20 3 30 40]
x_gr_10 = x > 10

x =

    1   10    2   20    3   30   40

x_gr_10 =

  0  0  0  1  0  1  1

Now we can use a find function. This function takes a matrix and find indexes of all nonzero elements:

In [33]:

find([10 0 0 3 0 0 -4]) # just an arbitrary example of searching for non-zero elements
find(x_gr_10)

The find function combines well with logical matrices, because it allows finding indexes of all true values. More examples on this:

In [34]:

find([100 200 300] == 200) # the 2nd element is 200
find([100 200 300] >= 200) # the 2nd and the 3rd elements are greater or equal than 200

ans =  2
ans =

   2   3

So, now, we can finally get al elements of x that are greater than ten

In [35]:

x # to remind the value of x
x_gr_10 = x > 10
find(x_gr_10)
x(find(x_gr_10)) # we know indexes of elements, so we can just index by that indexes:

x =

    1   10    2   20    3   30   40

x_gr_10 =

  0  0  0  1  0  1  1

ans =

   4   6   7

ans =

   20   30   40

Let's now write it as one line and let's also modify elements:

In [36]:

x(find(x > 10)) # see all elements, that are greater than 10
x(find(x > 10)) = 10  # assign 10 to each element

ans =

   20   30   40

x =

    1   10    2   10    3   10   10

More examples of logical operations inside indexing:

In [37]:

x = 1:16
x_div_3 = x(find(mod(x, 3) == 0)) # filter elements that are divisible by 3
x_from_5_to_10 = x(find(5 <= x & x <= 10)) # all elements from 5 to 10

# cross matrix indexing
a = [10 20 30]
b = [100 200 300]
# we find elements from a, that are greater than 20, and take corresponding elements from b:
b(find(a >= 20))

# remember, that you may assign to an indexed matrix
# in this case you assign only to indexed elements
x(find(mod(x, 3) == 0)) = x_div_3 + 1
# so, all elements, that are divisible by 3 increased by 1

x =

    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16

x_div_3 =

    3    6    9   12   15

x_from_5_to_10 =

    5    6    7    8    9   10

a =

   10   20   30

b =

   100   200   300

ans =

   200   300

x =

    1    2    4    4    5    7    7    8   10   10   11   13   13   14   16   16

Let's now repeat the same examples without find. The idea is, that when you index by a logical matrix, the find is made automatically (!)

In [38]:

# x = 1:16
x_div_3 = x(mod(x, 3) == 0) # the same as x(find(mod(x, 3) == 0))
x_from_5_to_10 = x(5 <= x & x <= 10)

# cross matrix indexing
a = [10 20 30]
b = [100 200 300]
b(a >= 20)

x(mod(x, 3) == 0) = x_div_3 + 1

x_div_3 = [](1x0)
x_from_5_to_10 =

    5    7    7    8   10   10

a =

   10   20   30

b =

   100   200   300

ans =

   200   300

x =

    1    2    4    4    5    7    7    8   10   10   11   13   13   14   16   16