There is a number of operations with matrices, we start with arithmetic operations. Imagine, we have two matrices of the size $2\times3$:
a = [10 10 10; 20 20 20]
b = [1 2 3; 4 5 6]
a = 10 10 10 20 20 20 b = 1 2 3 4 5 6
We start with operations that work element-by-element, that is, each element of the first matrix is combined with the corresponding element of the second matrix
+
(the same as .+
).-
).*
./
.\
.^
(the same as .**
)here is a short reminder about, for example, adding matrices https://www.mathsisfun.com/algebra/matrix-introduction.html
a_add_b = a + b # symbol _ is just a part of a variable name
a_sub_b = a - b
a_mul_b = a .* b
a_div_b = a ./ b
b_div_a = a .\ b # this is the same as b ./ a
a_pow_b = a .^ b
a_add_b = 11 12 13 24 25 26 a_sub_b = 9 8 7 16 15 14 a_mul_b = 10 20 30 80 100 120 a_div_b = 10.0000 5.0000 3.3333 5.0000 4.0000 3.3333 b_div_a = 0.10000 0.20000 0.30000 0.20000 0.25000 0.30000 a_pow_b = 10 100 1000 160000 3200000 64000000
eye(3) + ones(3) # just another example, here we add an identity matrix to a matrix of ones
ans = 2 1 1 1 2 1 1 1 2
This operations, of cause, also work for numbers. That is because numbers are just matrices of size $1\times1$:
2 + 2
ans = 4
The next operations treat their arguments as matrices. For example, matrix multiplication is very different from element-by-element multiplication, it is defined in the Linear Algebra. Look here to remember https://www.mathsisfun.com/algebra/matrix-multiplying.html
id = [1 0 0; 0 1 0; 0 0 1] # an identity matrix, the same as eye(3)
a = [1 2 3; 3 2 1; -1 -2 -3]
b = [1 -1; 2 -2; 0 1]
c = [1 1 2] # a row-matrix
d = [1; 1; 2] # a column-matrix
id_mul_a = id * a # multiplication by id has no effect, the result is a
id_mul_b = id * b # multiplication by id has no effect, the result is b
a_mul_b = a * b
b * a # you can not do this, because sizes of b and a are do not agree
c_mul_d = c * d # when you multiply row by column you get only one number: 1*1 + 1*1 + 2 * 2 = 6
d_mul_c = d * c # when you multiply column by row, you get a big matrix with all possible multiplications of matrices' elements
id = 1 0 0 0 1 0 0 0 1 a = 1 2 3 3 2 1 -1 -2 -3 b = 1 -1 2 -2 0 1 c = 1 1 2 d = 1 1 2 id_mul_a = 1 2 3 3 2 1 -1 -2 -3 id_mul_b = 1 -1 2 -2 0 1 a_mul_b = 5 -2 7 -6 -5 2 error: operator *: nonconformant arguments (op1 is 3x2, op2 is 3x3) c_mul_d = 6 d_mul_c = 1 1 2 1 1 2 2 2 4
Note again that A * B
and A .* B
are very different operations:
[1 2 3] .* [1 2 3] # this works
[1 2 3] * [1 2 3] # this raises an error, beacuse the sizes of matrices do not agree
ans = 1 4 9 error: operator *: nonconformant arguments (op1 is 1x3, op2 is 1x3)
You can safely skip everything about division, beacuse this is not used in our assignments
The division is the operation, that is inverse to the multiplication. If A = B * C
, then B = A / C
, and C = B \ A
.
This operations are defined in such a way that (X / Y) * Y = X
and X * (X \ Y) = Y
.
This can also be defined as X / Y = X * inv(Y)
and X \ Y = inv(X) * Y
, where inv(X)
is the inversion of the matrix X
. https://www.mathsisfun.com/algebra/matrix-inverse.html. But the last definition works only if there exists an inverse of X
.
a = [1 2; 3 4]
b = [1 2; 1 3]
a_mul_b = a * b
should_be_a = a_mul_b / b
should_be_b = a \ a_mul_b
a = 1 2 3 4 b = 1 2 1 3 a_mul_b = 3 8 7 18 should_be_a = 1 2 3 4 should_be_b = 1.00000 2.00000 1.00000 3.00000
we can also multiply and divide non-square matrices, but division will most probably give a different result
a = [1 2; 3 4; 5 6]
b = [2; 3]
a_mul_b = a * b # let's multiply two matrices of different sizes
not_exactly_a = a_mul_b / b # the result is different from a
the_same_as_a_mul_b = not_exactly_a * b # but the result after multiplication is still as needed
a = 1 2 3 4 5 6 b = 2 3 a_mul_b = 8 18 28 not_exactly_a = 1.2308 1.8462 2.7692 4.1538 4.3077 6.4615 the_same_as_a_mul_b = 8.0000 18.0000 28.0000
The powering operation works only for numbers and square matrices:
2 ^ 3 # the same as 2 ** 3
pow3 = [1 1; 0 1] ^ 3
pow3 = [1 1; 0 1] * [1 1; 0 1] * [1 1; 0 1] # this is the same
2 ^ [1 2; 3 4] # raising to the power of a matrix is an advanced operation, we are not going to use it
2 .^ [1 2; 3 4] # but raising to the power element-by-element still gives the simple and expected result
ans = 8 pow3 = 1 3 0 1 pow3 = 1 3 0 1 ans = 10.483 14.152 21.228 31.711 ans = 2 4 8 16
The feature that we are going to discuss is very usefull in practice, and we have some assignments that are easily solved with it.
Imaging that we sum up matrices of different sizes:
a = [1 2 3; 2 3 4; 3 4 5]
b = [100; 200; 300]
a_add_b = a + b
a = 1 2 3 2 3 4 3 4 5 b = 100 200 300 a_add_b = 101 102 103 202 203 204 303 304 305
Here the matrix b
has only one column instead of 3, as in a
. In this situation, the column is repeated three times before addition:
b_expanded = [b b b] # this is done automatically before the addition
a_add_b = a + b_expanded # the same as a + b
b_expanded = 100 100 100 200 200 200 300 300 300 a_add_b = 101 102 103 202 203 204 303 304 305
In the next example, the first operand is expanded to the size of the second operand:
1 + [100 200 300; 400 500 600]
ones(2, 3) + [100 200 300; 400 500 600] # this is the same
ans = 101 201 301 401 501 601 ans = 101 201 301 401 501 601
So, if we add a number to a matrix, the number is added to each element of the matrix.
In the next example, both operands are expanded:
a = [1; 2; 3]
b = [10 20 30]
a_add_b = a + b
a_expanded = [a a a]
b_expanded = [b ; b ; b]
a_add_b = a_expanded + b_expanded # this is the same as a + b
a = 1 2 3 b = 10 20 30 a_add_b = 11 21 31 12 22 32 13 23 33 a_expanded = 1 1 1 2 2 2 3 3 3 b_expanded = 10 20 30 10 20 30 10 20 30 a_add_b = 11 21 31 12 22 32 13 23 33
In other words, we can understand the sum of a column $a_i$ and a row $b_j$ as a matrix $c_{i,j} = a_i + b_j$, where $i$ is an index of a row, and $j$ is an index of a column.
Untill now all our examples were about the sum operation. Consider now examples for different operations:
[1 2 3] - 1
[1 2; 3 4] .* 2
[1 2 3] .* [1; 2; 3]
1 ./ (1:10)
eye(3) ./ [1 10 100]
ans = 0 1 2 ans = 2 4 6 8 ans = 1 2 3 2 4 6 3 6 9 ans = Columns 1 through 8: 1.00000 0.50000 0.33333 0.25000 0.20000 0.16667 0.14286 0.12500 Columns 9 and 10: 0.11111 0.10000 ans = 1.00000 0.00000 0.00000 0.00000 0.10000 0.00000 0.00000 0.00000 0.01000
The matrix multiplication *
and division /
, \
operations don't expand their operands, but there is a very usefull special case: they may be used with numbers:
2 * ones(5) # multiply all elements by 2 (the same as .*)
[1 2 3] / 3 # divide each element by 3 (the same as ./)
ans = 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 ans = 0.33333 0.66667 1.00000
This is an important concept, because it is what differs octave programming from programming in another general purpose languages such as C/C++, Java, Python.
Consider, for example, the addition operation +
. You just write a + b
and all elements are added similtaneously. If you worked in C/C++ with two dimensionals arrays, you should have wrote a nested loop that iterated through rows and columns of the matrices to add elements. There are loops in Octave also (we will look at them later), but you can usually avoid them. This makes code easier to understand and even faster.
Let's look at the example, how one can avoid loops, using operations that we already know. Imagine, we have a row vector a
and we want to find differences between each consequative elements:
a = [1 3 10 13 14] # this is what we have
b = [2 7 3 1] # this is what we want to get
a = 1 3 10 13 14 b = 2 7 3 1
$2 = 3 - 1$, $7 = 10 - 3$, $3 = 13 - 10$, $1 = 14 - 13$. Note that size of b
is smaller than the size of a
by 1.
Look at the equalities in this paragraph. The first operands, from which we subtract are [3, 10, 13, 1]
, and the second operands, that we subtract are: [1, 3, 10, 13]
.
But that means, that we subtract from a
without the last element. And what we subtract is a
without the first element. That is, we can just write:
a(2:end) - a(1:end-1)
ans = 2 7 3 1
In fact, there is a built-in function for the same:
diff(a)
ans = 2 7 3 1
So, when you write a program and want to use a loop, that is to repeat some action several times, think at first, whether it is possible to avoid looping by means of broadcasting, indexing, etc. This is usually possible for loops with simple bodies.
If you want to recall a definition: https://www.mathsisfun.com/algebra/matrix-introduction.html.
a = [1 2; 3 4; 5 6]
a'
b = 1:5
b'
(1:5)' # the same as previuos, but without an extra variable
a = 1 2 3 4 5 6 ans = 1 3 5 2 4 6 b = 1 2 3 4 5 ans = 1 2 3 4 5 ans = 1 2 3 4 5
We do not use this in assignments, but it is a kind of obligatory to introduce these functions when learning Octave:
a = [1 2; 1 3]
a_inv = inv(a)
a * a_inv # the identity matrix eye(2)
det(a) # the determinant
a = 1 2 1 3 a_inv = 3 -2 -1 1 ans = 1 0 0 1 ans = 1
There are many more basic matrix operations: https://octave.org/doc/v4.4.0/Basic-Matrix-Functions.html
There is a number of mathematical functions in octave, such as sin
, log
, exp
(sine, logarithm, exponent), they work with each element separately:
x = [0 pi; 1 pi/2]
sin_x = sin(x)
exp(0:4)
x = 0.00000 3.14159 1.00000 1.57080 sin_x = 0.00000 0.00000 0.84147 1.00000 ans = 1.0000 2.7183 7.3891 20.0855 54.5982
The next operations return
a = [1 6 8; 5 4 9; 2 7 3]
sum(a)
min(a)
max(a)
a = 1 6 8 5 4 9 2 7 3 ans = 8 17 20 ans = 1 4 3 ans = 5 7 9
Remember, that matrices are stored in memory by columns, that is why the default operation of these functions is by columns.
But if a matrix is one dimentional, either a row or a column, these operations work with all elements of a matrix:
sum([2 1 3]) # although, we have 3 columns, the function returns only one result
min([2 1 3])
max([2 1 3])
ans = 6 ans = 1 ans = 3
How to get sums by rows instead of columns? (use transposition), and how to find the sum of all elements of a two dimentional matrix? (call sum
two times).
But you can also type help sum
to find out, how to get sums by rows instead of columns, and how to find the sum of all elements of a two dimentional matrix by just one call to sum
with special arguments.
Answer the same questions about min
and max
.
It is also possible to not only find minimums, but also find an exact position of minimal elements in each column. We will call a function in such a way, that it returns several results, we use square brackets for this before an assignment operator. Some functions may return several results.
a = [1 6 8; 5 4 9; 2 7 3]
[m, i] = min(a)
a = 1 6 8 5 4 9 2 7 3 m = 1 4 3 i = 1 2 3
m
are the minimal elements, and i
gives indices of these elements. The minimal element of the first column is in the row 1, of the second column in the row 2, of the third column in the row 3.
Consider the operation ==
. It compares values:
10 == 10
20 == 30
[10 20 30] == [10 5 30]
ans = 1 ans = 0 ans = 1 0 1
1 means true
, that is, the statement about equality is true. And 0 means false
. You see, that for matrices elements are compared element-by-element. Broadcasting also works here:
a = [1 2 3; 2 3 4; 3 4 5]
a_eq_3 = a == 3
a = 1 2 3 2 3 4 3 4 5 a_eq_3 = 0 0 1 0 1 0 1 0 0
There are other comparison operations: ~=
, >
, <
, >=
, <=
, that mean correspondingly: not equal, greater, less, greater or equal, less or equal:
a_not_eq_3 = a ~= 3
a_gr_3 = a > 3
a_gr_eq_3 = a >= 3
a_less_3 = a < 3
a_less_eq_3 = a <= 3
a_not_eq_3 = 1 1 0 1 0 1 0 1 1 a_gr_3 = 0 0 0 0 0 1 0 1 1 a_gr_eq_3 = 0 0 1 0 1 1 1 1 1 a_less_3 = 1 1 0 1 0 0 0 0 0 a_less_eq_3 = 1 1 1 1 1 0 1 0 0
There are logical operations and (&
), or (|
), not (~
). If you write x & y
, both operands must be true for the result to also be true:
0 & 0
1 & 0
0 & 1
1 & 1
ans = 0 ans = 0 ans = 0 ans = 1
If you write x | y
, at least one operand must be true for the result to also be true:
0 | 0
1 | 0
0 | 1
1 | 1
ans = 0 ans = 1 ans = 1 ans = 1
Not operation inverts its operand:
~0
~1
ans = 1 ans = 0
It is usual to combine comparison and logical operations. Note that for matrices boolean operations work element-by-element:
a = [1 2 3; 2 3 4; 3 4 5]
a_from_2_to_4 = a >= 2 & a <= 4
a_not_in_2_3 = a < 2 | a > 3
a = 1 2 3 2 3 4 3 4 5 a_from_2_to_4 = 0 1 1 1 1 1 1 1 0 a_not_in_2_3 = 1 0 0 0 0 1 0 1 1
Imagine, we want to get all elements of some vector, that are greater than 10. Let's start with finding which elements are greater, and which are not:
x = [1 10 2 20 3 30 40]
x_gr_10 = x > 10
x = 1 10 2 20 3 30 40 x_gr_10 = 0 0 0 1 0 1 1
Now we can use a find
function. This function takes a matrix and find indexes of all nonzero elements:
find([10 0 0 3 0 0 -4]) # just an arbitrary example of searching for non-zero elements
find(x_gr_10)
ans = 1 4 7 ans = 4 6 7
The find
function combines well with logical matrices, because it allows finding indexes of all true values. More examples on this:
find([100 200 300] == 200) # the 2nd element is 200
find([100 200 300] >= 200) # the 2nd and the 3rd elements are greater or equal than 200
ans = 2 ans = 2 3
So, now, we can finally get al elements of x
that are greater than ten
x # to remind the value of x
x_gr_10 = x > 10
find(x_gr_10)
x(find(x_gr_10)) # we know indexes of elements, so we can just index by that indexes:
x = 1 10 2 20 3 30 40 x_gr_10 = 0 0 0 1 0 1 1 ans = 4 6 7 ans = 20 30 40
Let's now write it as one line and let's also modify elements:
x(find(x > 10)) # see all elements, that are greater than 10
x(find(x > 10)) = 10 # assign 10 to each element
ans = 20 30 40 x = 1 10 2 10 3 10 10
More examples of logical operations inside indexing:
x = 1:16
x_div_3 = x(find(mod(x, 3) == 0)) # filter elements that are divisible by 3
x_from_5_to_10 = x(find(5 <= x & x <= 10)) # all elements from 5 to 10
# cross matrix indexing
a = [10 20 30]
b = [100 200 300]
# we find elements from a, that are greater than 20, and take corresponding elements from b:
b(find(a >= 20))
# remember, that you may assign to an indexed matrix
# in this case you assign only to indexed elements
x(find(mod(x, 3) == 0)) = x_div_3 + 1
# so, all elements, that are divisible by 3 increased by 1
x = 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 x_div_3 = 3 6 9 12 15 x_from_5_to_10 = 5 6 7 8 9 10 a = 10 20 30 b = 100 200 300 ans = 200 300 x = 1 2 4 4 5 7 7 8 10 10 11 13 13 14 16 16
Let's now repeat the same examples without find
. The idea is, that when you index by a logical matrix, the find is made automatically (!)
# x = 1:16
x_div_3 = x(mod(x, 3) == 0) # the same as x(find(mod(x, 3) == 0))
x_from_5_to_10 = x(5 <= x & x <= 10)
# cross matrix indexing
a = [10 20 30]
b = [100 200 300]
b(a >= 20)
x(mod(x, 3) == 0) = x_div_3 + 1
x_div_3 = [](1x0) x_from_5_to_10 = 5 7 7 8 10 10 a = 10 20 30 b = 100 200 300 ans = 200 300 x = 1 2 4 4 5 7 7 8 10 10 11 13 13 14 16 16