Arrays are the typical data structure used in data science and statistical analysis. Arrays in NumPy are represented by objects called ndarrays, which allow data storage, manipulation, processing and modeling.
This first tutorial is intended to show you the basic operations and uses of ndarrays in NumPy.
import numpy as np
An ndarray provides information about its structure, shape and data content through its attributes.
a = np.array([[1, 2, 3], [4, 5, 6]])
print(f"a=\n{a}\n")
print(f"a has shape: {a.shape}")
print(f"a has {a.ndim} dimensions")
print(f"a has {a.size} elements")
print(f"a has type: {a.dtype}")
print(f"each element of a occupies {a.itemsize} bytes")  # itemsize is per element

a=
[[1 2 3]
 [4 5 6]]

a has shape: (2, 3)
a has 2 dimensions
a has 6 elements
a has type: int64
each element of a occupies 8 bytes
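The `itemsize` attribute reports the size of a single element, not of the whole array. As a small complement, the following sketch relates it to `nbytes`, the total size of the array data. The array name `c` and the explicit `dtype` are choices made for this illustration only; fixing the `dtype` avoids depending on the platform's default integer width.

```python
import numpy as np

# Hypothetical example array; dtype is fixed explicitly so the sizes
# below do not depend on the platform's default integer type.
c = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)

bytes_per_element = c.itemsize   # size of one element in bytes
total_bytes = c.nbytes           # size of all the array data in bytes

print(f"bytes per element: {bytes_per_element}")
print(f"total bytes: {total_bytes}")
print(f"size * itemsize: {c.size * c.itemsize}")
```

For this array, the total is the number of elements times the size of one element.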
a = np.array([1, 2, 3])  # a row vector (1-D array)
print(f"\nRow vector:\n{a}")
a = np.array([[1], [2], [3]])  # a column vector (3x1 array)
print(f"\nColumn vector:\n{a}")
a = np.empty((4, 4))  # 4x4 without initialization (arbitrary contents)
print(f"\n4x4 uninitialized matrix:\n{a}")
a = np.zeros((4, 4))  # 4x4 filled with zeros
print(f"\n4x4 matrix filled with zeros:\n{a}")
a = np.ones((4, 4))  # 4x4 filled with ones
print(f"\n4x4 matrix filled with ones\n{a}")
a = np.eye(4)  # 4x4 identity matrix
print(f"\n4x4 identity matrix\n{a}")
a = np.random.rand(4, 4)  # 4x4 with pseudo-random values in [0, 1)
print(f"\n4x4 matrix filled with random values\n{a}")
a = np.arange(16)  # sequence vector 0..15
print(f"\na vector sequence\n{a}")
a = np.reshape(a, (4, 4))  # reshaped as a 4x4 matrix
print(f"\nnow as matrix\n{a}")
a = np.arange(16).reshape(4, 4)  # a shorter way
print(f"\nsame matrix:\n{a}")
a = np.linspace(1, 10, 19)  # 19 linearly spaced values from 1 to 10
print(f"\na linspace vector:\n{a}")
a = np.logspace(0, 10, 11, base=2.0)  # 11 values from 2^0 to 2^10, evenly spaced on a logarithmic scale
print(f"\na logspace vector:\n{a}")

Row vector:
[1 2 3]

Column vector:
[[1]
 [2]
 [3]]

4x4 uninitialized matrix:
[[4.9e-324 1.5e-323      nan 9.9e-324]
 [     nan 2.0e-323 2.5e-323      nan]
 [4.9e-324 4.9e-324 4.9e-324 4.9e-324]
 [9.9e-324 9.9e-324 9.9e-324 9.9e-324]]

4x4 matrix filled with zeros:
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]

4x4 matrix filled with ones
[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]

4x4 identity matrix
[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]

4x4 matrix filled with random values
[[0.32263067 0.6615064  0.7554921  0.12567354]
 [0.78957221 0.15694903 0.00148748 0.0875882 ]
 [0.60052177 0.14539078 0.27666647 0.16746284]
 [0.80331606 0.12192216 0.46427821 0.95304171]]

a vector sequence
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15]

now as matrix
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]]

same matrix:
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]]

a linspace vector:
[ 1.   1.5  2.   2.5  3.   3.5  4.   4.5  5.   5.5  6.   6.5  7.   7.5
  8.   8.5  9.   9.5 10. ]

a logspace vector:
[1.000e+00 2.000e+00 4.000e+00 8.000e+00 1.600e+01 3.200e+01 6.400e+01
 1.280e+02 2.560e+02 5.120e+02 1.024e+03]
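A common source of confusion when building sequences is how the end value is treated. The sketch below contrasts `np.arange`, which takes a step and excludes the stop value, with `np.linspace`, which takes a number of points and includes the stop value unless `endpoint=False` is passed. The variable names and the interval [0, 1] are illustrative choices.

```python
import numpy as np

# np.arange takes a step size and excludes the stop value.
r = np.arange(0, 1, 0.25)

# np.linspace takes a number of points and includes the stop value by default.
s = np.linspace(0, 1, 5)

# endpoint=False makes linspace exclude the stop value as well.
t = np.linspace(0, 1, 4, endpoint=False)

print(r)
print(s)
print(t)
```

Here `r` and `t` hold the same four values, while `s` has a fifth value equal to the stop.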
NumPy lets you manipulate and operate on arrays as a whole, unlike languages such as C++ or Java, where arrays are typically processed element by element.
This is similar to working with matrices in MATLAB; however, NumPy also makes broadcasting and element-wise operations easy, as shown in the following code.
a = np.array([[1, 3, -2], [-5, 4, 5]])
print(f"a=\n{a}")
b = np.array([[1, 1, 1], [2, 2, 2]])
### Broadcast operations ###
print(f"\na+1=\n{a + 1}")
print(f"\na/2=\n{a / 2.0}")
print(f"\na**2=\n{a ** 2}")
print(f"\nexp(a)=\n{np.exp(a)}")
### Some built-in operations ###
print(f"\nmean:\n{a.mean()}")
print(f"\nstd deviation:\n{a.std()}")
print(f"\nmax:\n{a.max()}")
print(f"\nmin:\n{a.min()}")
print(f"\nsum of elements:\n{a.sum()}")
a.sort(axis=1)  # sorts each row in place
print(f"\nsorted:\n{a}")

a=
[[ 1  3 -2]
 [-5  4  5]]

a+1=
[[ 2  4 -1]
 [-4  5  6]]

a/2=
[[ 0.5  1.5 -1. ]
 [-2.5  2.   2.5]]

a**2=
[[ 1  9  4]
 [25 16 25]]

exp(a)=
[[2.71828183e+00 2.00855369e+01 1.35335283e-01]
 [6.73794700e-03 5.45981500e+01 1.48413159e+02]]

mean:
1.0

std deviation:
3.5118845842842465

max:
5

min:
-5

sum of elements:
6

sorted:
[[-2  1  3]
 [-5  4  5]]
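The array `b` defined above is a natural candidate for showing element-wise operations between two arrays. The sketch below redefines `a` with its original, unsorted values, combines it with `b`, and then broadcasts a 1-D array across the rows of `a`. The names `elem_sum`, `elem_prod`, `row` and `shifted`, and the values in `row`, are illustrative assumptions.

```python
import numpy as np

a = np.array([[1, 3, -2], [-5, 4, 5]])
b = np.array([[1, 1, 1], [2, 2, 2]])

# Element-wise operations between two arrays of the same shape.
elem_sum = a + b
elem_prod = a * b   # element-wise product, not a matrix product

# Broadcasting: a 1-D array of length 3 is applied to every row of `a`.
row = np.array([10, 20, 30])   # hypothetical values for illustration
shifted = a + row

print(elem_sum)
print(elem_prod)
print(shifted)
```

Broadcasting works here because the length of `row` matches the last dimension of `a`; shapes that cannot be aligned this way raise a `ValueError`.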
Elements or subsets of NumPy arrays can be accessed using square brackets "[ ]". Unlike MATLAB, NumPy indexing starts at 0 and ends at n-1, where n is the length of the array along the indexed axis.
a = np.random.randint(0, 9, size=(5, 5))  # random integers from 0 to 8
print(f"a=\n{a}")
### Accessing rows ###
print(f"\nall elements in the second row of 'a':\n{a[1, :]}")
### Accessing columns ###
print(f"\nall elements in the third column of 'a':\n{a[:, 2]}")
### First element ###
print(f"\nfirst element:\n{a[0, 0]}")
### Last element ###
print(f"\nlast element:\n{a[-1, -1]}")
### 3x3 subset ###
print(f"\nmiddle 3x3 subset:\n{a[1:4, 1:4]}")
### Masking ###
print(f"\nelements greater than 5:\n{a[a > 5]}")

a=
[[0 8 6 0 5]
 [5 5 6 2 5]
 [0 4 8 5 4]
 [1 0 1 3 6]
 [8 1 8 0 6]]

all elements in the second row of 'a':
[5 5 6 2 5]

all elements in the third column of 'a':
[6 6 8 1 8]

first element:
0

last element:
6

middle 3x3 subset:
[[5 6 2]
 [4 8 5]
 [0 1 3]]

elements greater than 5:
[8 6 6 8 6 8 8 6]
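One detail worth knowing about the indexing styles above is whether the result shares memory with the original array. The sketch below, which uses a hypothetical array `m` with fixed values so the behaviour is reproducible, suggests that basic slicing gives a view while boolean masking gives a copy.

```python
import numpy as np

m = np.arange(25).reshape(5, 5)   # hypothetical 5x5 array with values 0..24

# Basic slicing returns a view: writing to it changes the original array.
block = m[1:4, 1:4]
block[0, 0] = -1
view_changed_original = (m[1, 1] == -1)

# Boolean masking returns a copy: writing to it leaves the original untouched.
large = m[m > 20]
large[0] = 0
copy_left_original = (m[4, 1] == 21)

# Assigning through the mask, however, modifies the original array in place.
m[m > 20] = 20

print(view_changed_original, copy_left_original, m.max())
```

If an independent sub-array is needed from a slice, calling `.copy()` on it avoids accidental changes to the original.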
Arrays can be concatenated with other arrays of compatible size. You can also split one array into two, or change its shape while keeping the same number of elements.
a = np.array([[1, 3, -2, 2], [-5, 4, 5, -1]])
print(f"a=\n{a}")
b = np.array([[1, 1, 1, 1], [2, 2, 2, 2]])
print(f"\nb=\n{b}")
print(f"\nConcatenate 'a','b' horizontally (side by side):\n{np.hstack((a, b))}")
print(f"\nConcatenate 'a','b' vertically (stacked):\n{np.vstack((a, b))}")
a_left, a_right = np.hsplit(a, 2)  # split into two blocks of columns
print(f"\nSplit 'a' horizontally:\n{a_left}\nand\n{a_right}")
b_top, b_bottom = np.vsplit(b, 2)  # split into two blocks of rows
print(f"\nSplit 'b' vertically:\n{b_top}\nand\n{b_bottom}")

a=
[[ 1  3 -2  2]
 [-5  4  5 -1]]

b=
[[1 1 1 1]
 [2 2 2 2]]

Concatenate 'a','b' horizontally (side by side):
[[ 1  3 -2  2  1  1  1  1]
 [-5  4  5 -1  2  2  2  2]]

Concatenate 'a','b' vertically (stacked):
[[ 1  3 -2  2]
 [-5  4  5 -1]
 [ 1  1  1  1]
 [ 2  2  2  2]]

Split 'a' horizontally:
[[ 1  3]
 [-5  4]]
and
[[-2  2]
 [ 5 -1]]

Split 'b' vertically:
[[1 1 1 1]]
and
[[2 2 2 2]]
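Changing the shape of an array while keeping the same number of elements, mentioned above, can be sketched with `reshape`. The same sketch also shows `np.concatenate`, whose `axis` argument covers both the horizontal and the vertical case. The result names (`a_4x2`, `a_flat`, `stacked_v`, `stacked_h`) are illustrative.

```python
import numpy as np

a = np.array([[1, 3, -2, 2], [-5, 4, 5, -1]])   # shape (2, 4), 8 elements
b = np.array([[1, 1, 1, 1], [2, 2, 2, 2]])

# Reshape to another shape with the same total number of elements.
a_4x2 = a.reshape(4, 2)
a_flat = a.reshape(-1)        # -1 lets NumPy infer the length (8 here)

# np.concatenate generalizes hstack/vstack through the axis argument.
stacked_v = np.concatenate((a, b), axis=0)   # same result as np.vstack((a, b))
stacked_h = np.concatenate((a, b), axis=1)   # same result as np.hstack((a, b))

print(a_4x2)
print(a_flat)
print(stacked_v.shape, stacked_h.shape)
```

A target shape with a different number of elements, such as `a.reshape(3, 3)`, raises a `ValueError`.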
# Basic linear algebra: transpose, inverse, matrix product and determinant
a = np.random.randint(0, 9, size=(5, 5))
print(f"a=\n{a}")
a_t = a.T
print(f"\na transpose\n{a_t}")
a_inv = np.linalg.inv(a)  # raises LinAlgError if 'a' happens to be singular
print(f"\na inverse\n{a_inv}")
mat_prod = np.matmul(a, a_inv)
print(f"\nmatrix product (a x a⁻¹ = identity)\n{np.round(mat_prod)}")
a_det = np.linalg.det(a)
print(f"\na determinant\n{a_det}")

a=
[[4 5 3 0 6]
 [0 6 2 6 3]
 [0 7 3 0 3]
 [0 0 0 3 8]
 [2 1 0 0 0]]

a transpose
[[4 0 0 0 2]
 [5 6 7 0 1]
 [3 2 3 0 0]
 [0 6 0 3 0]
 [6 3 3 8 0]]

a inverse
[[ 0.13392857  0.02678571 -0.15178571 -0.05357143  0.23214286]
 [-0.26785714 -0.05357143  0.30357143  0.10714286  0.53571429]
 [ 0.64880952  0.19642857 -0.44642857 -0.39285714 -1.29761905]
 [ 0.06349206  0.19047619 -0.19047619 -0.04761905 -0.12698413]
 [-0.02380952 -0.07142857  0.07142857  0.14285714  0.04761905]]

matrix product (a x a⁻¹ = identity)
[[ 1.  0. -0.  0.  0.]
 [-0.  1. -0.  0.  0.]
 [-0.  0.  1. -0.  0.]
 [-0.  0.  0.  1.  0.]
 [ 0.  0. -0. -0.  1.]]

a determinant
1008.0000000000002
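Because the matrix above is random, it can occasionally be singular, and rounding the product before printing hides how close it really is to the identity. The following sketch uses two small hypothetical matrices (`m` and `singular`, not taken from the tutorial) to show a tolerance-based check with `np.allclose` and the error NumPy raises when no inverse exists.

```python
import numpy as np

# Hypothetical, well-conditioned 2x2 matrix chosen for illustration.
m = np.array([[2.0, 1.0], [1.0, 3.0]])

m_det = np.linalg.det(m)          # 2*3 - 1*1 = 5, up to rounding error
m_inv = np.linalg.inv(m)

# np.allclose compares within a floating-point tolerance instead of
# rounding every entry to the nearest integer.
is_identity = np.allclose(m @ m_inv, np.eye(2))

# A singular matrix has no inverse; NumPy raises LinAlgError for it.
singular = np.array([[1.0, 2.0], [2.0, 4.0]])
try:
    np.linalg.inv(singular)
    inverted = True
except np.linalg.LinAlgError:
    inverted = False

print(m_det, is_identity, inverted)
```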
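Solve the following system of equations:
$$
\begin{aligned}
x_1 + 2x_2 + x_3 - x_4 &= 5 \\
3x_1 + 2x_2 + 4x_3 + 4x_4 &= 16 \\
4x_1 + 4x_2 + 3x_3 + 4x_4 &= 22 \\
2x_1 + x_3 + 5x_4 &= 15
\end{aligned}
$$
which can be represented with the following matrices:
$$
A = \begin{bmatrix} 1 & 2 & 1 & -1 \\ 3 & 2 & 4 & 4 \\ 4 & 4 & 3 & 4 \\ 2 & 0 & 1 & 5 \end{bmatrix}, \quad
b = \begin{bmatrix} 5 \\ 16 \\ 22 \\ 15 \end{bmatrix}
$$
where
$$Ax = b$$
so
$$x = A^{-1} b$$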
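To illustrate the technique without giving away the exercise, here is a minimal sketch on a smaller, hypothetical 2x2 system. It applies the formula above directly and also uses `np.linalg.solve`, which solves the system without forming the inverse explicitly. The system and the names `x_inv` and `x_solve` are assumptions made for this example.

```python
import numpy as np

# Hypothetical 2x2 system, not the exercise above:
#   2*x1 +   x2 = 5
#     x1 + 3*x2 = 10
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([5.0, 10.0])

x_inv = np.linalg.inv(A) @ b      # direct use of x = A^{-1} b
x_solve = np.linalg.solve(A, b)   # solves A x = b without forming the inverse

print(x_inv)
print(x_solve)
```

Both approaches should agree up to floating-point error; substituting the result back with `A @ x_solve` and comparing against `b` is a quick sanity check.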
###YOUR CODE
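The least-squares estimate of the coefficients β of a linear model y ≈ Xβ is given by the normal equation:
$$\beta = (X^T X)^{-1} X^T y$$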
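The formula above can be sketched on a tiny, hypothetical data set. The data below are noise-free and follow y = 2x + 1, so the estimate is easy to check by eye. The names `x_demo`, `y_demo` and `X_demo` are illustrative, and the column of ones in the design matrix is an assumption that the model includes an intercept.

```python
import numpy as np

# Hypothetical noise-free data from y = 2*x + 1.
x_demo = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_demo = 2.0 * x_demo + 1.0

# Design matrix: a column of ones (intercept) next to the x values.
X_demo = np.column_stack((np.ones_like(x_demo), x_demo))

# Normal equation, written as a linear solve instead of an explicit inverse.
beta = np.linalg.solve(X_demo.T @ X_demo, X_demo.T @ y_demo)

# np.linalg.lstsq is NumPy's built-in least-squares solver.
beta_lstsq, *_ = np.linalg.lstsq(X_demo, y_demo, rcond=None)

print(beta)        # intercept first, then slope
print(beta_lstsq)
```

Solving the linear system is generally preferred numerically over computing the inverse of XᵀX, although both express the same formula.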
import matplotlib.pyplot as plt
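x = np.linspace(0, 10, 100)  # 100 evenly spaced points from 0 to 10
y = 3.5*x + 5 + np.random.randn(100)  # line with slope 3.5 and intercept 5, plus standard normal noise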
plt.plot(x,y,'.')
plt.show()
#YOUR CODE