## An array looks like a Python list, but it can have different shapes

• That is why you see the square brackets
• An array is just a group of items of the same type
In :
import numpy as np
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline


## Importing into array / exporting array

• np.savetxt('./data/ar1.txt',ar1,delimiter=' ')

## Convert from pandas

• df.values
Tip: to convert an array to a Python list, use ar.tolist()

## Creating arrays from scratch

### 1D

• np.array([1,2,3, 10])
• np.zeros(how_many)
• np.ones(how_many)
• np.full(how_many, what)

### np.linspace

The arithmetic interval (step) is (end - start)/(n - 1).
For example, np.linspace(1, 10, 4) has a step of (10-1)/(4-1) = 3

### any dimension

np.arange(some_size).reshape(some_shape)

Tip: np.arange is similar to np.linspace, with the following 3 differences: 1. the end point is not included; 2. you specify the interval instead of the number of points; 3. np.linspace returns an array of floats, whereas np.arange returns integers when all of its arguments are integers
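The three differences in the tip can be checked side by side. This is a minimal sketch; the variable names (`pts`, `step`, `ints`, `floats`) are made up for illustration.

```python
import numpy as np

# linspace: you choose the number of points; both end points are included
pts = np.linspace(1, 10, 4)
step = (10 - 1) / (4 - 1)            # (end - start) / (n - 1)

# arange: you choose the step; the end point is excluded
ints = np.arange(1, 10, 3)
floats = np.arange(1.0, 10.0, 3.0)   # float arguments give a float array

print(pts)                                  # [ 1.  4.  7. 10.]
print(step)                                 # 3.0
print(ints)                                 # [1 4 7]
print(ints.dtype.kind, floats.dtype.kind)   # i f
```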

### np.random

• np.random.rand(n,m) returns an array of floats in [0, 1)
• np.random.rand(n,m)*k returns an array of floats in [0, k)
• np.random.randint(start, end, size=(n,m)) returns random integers from start up to, but not including, end. If we do not specify size=, it returns a single random integer; e.g. np.random.randint(0,10,size=(2,3))
• np.random.randn(n,m) returns array with floats from standard normal distribution with mean 0 and standard deviation of 1
• np.random.randn(n,m) * sigma + mu, returns array with floats shifted and scaled from the standard normal distribution
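A quick way to convince yourself of the ranges and of the `sigma`/`mu` scaling is to draw many samples and inspect them. The sketch below assumes a fixed seed and a sample size of 100,000, both chosen only for illustration.

```python
import numpy as np

np.random.seed(0)   # fixed seed so the run is repeatable (an arbitrary choice)

mu, sigma = 1.0, 10.0
samples = np.random.randn(100_000) * sigma + mu   # normal with mean mu, std sigma
uniform = np.random.rand(2, 3) * 100              # floats in [0, 100)
ints = np.random.randint(0, 10, size=(2, 3))      # integers 0..9, 10 is excluded

print(samples.mean(), samples.std())   # close to 1 and 10
print(uniform.min() >= 0, uniform.max() < 100)
print(ints.min() >= 0, ints.max() <= 9)
```

With only 10 samples the estimated standard deviation can be far from sigma; it settles down as the sample size grows.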

### Inspecting array

• ar.size | number of elements in ar
• ar.shape | dimensions of ar (rows,columns)
• ar.ndim | number of dimensions
• ar.dtype | type of elements in ar
• ar.astype(dtype) | Converts ar elements to type dtype

## Copying/sorting/reshaping

• np.copy(ar) | copies to new memory, i.e. deep copy.
             In comparison, ar2 = ar1 is not a real copy (it is called a shallow copy below): it just gives the original array another name, so changing ar1 will change ar2, and vice versa.
• ar.sort() | sorts in place along the last axis

• ar.sort(axis=0) | sorts in place down each column

• two_d_ar.flatten() | Flattens a 2D array to 1D
• ar.T | Transposes ar (rows become columns and vice versa)
• ar.reshape(3,4) | Reshapes ar to 3 rows, 4 columns without changing data
• ar.resize((5,6)) | Changes ar shape to 5x6 and fills new values with 0

• ar.view(dtype) | Creates view of ar elements with type dtype

Tip: b = ar.copy() is a deep copy (i.e. new memory). This is very different from a Python list: list.copy() is shallow, so a copied nested list is not a deep copy unless you use copy.deepcopy from the copy module.
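To check whether two names refer to the same data, `is` and `np.shares_memory` are handy. The following sketch contrasts a second name, a slice (which is a view) and a real copy; the names `alias`, `view` and `copy` are illustrative only.

```python
import numpy as np

ar = np.array([[2, 1], [0, 7]])
alias = ar           # same object, just a second name
view = ar[:, 0]      # a slice is a view onto the same memory
copy = ar.copy()     # independent memory

ar[0, 0] = 1000      # change the original

print(alias is ar)                    # True
print(np.shares_memory(ar, view))     # True
print(np.shares_memory(ar, copy))     # False
print(view[0], copy[0, 0])            # 1000 2
```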

## Combining/splitting

• np.concatenate((ar1,ar2),axis=0) | Adds ar2 as rows to the end of ar1
• np.concatenate((ar1,ar2),axis=1) | Adds ar2 as columns to end of ar1
• np.split(ar,3) | Splits ar into 3 equal sub-arrays
• np.hsplit(ar,5) | Splits ar column-wise into 5 equal sub-arrays (use np.hsplit(ar,[5]) to split at column index 5)

• np.append(arr,values) | Appends values to end of arr
• np.insert(arr,2,values) | Inserts values into arr before index 2
• np.delete(arr,3,axis=0) | Deletes row on index 3 of arr
• np.delete(arr,4,axis=1) | Deletes column on index 4 of arr

## Indexing/slicing/subsetting

• ar[5] | the element at index 5
• ar[2,5] | the element at row 2, column 5
• ar[1]=4 | Assigns the array element at index 1 the value 4
• ar[0:3] | Returns the elements at indices 0,1,2 (On a 2D array: returns rows 0,1,2)
• ar[0:3,4] | Returns the elements on rows 0,1,2 at column 4
• ar[:2] | Returns the elements at indices 0,1 (On a 2D array: returns rows 0,1)
• ar[:,1] | Returns the elements at index 1 on all rows
• ar<5 | Returns an array with boolean values
• (ar1<3) & (ar2>5) | Returns an array with boolean values
• ~ar | Inverts a boolean array
• ar[ar<5] | boolean indexing: returns the elements smaller than 5

## Importing into array / exporting array

#### txt file (data separated with space)

In :
#data1.txt just has 6 numbers separated by spaces, with a line break between the two rows
ar1

Out:
array([[1., 2., 3.],
[4., 5., 6.]])
In :
#save txt file
np.savetxt('./data/ar1.txt',ar1,delimiter=' ')

In :
#re-import it to verify it is working
ar1

Out:
array([[1., 2., 3.],
[4., 5., 6.]])
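The save and re-import round trip can be sketched end to end as follows. The temporary directory is an assumption made so the example does not depend on a `./data` folder; `np.loadtxt` is used for reading, which splits on whitespace by default.

```python
import os
import tempfile

import numpy as np

ar1 = np.array([[1., 2., 3.],
                [4., 5., 6.]])

# write to a throwaway folder instead of ./data (assumption for this sketch)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ar1.txt")
    np.savetxt(path, ar1, delimiter=" ")   # one text line per row
    loaded = np.loadtxt(path)              # whitespace-separated by default

print(np.array_equal(ar1, loaded))   # True
```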

## Convert from pandas

• df.values | Returns the DataFrame's data as a NumPy array
In :
df = pd.read_csv('./data/data2.csv', header=None)
df

Out:
0 1 2 3
0 6 2 3 1
1 4 5 4 3
In :
numpy_matrix = df.values
numpy_matrix

Out:
array([[6, 2, 3, 1],
[4, 5, 4, 3]], dtype=int64)
In :
df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
df

Out:
A B
0 1 3
1 2 4
In :
ar = df.values
ar

Out:
array([[1, 3],
[2, 4]], dtype=int64)

## Convert to Python list

• we already know np.array converts a Python list to a NumPy array
• ar.tolist() | Converts ar to a Python list
In :
ar = np.array([[1,100],[2,1]])
ar

Out:
array([[  1, 100],
[  2,   1]])
In :
l = ar.tolist()
l

Out:
[[1, 100], [2, 1]]
In :
type(l)

Out:
list

## From Scratch

### 1D

• np.array([1,2,3, 10])
• np.zeros(how_many)
• np.ones(how_many)
• np.full(how_many, what)
In :
np.array([1,7,90,1000.])

Out:
array([   1.,    7.,   90., 1000.])
In :
np.zeros(100)

Out:
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
In :
np.ones(10)

Out:
array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])
In :
np.full((1),1)

Out:
array([1])

### np.array([(), ()]), or np.array([[ , , ]])

In :
np.zeros((10,10))

Out:
array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])
In :
np.ones((10,10))

Out:
array([[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]])
In :
np.full((2,3),1000.1)

Out:
array([[1000.1, 1000.1, 1000.1],
[1000.1, 1000.1, 1000.1]])
In :
np.eye(3)

Out:
array([[1., 0., 0.],
[0., 1., 0.],
[0., 0., 1.]])
In :
np.array([(1,2),(3,4)])

Out:
array([[1, 2],
[3, 4]])
In :
np.array([(1,7),(6,9)]).shape

Out:
(2, 2)
In :
np.array([(1,2),(3,0)])

Out:
array([[1, 2],
[3, 0]])
In :
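# rows of unequal length cannot form a 2D array; with dtype=object NumPy stores the
# two lists in a 1-D array (recent NumPy versions raise ValueError without dtype=object)
np.array([[1,2,3,4],[1,4,3]], dtype=object).shape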

Out:
(2,)
In :
np.array([[1,2,3,4],[5,6,7,8]]).shape

Out:
(2, 4)
In :
np.array([[1,2,3,4]])

Out:
array([[1, 2, 3, 4]])
In :
np.array([(1,2),(3,4)])

Out:
array([[1, 2],
[3, 4]])
In :
# nested lists result in multi-dimensional arrays
np.array([range(i, i + 3) for i in [1, 3, 1]])

Out:
array([[1, 2, 3],
[3, 4, 5],
[1, 2, 3]])

## np.linspace

In :
#np.linspace arithmetic interval = (end - start)/(n - 1).
# Here, (2-1)/(4-1) = 0.333...
np.linspace(1,2,4)

Out:
array([1.        , 1.33333333, 1.66666667, 2.        ])
In :
#interval = (11-1)/(6-1)= 2
np.linspace(1,11,6)

Out:
array([ 1.,  3.,  5.,  7.,  9., 11.])

### np.arange

In :
# np.arange is similar to np.linspace, with the following 3 differences:
# 1. the end point is not included;  2. you specify the interval; 3. np.linspace returns an array of floats, whereas np.arange returns integers when its arguments are integers
# in general, I prefer np.arange instead of np.linspace
np.arange(0,10,3)

Out:
array([0, 3, 6, 9])

### np.random

• np.random.rand(n,m) returns an array of floats in [0, 1)
• np.random.rand(n,m)*k returns an array of floats in [0, k)
• np.random.randint(start, end, size=(n,m)) returns random integers from start up to, but not including, end. If we do not specify size=, it returns a single random integer; e.g. np.random.randint(0,10,size=(2,3))
• np.random.randn(n,m) returns array with floats from standard normal distribution with mean 0 and standard deviation of 1
• np.random.randn(n,m) * sigma + mu, returns array with floats shifted and scaled from the standard normal distribution
In :
np.random.rand(2,3) # 2X3 array of random floats between 0–1

Out:
array([[0.57256735, 0.50921722, 0.11257014],
[0.17108578, 0.62249645, 0.52027419]])
In :
np.random.rand(2,3)*100 # 2X3 array of random floats between 0–100

Out:
array([[80.97626817, 84.65068553, 28.65173649],
[62.53292734, 71.53976566, 47.46193707]])
In :
np.random.randint(2,100)

Out:
66
In :
ar = np.random.randn(10)*10+1
plt.hist(ar, bins = np.arange(-30,31,3))

Out:
(array([0., 0., 0., 0., 0., 0., 0., 4., 2., 2., 0., 1., 0., 1., 0., 0., 0.,
0., 0., 0.]),
array([-30, -27, -24, -21, -18, -15, -12,  -9,  -6,  -3,   0,   3,   6,
9,  12,  15,  18,  21,  24,  27,  30]),
<a list of 20 Patch objects>)
In :
ar.std() # Returns the standard deviation of the array elements along given axis.

Out:
5.758782594543135
In :
ar = np.random.randn(1000)*10+1

In :
plt.hist(ar, bins = np.arange(-30,31,3))

Out:
(array([  1.,   5.,   8.,  20.,  30.,  33.,  72.,  82.,  93., 118., 120.,
106., 117.,  81.,  52.,  28.,  13.,  15.,   5.,   1.]),
array([-30, -27, -24, -21, -18, -15, -12,  -9,  -6,  -3,   0,   3,   6,
9,  12,  15,  18,  21,  24,  27,  30]),
<a list of 20 Patch objects>)
In :
ar.std()

Out:
9.75399945467625

## Inspecting array

• ar.size | Returns number of elements in array
• ar.shape | Returns dimensions of array (rows,columns)
• ar.dtype | Returns type of elements in array
• ar.astype(dtype) | Converts ar elements to type dtype

In :
ar = np.random.randint(0,10, size=(2,2))
ar

Out:
array([[2, 1],
[0, 7]])
In :
ar.size

Out:
4
In :
ar.shape

Out:
(2, 2)
In :
ar.ndim

Out:
2
In :
ar.dtype

Out:
dtype('int32')
In :
ar.astype('float')

Out:
array([[2., 1.],
[0., 7.]])

## Copying/sorting/reshaping

• np.copy(ar) | copies to new memory
• ar.sort() | sorts in place along the last axis
• ar.sort(axis=0) | sorts in place down each column
• two_d_ar.flatten() | Flattens a 2D array to 1D
• ar.T | Transposes ar (rows become columns and vice versa)
• ar.reshape(3,4) | Reshapes ar to 3 rows, 4 columns without changing data
• ar.resize((5,6)) | Changes ar shape to 5x6 and fills new values with 0

• ar.view(dtype) | Creates view of ar elements with type dtype
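The reshaping entries in this list can be tried on a small array. This sketch assumes a 12-element array made with `np.arange`; it also shows that `reshape` returns a view here, while `flatten` always returns a copy.

```python
import numpy as np

ar = np.arange(12)
m = ar.reshape(3, 4)     # 3 rows, 4 columns; shares memory with ar here
t = m.T                  # transpose: shape (4, 3)
flat = m.flatten()       # 1-D copy of the data

m[0, 0] = 99             # writes through to ar, because m is a view
flat[1] = -1             # does not touch m or ar

print(m.shape, t.shape, flat.shape)   # (3, 4) (4, 3) (12,)
print(ar[0], ar[1])                   # 99 1
```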

In :
ar

Out:
array([[2, 1],
[0, 7]])
In :
b = ar.copy()
b

Out:
array([[2, 1],
[0, 7]])

#### shallow copy

In :
c =ar
c

Out:
array([[2, 1],
[0, 7]])
In :
ar[0,0]=1000
ar

Out:
array([[1000,    1],
[   0,    7]])
In :
# c is changed too, because c is just another name for ar
c

Out:
array([[1000,    1],
[   0,    7]])

#### deep copy (note: this would have been a shallow copy if ar were a Python list and not a NumPy array)

In :
b = ar.copy()

In :
ar[0,0] = 0
ar

Out:
array([[0, 1],
[0, 7]])
In :
b

Out:
array([[1000,    1],
[   0,    7]])
In :
c

Out:
array([[0, 1],
[0, 7]])

#### Note that a shallow copy acts as if it is the original, except that it has another name.

Changing the shallow copy will change the original array.

In :
c[0,1] = 999
c

Out:
array([[  0, 999],
[  0,   7]])
In :
ar #note that its [0,1] element is also changed!

Out:
array([[  0, 999],
[  0,   7]])

a.sort(axis=-1, kind='quicksort', order=None)
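The notebook leaves the sorting cell empty, so here is a minimal sketch of the signature above in use. It assumes small example arrays and also shows `np.sort`, which returns a sorted copy, and `np.argsort`, which returns the sorting indices.

```python
import numpy as np

a = np.array([[3, 1, 2],
              [9, 7, 8]])
by_row = np.sort(a, axis=-1)    # sorted copy; a itself is unchanged

b = np.array([[3, 9],
              [1, 7]])
b.sort(axis=0)                  # in place: sorts down each column

order = np.argsort(np.array([30, 10, 20]))   # indices that would sort the array

print(by_row)   # [[1 2 3]
                #  [7 8 9]]
print(b)        # [[1 7]
                #  [3 9]]
print(order)    # [1 2 0]
```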


## Combining/splitting

• np.concatenate((ar1,ar2),axis=0) | Adds ar2 as rows to the end of ar1
• np.concatenate((ar1,ar2),axis=1) | Adds ar2 as columns to end of ar1
• np.split(ar,3) | Splits ar into 3 equal sub-arrays
• np.hsplit(ar,5) | Splits ar column-wise into 5 equal sub-arrays (use np.hsplit(ar,[5]) to split at column index 5)

In :
dr = np.concatenate((ar,ar,ar,ar),axis=0)
np.concatenate((dr,dr,dr,dr), axis=1)

Out:
array([[10, 10, 10, 10, 10, 10, 10, 10],
[ 8,  6,  8,  6,  8,  6,  8,  6],
[10, 10, 10, 10, 10, 10, 10, 10],
[ 8,  6,  8,  6,  8,  6,  8,  6],
[10, 10, 10, 10, 10, 10, 10, 10],
[ 8,  6,  8,  6,  8,  6,  8,  6],
[10, 10, 10, 10, 10, 10, 10, 10],
[ 8,  6,  8,  6,  8,  6,  8,  6]])
In :
np.split(ar,2)

Out:
[array([[10, 10]]), array([[8, 6]])]
In :
np.hsplit(ar,2)

Out:
[array([[10],
[ 8]]), array([[10],
[ 6]])]
In :
np.concatenate(np.hsplit(ar,2), axis=1)

Out:
array([[10, 10],
[ 8,  6]])

• np.append(arr,values) | Appends values to end of array
• np.insert(arr,2,values) | Inserts values into arr before index 2
• np.delete(arr,3,axis=0) | Deletes row on index 3 of arr
• np.delete(arr,4,axis=1) | Deletes column on index 4 of arr

In :
ar

Out:
array([[10, 10],
[ 8,  6]])
In :
#if you don't specify axis, then the result will be flattened
np.append(ar,[5,5])

Out:
array([10, 10,  8,  6,  5,  5])
In :
#if you specify axis, the values must have the same number of dimensions as ar and the same shape except along that axis
np.append(ar,[[5,5],[5,5]], axis=0)

Out:
array([[10, 10],
[ 8,  6],
[ 5,  5],
[ 5,  5]])
Tip: If you don't specify axis, then the result from np.append will be flattened. If you specify axis, the values must have the same number of dimensions as the array and the same shape except along that axis.
In :
np.append(ar,[[5,5],[5,5]], axis=1)

Out:
array([[10, 10,  5,  5],
[ 8,  6,  5,  5]])
In :
a = np.array([[1, 1], [2, 2], [3, 3]])
a

Out:
array([[1, 1],
[2, 2],
[3, 3]])
Tip: np.insert(arr, obj, values, axis=None). obj is the index or indices before which values are inserted.
In :
#np.insert(arr, obj, values, axis=None)
# obj: the index or indices before which values are inserted.
# in this example, 0 means index 0 of the flattened array, because no axis is given
np.insert(a, 0, 5)

Out:
array([5, 1, 1, 2, 2, 3, 3])
In :
np.insert(a, -1, 5)

Out:
array([1, 1, 2, 2, 3, 5, 3])
In :
np.array([1,2,3]).shape

Out:
(3,)
In :
np.array([[1],[2],[3]]).shape

Out:
(3, 1)
In :
# in this example, [0] means to insert the (3, 1) column as the very first column
np.insert(a, [0], [[1],[2],[3]], axis=1)

Out:
array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])
In :
# in this example, [1] means to insert the (3, 1) column as the second column
np.insert(a, [1], [[1],[2],[3]], axis=1)

Out:
array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])
In :
a

Out:
array([[1, 1],
[2, 2],
[3, 3]])
In :
b = a.flatten()
b

Out:
array([1, 1, 2, 2, 3, 3])
In :
np.insert(b, slice(2, 4), [5, 6])

Out:
array([1, 1, 5, 2, 6, 2, 3, 3])
In :
np.arange(8)

Out:
array([0, 1, 2, 3, 4, 5, 6, 7])
In :
x = np.arange(8).reshape(2, 4)
x

Out:
array([[0, 1, 2, 3],
[4, 5, 6, 7]])
In :
idx = (1, 3) #with axis=1 these are column indices: 999 is inserted before column 1 and before column 3
np.insert(x, idx, 999, axis=1)

Out:
array([[  0, 999,   1,   2, 999,   3],
[  4, 999,   5,   6, 999,   7]])
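The list above also mentions np.delete, which has no example in this notebook. A minimal sketch, reusing the same 2x4 array (`no_row`, `no_cols` and `flat` are illustrative names):

```python
import numpy as np

x = np.arange(8).reshape(2, 4)

no_row = np.delete(x, 0, axis=0)         # drop row 0
no_cols = np.delete(x, [1, 3], axis=1)   # drop columns 1 and 3
flat = np.delete(x, 0)                   # no axis: x is flattened first

print(no_row)    # [[4 5 6 7]]
print(no_cols)   # [[0 2]
                 #  [4 6]]
print(flat)      # [1 2 3 4 5 6 7]
```

Like np.append and np.insert, np.delete returns a new array and leaves x unchanged.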

## Indexing/slicing/subsetting

• ar[5] | Returns the element at index 5
• ar[2,5] | Returns the 2D array element at row 2, column 5
• ar[1]=4 | Assigns the array element at index 1 the value 4
• ar[1,3]=10 | Assigns the array element at row 1, column 3 the value 10
• ar[0:3] | Returns the elements at indices 0,1,2 (On a 2D array: returns rows 0,1,2)
• ar[0:3,4] | Returns the elements on rows 0,1,2 at column 4
• ar[:2] | Returns the elements at indices 0,1 (On a 2D array: returns rows 0,1)
• ar[:,1] | Returns the elements at index 1 on all rows
• ar<5 | Returns an array with boolean values
• (ar1<3) & (ar2>5) | Returns an array with boolean values
• ~ar | Inverts a boolean array
• ar[ar<5] | Returns array elements smaller than 5
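The indexing forms in this list can be exercised on one small array. The sketch below assumes a 3x4 array of the integers 0 to 11, chosen only so the results are easy to read.

```python
import numpy as np

ar = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

print(ar[1, 3])       # 7   (row 1, column 3)
print(ar[:, 1])       # [1 5 9]   (column 1 on all rows)
print(ar[0:3, 2])     # [ 2  6 10]

mask = ar < 5                      # boolean array, same shape as ar
small = ar[mask]                   # [0 1 2 3 4]
between = ar[(ar > 2) & (ar < 6)]  # [3 4 5]
rest = ar[~mask]                   # everything that is not < 5

ar[1, 3] = 10         # assignment through an index
```

Note the parentheses around each comparison: `&` binds more tightly than `<` and `>`.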

## More examples

In :
#array of 10 zeros
np.zeros(10, dtype=int)

Out:
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
In :
np.zeros((2,3),dtype=float)

Out:
array([[0., 0., 0.],
[0., 0., 0.]])
In :
np.full((10,10),1)

Out:
array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])
In :
np.ones((10,10), dtype=int)

Out:
array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])

### Ex. 1 long - wide

• long duplicates itself horizontally to match wide's width
• wide duplicates itself vertically to match long's height
In :
X = np.ones((2,1), dtype=int)
X

Out:
array([[1],
[1]])
In :
Y = np.ones((1,2), dtype=int)
Y

Out:
array([[1, 1]])
In :
X - Y

Out:
array([[0, 0],
[0, 0]])

### Ex. 2 long - wide

• long duplicates itself horizontally to match wide's width
• wide duplicates itself vertically to match long's height
In :
X = np.full((2,1),2)
X

Out:
array([[2],
[2]])
In :
Y = np.full((1,2),1)
Y

Out:
array([[1, 1]])
In :
X - Y

Out:
array([[1, 1],
[1, 1]])

### Ex. 3 long - wide

• long duplicates itself horizontally to match wide's width
• wide duplicates itself vertically to match long's height
In :
X = np.array([[1],[2]])
X

Out:
array([[1],
[2]])
In :
Y = np.array([[2, 1]])
Y

Out:
array([[2, 1]])
In :
X -Y

Out:
array([[-1,  0],
[ 0,  1]])
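To see the "duplication" explicitly, np.broadcast_arrays returns both operands stretched to the common shape. This is a small sketch with a column X and a row Y like the ones in the examples above; `Xb` and `Yb` are illustrative names.

```python
import numpy as np

X = np.array([[1], [2]])     # "long": shape (2, 1)
Y = np.array([[2, 1]])       # "wide": shape (1, 2)

Xb, Yb = np.broadcast_arrays(X, Y)   # both stretched to shape (2, 2)

print(Xb)   # [[1 1]
            #  [2 2]]
print(Yb)   # [[2 1]
            #  [2 1]]
print(np.array_equal(X - Y, Xb - Yb))   # True
```

NumPy does not actually copy the data when it broadcasts; the stretched arrays are views, which is why broadcasting is cheap.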

## Tricking it into doing something with every row

Think of each row of X as a point in 2-D space (like a sheet of paper). If we want the distance of each point from all the points, including itself, we first need every pairwise difference between rows.

To get that, we reshape a copy of X into 3-D space, which we call Y. When we take the difference between them, X will be duplicated along the extra dimension.

The trick is that we do not reshape Y to (2,2,1). Rather, we reshape Y to (2,1,2).

Broadcasting lines the shapes up from the right: Y is (2,1,2) and X is (2,2), which is treated as (1,2,2). In the middle dimension Y has size 1 and X has size 2, so Y has to duplicate itself to become (2,2,2).

In the first dimension X has size 1, so X has to duplicate itself once for each row of Y.

In :
X = np.array([[1,0],
[2,1]])
X

Out:
array([[1, 0],
[2, 1]])
In :
Y = X.reshape(2,1,2)
Y

Out:
array([[[1, 0]],

[[2, 1]]])
In :
#[[ 0,  0],[-1, -1]] = [[1, 0]] - [[1, 0],[2, 1]]
#[[ 1,  1],[ 0,  0]] = [[2, 1]] - [[1, 0],[2, 1]]
Y-X

Out:
array([[[ 0,  0],
[-1, -1]],

[[ 1,  1],
[ 0,  0]]])
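Once the pairwise differences are available, the distances mentioned at the start of this section follow by squaring, summing over the last axis and taking the square root. This is a sketch under the assumption that Euclidean distance is what is wanted; `diff` and `dist` are illustrative names.

```python
import numpy as np

X = np.array([[1, 0],
              [2, 1]])

diff = X.reshape(2, 1, 2) - X              # diff[i, j] = X[i] - X[j], shape (2, 2, 2)
dist = np.sqrt((diff ** 2).sum(axis=-1))   # dist[i, j] = distance between points i and j

print(dist.shape)   # (2, 2)
print(dist)         # zeros on the diagonal, sqrt(2) off the diagonal
```

For an array with any number of rows, `X[:, np.newaxis, :] - X` does the same reshape without hard-coding the sizes.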

#### Let's check to see if we can replicate what numpy did

In :
np.array([[1, 0]]) - np.array([[1, 0],[2, 1]])

Out:
array([[ 0,  0],
[-1, -1]])
In :
np.array([[2, 1]]) - np.array([[1, 0],[2, 1]])

Out:
array([[1, 1],
[0, 0]])
In :
# stack the two 2x2 results along a new first axis to rebuild Y-X
np.stack((np.array([[1, 0]]) - np.array([[1, 0],[2, 1]]), np.array([[2, 1]]) - np.array([[1, 0],[2, 1]])))

Out:
array([[[ 0,  0],
[-1, -1]],

[[ 1,  1],
[ 0,  0]]])