NumPy is a Python library for working with arrays.
It also provides functions for linear algebra, Fourier transforms, and matrices.
NumPy was created in 2005 by Travis Oliphant. It is an open source project and you can use it freely.
NumPy stands for Numerical Python.
NumPy is also very fast, because it has bindings to C libraries. For more info on why you would want to use arrays instead of lists, check out this great
It is highly recommended that you install Python using the Anaconda distribution, so that all underlying dependencies (such as linear algebra libraries) stay in sync when you use conda install. If you have Anaconda, install NumPy by going to your terminal or command prompt and typing:
conda install numpy
If you do not have Anaconda and cannot install it, please refer to NumPy's official documentation for other installation instructions.
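Whichever installation route you take, a quick way to confirm it worked is to import the library and print its version. This is only a small sanity check; the version string you see depends on your environment.

```python
import numpy as np

# If the import succeeds, NumPy is installed; the version string varies by setup
installed_version = np.__version__
print(installed_version)
```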
my_list= [9,8,7]
my_list
[9, 8, 7]
import numpy as np
np.array(my_list)
array([9, 8, 7])
list1=[1,2,3,4,5]
np.array(list1)
array([1, 2, 3, 4, 5])
matrix=[[1,2,3],[4,5,6],[6,7,7]]
print(matrix)
print ('\n')
print (np.array(matrix))
[[1, 2, 3], [4, 5, 6], [6, 7, 7]]


[[1 2 3]
 [4 5 6]
 [6 7 7]]
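As a small follow-up sketch, the array built from a nested list can be inspected to see how NumPy interpreted it. The names nested and arr_2d are chosen here for illustration only.

```python
import numpy as np

nested = [[1, 2, 3], [4, 5, 6], [6, 7, 7]]
arr_2d = np.array(nested)

print(arr_2d.ndim)   # 2, the number of dimensions
print(arr_2d.shape)  # (3, 3), rows and columns
print(arr_2d.size)   # 9, the total number of elements
```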
There are lots of built-in ways to generate arrays.
arr = np.array([1, 2, 3, 4, 5])
print(arr)
print(type(arr))
[1 2 3 4 5]
<class 'numpy.ndarray'>
np.arange(0,20)
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19])
np.arange(0,50,3)
array([ 0, 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36, 39, 42, 45, 48])
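The arange calls above follow the pattern start, stop, step, where the stop value is excluded, much like Python's built-in range. A minimal sketch with illustrative variable names:

```python
import numpy as np

# The stop value (10) is excluded from the result
evens = np.arange(0, 10, 2)
print(evens.tolist())      # [0, 2, 4, 6, 8]

# A negative step counts downward; the stop value (0) is still excluded
countdown = np.arange(5, 0, -1)
print(countdown.tolist())  # [5, 4, 3, 2, 1]
```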
Return evenly spaced numbers over a specified interval.
np.linspace (0,50,6)
array([ 0., 10., 20., 30., 40., 50.])
np.linspace (0,10,40)
array([ 0. , 0.25641026, 0.51282051, 0.76923077, 1.02564103, 1.28205128, 1.53846154, 1.79487179, 2.05128205, 2.30769231, 2.56410256, 2.82051282, 3.07692308, 3.33333333, 3.58974359, 3.84615385, 4.1025641 , 4.35897436, 4.61538462, 4.87179487, 5.12820513, 5.38461538, 5.64102564, 5.8974359 , 6.15384615, 6.41025641, 6.66666667, 6.92307692, 7.17948718, 7.43589744, 7.69230769, 7.94871795, 8.20512821, 8.46153846, 8.71794872, 8.97435897, 9.23076923, 9.48717949, 9.74358974, 10. ])
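Unlike arange, linspace takes the number of points rather than a step, and it includes the end value by default. The sketch below uses the retstep and endpoint parameters of np.linspace to make the spacing visible; the variable names are illustrative.

```python
import numpy as np

# retstep=True also returns the spacing between consecutive points
points, step = np.linspace(0, 50, 6, retstep=True)
print(points.tolist())     # [0.0, 10.0, 20.0, 30.0, 40.0, 50.0]
print(step)                # 10.0

# endpoint=False leaves the stop value out, like arange does
half_open = np.linspace(0, 50, 5, endpoint=False)
print(half_open.tolist())  # [0.0, 10.0, 20.0, 30.0, 40.0]
```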
Generate arrays of zeros or ones
np.zeros(9)
array([0., 0., 0., 0., 0., 0., 0., 0., 0.])
np.zeros((3,4))
array([[0., 0., 0., 0.], [0., 0., 0., 0.], [0., 0., 0., 0.]])
np.ones(4)
array([1., 1., 1., 1.])
np.ones((3,4))
array([[1., 1., 1., 1.], [1., 1., 1., 1.], [1., 1., 1., 1.]])
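By default zeros and ones produce floats, which is why the outputs above show a trailing dot. As a short sketch, the dtype argument changes that, and np.full fills an array with any constant value.

```python
import numpy as np

# Ask for integers instead of the default floats
int_zeros = np.zeros((2, 3), dtype=int)
print(int_zeros.tolist())  # [[0, 0, 0], [0, 0, 0]]

# np.full builds an array filled with an arbitrary constant
sevens = np.full((2, 2), 7)
print(sevens.tolist())     # [[7, 7], [7, 7]]
```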
Creates an identity matrix
np.eye(5)
array([[1., 0., 0., 0., 0.], [0., 1., 0., 0., 0.], [0., 0., 1., 0., 0.], [0., 0., 0., 1., 0.], [0., 0., 0., 0., 1.]])
np.eye(4,5)
array([[1., 0., 0., 0., 0.], [0., 1., 0., 0., 0.], [0., 0., 1., 0., 0.], [0., 0., 0., 1., 0.]])
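To illustrate why the identity matrix matters, the sketch below checks that multiplying a matrix by it leaves the matrix unchanged, and shows the k parameter of np.eye, which shifts the diagonal. The small matrix a is made up for the example.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
identity = np.eye(2)

# Matrix multiplication by the identity returns the same values
unchanged = np.array_equal(a @ identity, a)
print(unchanged)  # True

# k=1 places the ones on the diagonal just above the main one
upper = np.eye(3, k=1)
print(upper.tolist())  # [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]
```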
Create an array of the given shape and populate it with random samples from a uniform distribution over [0, 1).
np.random.rand(3)
array([0.7489597 , 0.78917258, 0.70866872])
np.random.rand(4,4)
array([[0.54892249, 0.34257905, 0.0378045 , 0.94850873], [0.64952372, 0.60916155, 0.02955791, 0.22780147], [0.23216998, 0.41829629, 0.53139821, 0.33340072], [0.51921418, 0.72563794, 0.38172859, 0.32186366]])
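Because these values are random, your output will differ from the one shown. If you need repeatable numbers, one option is a seeded generator. This sketch assumes NumPy 1.17 or later, where np.random.default_rng is available; the seed 42 is arbitrary.

```python
import numpy as np

# Two generators created with the same seed produce the same samples
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

sample_a = rng_a.random((2, 3))  # uniform over [0, 1), like rand
sample_b = rng_b.random((2, 3))

print(np.array_equal(sample_a, sample_b))  # True
```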
Return a sample (or samples) from the "standard normal" distribution, unlike rand, which samples from a uniform distribution:
np.random.randn(2)
array([ 0.38705145, -0.00165486])
np.random.randn(2,3)
array([[ 0.28120086, 0.68961856, -1.08746915], [-0.8188998 , -0.94987156, 0.39145483]])
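"Standard normal" means a mean of 0 and a standard deviation of 1. A rough way to see this is to draw many samples and compute both statistics. The sketch below uses a seeded generator (assuming NumPy 1.17 or later) so the run is repeatable; the sample size is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(0)
samples = rng.standard_normal(100_000)

# With this many samples, both values should land close to 0 and 1
sample_mean = samples.mean()
sample_std = samples.std()
print(round(sample_mean, 1), round(sample_std, 1))
```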
Return random integers from low (inclusive) to high (exclusive).
np.random.randint(1,7)
1
np.random.randint(1,100,10)
array([53, 1, 5, 95, 28, 22, 76, 12, 45, 61])
np.random.randint (1,111)
15
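The exclusive upper bound is easy to forget. As a small sketch, simulating rolls of a six-sided die needs high=7, and the results never reach 7.

```python
import numpy as np

# low=1 is included, high=7 is excluded, so values fall in 1..6
rolls = np.random.randint(1, 7, size=1000)

print(rolls.min() >= 1, rolls.max() <= 6)  # True True
```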
Let's look at some of the main attributes and methods of an array.
arr = np.arange(50)
ranarr=np.random.randint (0,100,10)
arr
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49])
ranarr
array([60, 44, 98, 38, 27, 97, 75, 53, 10, 40])
Convert an existing array into a new shape with the same data.
arr.reshape(5,10)
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [10, 11, 12, 13, 14, 15, 16, 17, 18, 19], [20, 21, 22, 23, 24, 25, 26, 27, 28, 29], [30, 31, 32, 33, 34, 35, 36, 37, 38, 39], [40, 41, 42, 43, 44, 45, 46, 47, 48, 49]])
arr.reshape (2,25)
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24], [25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49]])
arr1= np.arange(24)
arr1
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23])
arr1.reshape(3,2,4)
array([[[ 0, 1, 2, 3], [ 4, 5, 6, 7]], [[ 8, 9, 10, 11], [12, 13, 14, 15]], [[16, 17, 18, 19], [20, 21, 22, 23]]])
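Two details of reshape are worth a quick sketch: a dimension given as -1 is inferred from the array's size, and a shape whose element count does not match raises a ValueError.

```python
import numpy as np

arr = np.arange(24)

# -1 asks NumPy to work out this dimension: 24 / 4 = 6
auto = arr.reshape(4, -1)
print(auto.shape)  # (4, 6)

# 24 elements cannot fill a 5x5 array, so this raises ValueError
try:
    arr.reshape(5, 5)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("reshape failed:", err)
```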
These are useful methods for finding the max or min values, or for finding their index locations using argmax or argmin.
arr
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49])
arr.min()
arr.max()
49
ranarr
array([60, 44, 98, 38, 27, 97, 75, 53, 10, 40])
ranarr.max()
98
ranarr.argmax()
2
ranarr.argmin()
8
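On a multi-dimensional array, argmax returns an index into the flattened data unless an axis is given. The sketch below, using a made-up 2x3 array named grid, shows both cases and converts the flat index back to a row and column with np.unravel_index.

```python
import numpy as np

grid = np.array([[3, 9, 1],
                 [7, 2, 8]])

# Without an axis, the index refers to the flattened array
flat_idx = grid.argmax()
print(flat_idx)  # 1

# Convert the flat index to (row, column) coordinates
row, col = np.unravel_index(flat_idx, grid.shape)
print(int(row), int(col))  # 0 1

# With axis=0, the row index of the maximum is reported for each column
per_col = grid.argmax(axis=0)
print(per_col.tolist())  # [1, 0, 1]
```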
The shape property is usually used to get the current shape of an array, but may also be used to reshape the array in-place by assigning a tuple of array dimensions to it.
# Shape is an attribute that arrays have
# Vector
arr.shape
(50,)
# Notice the two sets of brackets in the output
arr.reshape(10,5)
array([[ 0, 1, 2, 3, 4], [ 5, 6, 7, 8, 9], [10, 11, 12, 13, 14], [15, 16, 17, 18, 19], [20, 21, 22, 23, 24], [25, 26, 27, 28, 29], [30, 31, 32, 33, 34], [35, 36, 37, 38, 39], [40, 41, 42, 43, 44], [45, 46, 47, 48, 49]])
arr.reshape (1,50)
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49]])
arr.reshape(50,1).shape
(50, 1)
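Note that the reshape calls above do not change arr itself. As a minimal sketch, reshape returns a new array object and the original keeps its shape.

```python
import numpy as np

arr = np.arange(6)
reshaped = arr.reshape(2, 3)

print(arr.shape)       # (6,), the original is untouched
print(reshaped.shape)  # (2, 3)
```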
# You can find the data type of the objects in the array
arr.dtype
dtype('int32')
A data type object (an instance of numpy.dtype class) describes how the bytes in the fixed-size block of memory corresponding to an array item should be interpreted. It describes the following aspects of the data:
Type of the data (integer, float, Python object, etc.)
Size of the data (how many bytes is in e.g. the integer)
Byte order of the data (little-endian or big-endian)
If the data type is structured, an aggregate of other data types, (e.g., describing an array item consisting of an integer and a float), what are the names of the “fields” of the structure, by which they can be accessed, what is the data-type of each field, and which part of the memory block each field takes.
If the data type is a sub-array, what is its shape and data type.
https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.dtypes.html
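Several of these aspects can be read directly from a dtype object. The following is a small sketch; the field names id and score in the structured type are invented for the example.

```python
import numpy as np

ints = np.array([1, 2, 3], dtype=np.int32)
print(ints.dtype.kind)      # i, meaning signed integer
print(ints.dtype.itemsize)  # 4, bytes per element

# astype returns a converted copy with the requested data type
floats = ints.astype(np.float64)
print(floats.dtype.itemsize)  # 8

# A structured data type with two named fields (hypothetical names)
record_type = np.dtype([("id", np.int32), ("score", np.float64)])
print(record_type.names)  # ('id', 'score')
```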
Next, we will explore how to grab particular elements of an array.
arr
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49])
The simplest way to pick one or several elements of an array looks very similar to Python lists:
# Get the value at the index 24
arr[24]
24
# Get the value in the range
print (arr[4:24])
print ('\n')
print (arr[0:10])
[ 4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]


[0 1 2 3 4 5 6 7 8 9]
This is where NumPy arrays differ from normal Python lists: they can broadcast a single value across a whole slice.
# Setting a value over an index range (broadcasting)
arr[0:10]=88
# Show
arr
array([88, 88, 88, 88, 88, 88, 88, 88, 88, 88, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49])
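To make the contrast with lists concrete, the sketch below tries the same slice assignment on a plain list, which raises a TypeError because a list slice needs an iterable on the right-hand side, and then on an array, where the scalar is broadcast.

```python
import numpy as np

values = list(range(5))
try:
    values[0:3] = 88  # a list slice cannot be assigned a single number
except TypeError as err:
    print("list slice assignment failed:", err)

arr = np.arange(5)
arr[0:3] = 88         # the scalar is broadcast to every selected position
print(arr.tolist())   # [88, 88, 88, 3, 4]
```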
arr=np.arange (0,21)
arr
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20])
# Slicing is very important
slice_of_arr = arr[0:16]
# Show slice
slice_of_arr
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
# Change the value of Slice
slice_of_arr[:]=50
slice_of_arr
array([50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50])
arr
array([50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 16, 17, 18, 19, 20])
The data is not copied; the slice is a view of the original array, so changing the slice also changes the original. This avoids memory problems!
# To get a copy, you need to be explicit
arr_copy = arr.copy()
arr_copy
array([50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 16, 17, 18, 19, 20])
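One way to check whether two arrays refer to the same data is np.shares_memory. This short sketch compares a slice with an explicit copy; the variable names are illustrative.

```python
import numpy as np

arr = np.arange(10)
view = arr[0:5]               # a slice is a view
independent = arr[0:5].copy()  # an explicit copy owns its data

print(np.shares_memory(arr, view))         # True
print(np.shares_memory(arr, independent))  # False

# Writing through the view changes arr, but not the copy
view[:] = -1
print(arr[:5].tolist())      # [-1, -1, -1, -1, -1]
print(independent.tolist())  # [0, 1, 2, 3, 4]
```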
Two-dimensional (2D) arrays are indexed by two subscripts, one for the row and one for the column. Each element in the 2D array must be of the same type, either a primitive type or an object type.
The general format is arr_2d[row][col] or arr_2d[row,col]. I recommend usually using the comma notation for clarity.
import numpy as np
matrix= np.array(([1,2,3],[4,5,6],[7,8,9]))
# Show the value
matrix
array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Indexing Rows
# Grab 3rd row
matrix[2]
array([7, 8, 9])
#row[1][2] #row[1,2]
# Indexing Column
# Grab 2nd column
matrix[0:,1:2]
array([[2], [5], [8]])
# Format is arr_2d[row][col] or arr_2d[row,col]
# Getting individual element value
matrix[1][0]
4
matrix[1][2]
6
# Getting individual element value
matrix[1,0]
4
matrix
array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Matrix slicing
# Shape (2,2) from top left corner
matrix[0:2,0:2]
array([[1, 2], [4, 5]])
# Matrix slicing
# Shape (2,2) from top right corner
matrix[0:2,1:3]
array([[2, 3], [5, 6]])
# Matrix slicing
# Shape (2,2) from bottom right corner
matrix[1:,1:]
array([[5, 6], [8, 9]])
arr=np.arange(25)
b=arr.reshape(5,5)
b
array([[ 0, 1, 2, 3, 4], [ 5, 6, 7, 8, 9], [10, 11, 12, 13, 14], [15, 16, 17, 18, 19], [20, 21, 22, 23, 24]])
b[3:,3:]
array([[18, 19], [23, 24]])
# Grab all the elements in 4th column
b[0:,3:4]
array([[ 3], [ 8], [13], [18], [23]])
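The column selections above use a slice such as 3:4, which keeps the result two-dimensional. Using a plain integer instead returns a one-dimensional array. The sketch below shows the difference, plus a slice with a step size.

```python
import numpy as np

b = np.arange(25).reshape(5, 5)

col_1d = b[:, 3]    # integer index: the column dimension is dropped
col_2d = b[:, 3:4]  # slice: the column dimension is kept
print(col_1d.shape)  # (5,)
print(col_2d.shape)  # (5, 1)

# A step of 2 picks every second row and every second column
stepped = b[::2, ::2]
print(stepped.tolist())  # [[0, 2, 4], [10, 12, 14], [20, 22, 24]]
```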
Fancy indexing is conceptually simple: it means passing an array of indices to access multiple array elements at once.
Let's see an example.
# Set up matrix
matrix1 = np.zeros((11, 11))
# Number of rows in the matrix
matrix1_length = matrix1.shape[0]
# Fill row i with the value 2*i so each row is easy to identify
for i in range(matrix1_length):
    matrix1[i] = 2 * i
matrix1
array([[ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], [ 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.], [ 4., 4., 4., 4., 4., 4., 4., 4., 4., 4., 4.], [ 6., 6., 6., 6., 6., 6., 6., 6., 6., 6., 6.], [ 8., 8., 8., 8., 8., 8., 8., 8., 8., 8., 8.], [10., 10., 10., 10., 10., 10., 10., 10., 10., 10., 10.], [12., 12., 12., 12., 12., 12., 12., 12., 12., 12., 12.], [14., 14., 14., 14., 14., 14., 14., 14., 14., 14., 14.], [16., 16., 16., 16., 16., 16., 16., 16., 16., 16., 16.], [18., 18., 18., 18., 18., 18., 18., 18., 18., 18., 18.], [20., 20., 20., 20., 20., 20., 20., 20., 20., 20., 20.]])
# Fancy indexing allows the following
matrix1[[2,3,4,5]]
array([[ 4., 4., 4., 4., 4., 4., 4., 4., 4., 4., 4.], [ 6., 6., 6., 6., 6., 6., 6., 6., 6., 6., 6.], [ 8., 8., 8., 8., 8., 8., 8., 8., 8., 8., 8.], [10., 10., 10., 10., 10., 10., 10., 10., 10., 10., 10.]])
matrix1[[5,6,7,8]]
array([[10., 10., 10., 10., 10., 10., 10., 10., 10., 10., 10.], [12., 12., 12., 12., 12., 12., 12., 12., 12., 12., 12.], [14., 14., 14., 14., 14., 14., 14., 14., 14., 14., 14.], [16., 16., 16., 16., 16., 16., 16., 16., 16., 16., 16.]])
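Fancy indexing is not limited to consecutive rows. As a sketch on a small made-up array named grid, indices can come in any order, can repeat, and can be paired with column indices to pick individual elements. Unlike a slice, the result is a copy.

```python
import numpy as np

grid = np.arange(12).reshape(4, 3)

# Rows in any order, with repeats allowed
picked_rows = grid[[3, 0, 3]]
print(picked_rows.tolist())  # [[9, 10, 11], [0, 1, 2], [9, 10, 11]]

# Paired row and column lists select elements (0, 1) and (2, 2)
pairs = grid[[0, 2], [1, 2]]
print(pairs.tolist())  # [1, 8]

# The result of fancy indexing does not share memory with the original
print(np.shares_memory(grid, picked_rows))  # False
```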
Indexing a 2D matrix can be a bit confusing at first, especially when you start to add in a step size. Try a Google image search for "NumPy indexing" to find useful images.
Let's briefly go over how to use brackets for selection based off of comparison operators.
arr = np.arange(1,22)
arr
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21])
arr>4
array([False, False, False, False, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True])
bool_arr= arr>5
bool_arr
array([False, False, False, False, False, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True])
arr[bool_arr]
array([ 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21])
arr[arr>3]
array([ 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21])
x=6
arr[arr>x]
array([ 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21])
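Conditions can also be combined. The sketch below uses & (and) and | (or) with each comparison wrapped in parentheses, since Python's plain and/or keywords do not work element-wise on arrays. It also assigns through a mask.

```python
import numpy as np

arr = np.arange(1, 22)

# Both conditions must hold
middle = arr[(arr > 5) & (arr < 10)]
print(middle.tolist())  # [6, 7, 8, 9]

# Either condition may hold
edges = arr[(arr < 3) | (arr > 19)]
print(edges.tolist())   # [1, 2, 20, 21]

# A mask can be used for assignment as well: zero out the even numbers
odds_only = arr.copy()
odds_only[odds_only % 2 == 0] = 0
print(odds_only[:6].tolist())  # [1, 0, 3, 0, 5, 0]
```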
Arithmetic Operations. Input arrays for performing arithmetic operations such as add(), subtract(), multiply(), and divide() must be either of the same shape or should conform to array broadcasting rules.
You can easily perform array-with-array arithmetic or scalar-with-array arithmetic.
Let's see some examples.
arr= np.arange (0,25)
arr
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24])
arr+arr
array([ 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48])
arr*arr
array([ 0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225, 256, 289, 324, 361, 400, 441, 484, 529, 576])
arr-arr
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
# Dividing zero by zero gives a warning, not an error!
# The result at that position is just replaced with nan
arr/arr
RuntimeWarning: invalid value encountered in true_divide
array([nan, 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])
# Also a warning, not an error; the result is infinity instead
1/arr
RuntimeWarning: divide by zero encountered in true_divide
array([ inf, 1. , 0.5 , 0.33333333, 0.25 , 0.2 , 0.16666667, 0.14285714, 0.125 , 0.11111111, 0.1 , 0.09090909, 0.08333333, 0.07692308, 0.07142857, 0.06666667, 0.0625 , 0.05882353, 0.05555556, 0.05263158, 0.05 , 0.04761905, 0.04545455, 0.04347826, 0.04166667])
arr**3
array([ 0, 1, 8, 27, 64, 125, 216, 343, 512, 729, 1000, 1331, 1728, 2197, 2744, 3375, 4096, 4913, 5832, 6859, 8000, 9261, 10648, 12167, 13824], dtype=int32)
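Two points from this section can be sketched briefly. First, arrays of different shapes can still be combined when they follow the broadcasting rules, for example a column of shape (3, 1) and a row of shape (1, 4). Second, the division warnings can be silenced locally with np.errstate when the nan result is expected. The variable names are illustrative.

```python
import numpy as np

# Shapes (3, 1) and (1, 4) broadcast to (3, 4)
col = np.arange(3).reshape(3, 1)
row = np.arange(4).reshape(1, 4)
table = col + row
print(table.shape)     # (3, 4)
print(table.tolist())  # [[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5]]

# Suppress the warnings only inside this block
arr = np.arange(5)
with np.errstate(divide="ignore", invalid="ignore"):
    ratio = arr / arr
print(np.isnan(ratio[0]))  # True, 0/0 is still nan
```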
NumPy comes with many universal array functions (ufuncs), which are mathematical operations applied element by element across the array. Let's show some common ones:
# Taking square Roots
np.sqrt (arr)
array([0. , 1. , 1.41421356, 1.73205081, 2. , 2.23606798, 2.44948974, 2.64575131, 2.82842712, 3. , 3.16227766, 3.31662479, 3.46410162, 3.60555128, 3.74165739, 3.87298335, 4. , 4.12310563, 4.24264069, 4.35889894, 4.47213595, 4.58257569, 4.69041576, 4.79583152, 4.89897949])
np.max(arr)
24
# Calculating the exponential (e^x)
np.exp(arr)
array([1.00000000e+00, 2.71828183e+00, 7.38905610e+00, 2.00855369e+01, 5.45981500e+01, 1.48413159e+02, 4.03428793e+02, 1.09663316e+03, 2.98095799e+03, 8.10308393e+03, 2.20264658e+04, 5.98741417e+04, 1.62754791e+05, 4.42413392e+05, 1.20260428e+06, 3.26901737e+06, 8.88611052e+06, 2.41549528e+07, 6.56599691e+07, 1.78482301e+08, 4.85165195e+08, 1.31881573e+09, 3.58491285e+09, 9.74480345e+09, 2.64891221e+10])
np.max(arr) #same as arr.max()
24
np.sin(arr)
array([ 0. , 0.84147098, 0.90929743, 0.14112001, -0.7568025 , -0.95892427, -0.2794155 , 0.6569866 , 0.98935825, 0.41211849, -0.54402111, -0.99999021, -0.53657292, 0.42016704, 0.99060736, 0.65028784, -0.28790332, -0.96139749, -0.75098725, 0.14987721, 0.91294525, 0.83665564, -0.00885131, -0.8462204 , -0.90557836])
np.log(arr)
RuntimeWarning: divide by zero encountered in log
array([ -inf, 0. , 0.69314718, 1.09861229, 1.38629436, 1.60943791, 1.79175947, 1.94591015, 2.07944154, 2.19722458, 2.30258509, 2.39789527, 2.48490665, 2.56494936, 2.63905733, 2.7080502 , 2.77258872, 2.83321334, 2.89037176, 2.94443898, 2.99573227, 3.04452244, 3.09104245, 3.13549422, 3.17805383])
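The arithmetic operators are shorthand for universal functions such as np.add. In addition, ufuncs accept where and out arguments, which offer one possible way to skip the zero that triggers the log warning above. This is a sketch of that approach, with 0 chosen arbitrarily as the placeholder for skipped positions.

```python
import numpy as np

arr = np.arange(5)

# The + operator and np.add give the same result
same = np.array_equal(np.add(arr, arr), arr + arr)
print(same)  # True

# Compute log only where arr > 0; skipped positions keep the value from out
safe_log = np.log(arr, out=np.zeros(arr.shape), where=arr > 0)
print(safe_log[0])  # 0.0
```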