#!/usr/bin/env python # coding: utf-8 # # Tutorial on Python for scientific computing # # > Marcos Duarte, Renato Naville Watanabe # > [Laboratory of Biomechanics and Motor Control](https://bmclab.pesquisa.ufabc.edu.br/pt/) # > Federal University of ABC, Brazil # #

A Jupyter Notebook

#

Contents

#
# ## Scope of this tutorial # # This will be a very brief tutorial on Python. # For a more complete tutorial about Python see [A Whirlwind Tour of Python](https://github.com/jakevdp/WhirlwindTourOfPython) and [Python Data Science Handbook](https://jakevdp.github.io/PythonDataScienceHandbook/) for a specific tutorial about Python for scientific computing. # # To use Python for scientific computing we need the Python program itself with its main modules and specific packages for scientific computing. [See this notebook on how to install Python for scientific computing](http://nbviewer.ipython.org/github/demotu/BMC/blob/master/notebooks/PythonInstallation.ipynb). # Once you get Python and the necessary packages for scientific computing ready to work, there are different ways to run Python, the main ones are: # # - open a terminal window in your computer and type `python` or `ipython` that the Python interpreter will start # - run the `Jupyter notebook` and start working with Python in a browser # - run `Spyder`, an interactive development environment (IDE) # - run the `Jupyter qtconsole`, a more featured terminal # - run Python online in a website such as [https://www.pythonanywhere.com/](https://www.pythonanywhere.com/) or [Colaboratory](https://colab.research.google.com/notebooks/welcome.ipynb) # - run Python using any other Python editor or IDE # # We will use the Jupyter Notebook for this tutorial but you can run almost all the things we will see here using the other forms listed above. # ## Python as a calculator # # Once in the Jupyter notebook, if you type a simple mathematical expression and press `Shift+Enter` it will give the result of the expression: # In[ ]: 1 + 2 - 25 # In[ ]: 4/7 # Using the `print` function, let's explore the mathematical operations available in Python: # In[ ]: print('1+2 = ', 1+2, '\n', '4*5 = ', 4*5, '\n', '6/7 = ', 6/7, '\n', '8**2 = ', 8**2, sep='') # And if we want the square-root of a number: # In[ ]: sqrt(9) # We get an error message saying that the `sqrt` function if not defined. This is because `sqrt` and other mathematical functions are available with the `math` module: # In[ ]: import math # In[ ]: math.sqrt(9) # In[ ]: from math import sqrt # In[ ]: sqrt(9) # ## The import function # # We used the command '`import`' to be able to call certain functions. In Python functions are organized in modules and packages and they have to be imported in order to be used. # # A module is a file containing Python definitions (e.g., functions) and statements. Packages are a way of structuring Python’s module namespace by using “dotted module names”. For example, the module name A.B designates a submodule named B in a package named A. To be used, modules and packages have to be imported in Python with the import function. # # Namespace is a container for a set of identifiers (names), and allows the disambiguation of homonym identifiers residing in different namespaces. For example, with the command import math, we will have all the functions and statements defined in this module in the namespace '`math.`', for example, '`math.pi`' is the π constant and '`math.cos()`', the cosine function. # By the way, to know which Python version you are running, we can use one of the following modules: # In[ ]: import sys sys.version # And if you are in an IPython session: # In[ ]: from IPython import sys_info print(sys_info()) # The first option gives information about the Python version; the latter also includes the IPython version, operating system, etc. # ## Object-oriented programming # # Python is designed as an object-oriented programming (OOP) language. OOP is a paradigm that represents concepts as "objects" that have data fields (attributes that describe the object) and associated procedures known as methods. # # This means that all elements in Python are objects and they have attributes which can be acessed with the dot (.) operator after the name of the object. We already experimented with that when we imported the module `sys`, it became an object, and we acessed one of its attribute: `sys.version`. # # OOP as a paradigm is much more than defining objects, attributes, and methods, but for now this is enough to get going with Python. # ## Python and IPython help # # To get help about any Python command, use `help()`: # In[ ]: help(math.degrees) # Or if you are in the IPython environment, simply add '?' to the function that a window will open at the bottom of your browser with the same help content: # In[ ]: get_ipython().run_line_magic('pinfo', 'math.degrees') # And if you add a second '?' to the statement you get access to the original script file of the function (an advantage of an open source language), unless that function is a built-in function that does not have a script file, which is the case of the standard modules in Python (but you can access the Python source code if you want; it just does not come with the standard program for installation). # # So, let's see this feature with another function: # In[ ]: import scipy.fftpack get_ipython().run_line_magic('pinfo2', 'scipy.fftpack.fft') # To know all the attributes of an object, for example all the functions available in `math`, we can use the function `dir`: # In[ ]: print(dir(math)) # ### Tab completion in IPython # # IPython has tab completion: start typing the name of the command (object) and press `tab` to see the names of objects available with these initials letters. When the name of the object is typed followed by a dot (`math.`), pressing `tab` will show all available attribites, scroll down to the desired attribute and press `Enter` to select it. # ### The four most helpful commands in IPython # # These are the most helpful commands in IPython (from [IPython tutorial](http://ipython.org/ipython-doc/dev/interactive/tutorial.html)): # # - `?` : Introduction and overview of IPython’s features. # - `%quickref` : Quick reference. # - `help` : Python’s own help system. # - `object?` : Details about ‘object’, use ‘object??’ for extra details. # ### Comments # # Comments in Python start with the hash character, #, and extend to the end of the physical line: # In[ ]: # Import the math library to access more math stuff import math math.pi # this is the pi constant; a useless comment since this is obvious # To insert comments spanning more than one line, use a multi-line string with a pair of matching triple-quotes: `"""` or `'''` (we will see the string data type later). A typical use of a multi-line comment is as documentation strings and are meant for anyone reading the code: # In[ ]: """Documentation strings are typically written like that. A docstring is a string literal that occurs as the first statement in a module, function, class, or method definition. """ # A docstring like above is useless and its output as a standalone statement looks uggly in IPython Notebook, but you will see its real importance when reading and writting codes. # # Commenting a programming code is an important step to make the code more readable, which Python cares a lot. # There is a style guide for writting Python code ([PEP 8](https://www.python.org/dev/peps/pep-0008/)) with a session about [how to write comments](https://www.python.org/dev/peps/pep-0008/#comments). # ### Magic functions # # IPython has a set of predefined ‘magic functions’ that you can call with a command line style syntax. # There are two kinds of magics, line-oriented and cell-oriented. # Line magics are prefixed with the % character and work much like OS command-line calls: they get as an argument the rest of the line, where arguments are passed without parentheses or quotes. # Cell magics are prefixed with a double %%, and they are functions that get as an argument not only the rest of the line, but also the lines below it in a separate argument. # ## Assignment and expressions # # The equal sign ('=') is used to assign a value to a variable. Afterwards, no result is displayed before the next interactive prompt: # In[ ]: x = 1 # Spaces between the statements are optional but it helps for readability. # # To see the value of the variable, call it again or use the print function: # In[ ]: x # In[ ]: print(x) # Of course, the last assignment is that holds: # In[ ]: x = 2 x = 3 x # In mathematics '=' is the symbol for identity, but in computer programming '=' is used for assignment, it means that the right part of the expresssion is assigned to its left part. # For example, 'x=x+1' does not make sense in mathematics but it does in computer programming: # In[ ]: x = x + 1 print(x) # A value can be assigned to several variables simultaneously: # In[ ]: x = y = 4 print(x) print(y) # Several values can be assigned to several variables at once: # In[ ]: x, y = 5, 6 print(x) print(y) # And with that, you can do (!): # In[ ]: x, y = y, x print(x) print(y) # Variables must be “defined” (assigned a value) before they can be used, or an error will occur: # In[ ]: x = z # ## Variables and types # # There are different types of built-in objects in Python (and remember that everything in Python is an object): # In[ ]: import types print(dir(types)) # Let's see some of them now. # ### Numbers: int, float, complex # # Numbers can an integer (int), float, and complex (with imaginary part). # Let's use the function `type` to show the type of number (and later for any other object): # In[ ]: type(6) # A float is a non-integer number: # In[ ]: math.pi # In[ ]: type(math.pi) # Python (IPython) is showing `math.pi` with only 15 decimal cases, but internally a float is represented with higher precision. # Floating point numbers in Python are implemented using a double (eight bytes) word; the precison and internal representation of floating point numbers are machine specific and are available in: # In[ ]: sys.float_info # Be aware that floating-point numbers can be trick in computers: # In[ ]: 0.1 + 0.2 # In[ ]: 0.1 + 0.2 - 0.3 # These results are not correct (and the problem is not due to Python). The error arises from the fact that floating-point numbers are represented in computer hardware as base 2 (binary) fractions and most decimal fractions cannot be represented exactly as binary fractions. As consequence, decimal floating-point numbers are only approximated by the binary floating-point numbers actually stored in the machine. [See here for more on this issue](http://docs.python.org/2/tutorial/floatingpoint.html). # A complex number has real and imaginary parts: # In[ ]: 1 + 2j # In[ ]: print(type(1+2j)) # Each part of a complex number is represented as a floating-point number. We can see them using the attributes `.real` and `.imag`: # In[ ]: print((1 + 2j).real) print((1 + 2j).imag) # ### Strings # # Strings can be enclosed in single quotes or double quotes: # In[ ]: s = 'string (str) is a built-in type in Python' s # In[ ]: type(s) # String enclosed with single and double quotes are equal, but it may be easier to use one instead of the other: # In[ ]: 'string (str) is a Python's built-in type' # In[ ]: "string (str) is a Python's built-in type" # But you could have done that using the Python escape character '\': # In[ ]: 'string (str) is a Python\'s built-in type' # Strings can be concatenated (glued together) with the + operator, and repeated with *: # In[ ]: s = 'P' + 'y' + 't' + 'h' + 'o' + 'n' print(s) print(s*5) # Strings can be subscripted (indexed); like in C, the first character of a string has subscript (index) 0: # In[ ]: print('s[0] = ', s[0], ' (s[index], start at 0)') print('s[5] = ', s[5]) print('s[-1] = ', s[-1], ' (last element)') print('s[:] = ', s[:], ' (all elements)') print('s[1:] = ', s[1:], ' (from this index (inclusive) till the last (inclusive))') print('s[2:4] = ', s[2:4], ' (from first index (inclusive) till second index (exclusive))') print('s[:2] = ', s[:2], ' (till this index, exclusive)') print('s[:10] = ', s[:10], ' (Python handles the index if it is larger than the string length)') print('s[-10:] = ', s[-10:]) print('s[0:5:2] = ', s[0:5:2], ' (s[ini:end:step])') print('s[::2] = ', s[::2], ' (s[::step], initial and final indexes can be omitted)') print('s[0:5:-1] = ', s[::-1], ' (s[::-step] reverses the string)') print('s[:2] + s[2:] = ', s[:2] + s[2:], ' (because of Python indexing, this sounds natural)') # ### len() # # Python has a built-in functon to get the number of itens of a sequence: # In[ ]: help(len) # In[ ]: s = 'Python' len(s) # The function len() helps to understand how the backward indexing works in Python. # The index s[-i] should be understood as s[len(s) - i] rather than accessing directly the i-th element from back to front. This is why the last element of a string is s[-1]: # In[ ]: print('s = ', s) print('len(s) = ', len(s)) print('len(s)-1 = ',len(s) - 1) print('s[-1] = ', s[-1]) print('s[len(s) - 1] = ', s[len(s) - 1]) # Or, strings can be surrounded in a pair of matching triple-quotes: """ or '''. End of lines do not need to be escaped when using triple-quotes, but they will be included in the string. This is how we created a multi-line comment earlier: # In[ ]: """Strings can be surrounded in a pair of matching triple-quotes: \""" or '''. End of lines do not need to be escaped when using triple-quotes, but they will be included in the string. """ # ### Lists # # Values can be grouped together using different types, one of them is list, which can be written as a list of comma-separated values between square brackets. List items need not all have the same type: # In[ ]: x = ['spam', 'eggs', 100, 1234] x # Lists can be indexed and the same indexing rules we saw for strings are applied: # In[ ]: x[0] # The function len() works for lists: # In[ ]: len(x) # ### Tuples # # A tuple consists of a number of values separated by commas, for instance: # In[ ]: t = ('spam', 'eggs', 100, 1234) t # The type tuple is why multiple assignments in a single line works; elements separated by commas (with or without surrounding parentheses) are a tuple and in an expression with an '=', the right-side tuple is attributed to the left-side tuple: # In[ ]: a, b = 1, 2 print('a = ', a, '\nb = ', b) # Is the same as: # In[ ]: (a, b) = (1, 2) print('a = ', a, '\nb = ', b) # ### Sets # # Python also includes a data type for sets. A set is an unordered collection with no duplicate elements. # In[ ]: basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana'] fruit = set(basket) # create a set without duplicates fruit # As set is an unordered collection, it can not be indexed as lists and tuples. # In[ ]: set(['orange', 'pear', 'apple', 'banana']) 'orange' in fruit # fast membership testing # ### Dictionaries # # Dictionary is a collection of elements organized keys and values. Unlike lists and tuples, which are indexed by a range of numbers, dictionaries are indexed by their keys: # In[ ]: tel = {'jack': 4098, 'sape': 4139} tel # In[ ]: tel['guido'] = 4127 tel # In[ ]: tel['jack'] # In[ ]: del tel['sape'] tel['irv'] = 4127 tel # In[ ]: tel.keys() # In[ ]: 'guido' in tel # The dict() constructor builds dictionaries directly from sequences of key-value pairs: # In[ ]: tel = dict([('sape', 4139), ('guido', 4127), ('jack', 4098)]) tel # ## Built-in Constants # # - **False** : false value of the bool type # - **True** : true value of the bool type # - **None** : sole value of types.NoneType. None is frequently used to represent the absence of a value. # In computer science, the Boolean or logical data type is composed by two values, true and false, intended to represent the values of logic and Boolean algebra. In Python, 1 and 0 can also be used in most situations as equivalent to the Boolean values. # ## Logical (Boolean) operators # ### and, or, not # - **and** : logical AND operator. If both the operands are true then condition becomes true. (a and b) is true. # - **or** : logical OR Operator. If any of the two operands are non zero then condition becomes true. (a or b) is true. # - **not** : logical NOT Operator. Reverses the logical state of its operand. If a condition is true then logical NOT operator will make false. # ### Comparisons # # The following comparison operations are supported by objects in Python: # # - **==** : equal # - **!=** : not equal # - **<** : strictly less than # - **<=** : less than or equal # - **\>** : strictly greater than # - **\>=** : greater than or equal # - **is** : object identity # - **is not** : negated object identity # In[ ]: True == False # In[ ]: not True == False # In[ ]: 1 < 2 > 1 # In[ ]: True != (False or True) # In[ ]: True != False or True # ## Indentation and whitespace # # In Python, statement grouping is done by indentation (this is mandatory), which are done by inserting whitespaces, not tabs. Indentation is also recommended for alignment of function calling that span more than one line for better clarity. # We will see examples of indentation in the next session. # ## Control of flow # # ### `if`...`elif`...`else` # # Conditional statements (to peform something if another thing is True or False) can be implemmented using the `if` statement: # ``` # if expression: # statement # elif: # statement # else: # statement # ``` # `elif` (one or more) and `else` are optionals. # The indentation is obligatory. # For example: # In[ ]: if True: pass # Which does nothing useful. # # Let's use the `if`...`elif`...`else` statements to categorize the [body mass index](http://en.wikipedia.org/wiki/Body_mass_index) of a person: # In[ ]: # body mass index weight = 100 # kg height = 1.70 # m bmi = weight / height**2 # In[ ]: if bmi < 15: c = 'very severely underweight' elif 15 <= bmi < 16: c = 'severely underweight' elif 16 <= bmi < 18.5: c = 'underweight' elif 18.5 <= bmi < 25: c = 'normal' elif 25 <= bmi < 30: c = 'overweight' elif 30 <= bmi < 35: c = 'moderately obese' elif 35 <= bmi < 40: c = 'severely obese' else: c = 'very severely obese' print('For a weight of {0:.1f} kg and a height of {1:.2f} m,\n\ the body mass index (bmi) is {2:.1f} kg/m2,\nwhich is considered {3:s}.'\ .format(weight, height, bmi, c)) # ### for # # The `for` statement iterates over a sequence to perform operations (a loop event). # ``` # for iterating_var in sequence: # statements # ``` # In[ ]: for i in [3, 2, 1, 'go!']: print(i, end=', ') # In[ ]: for letter in 'Python': print(letter), # #### The `range()` function # # The built-in function range() is useful if we need to create a sequence of numbers, for example, to iterate over this list. It generates lists containing arithmetic progressions: # In[ ]: help(range) # In[ ]: range(10) # In[ ]: range(1, 10, 2) # In[ ]: for i in range(10): n2 = i**2 print(n2), # ### while # # The `while` statement is used for repeating sections of code in a loop until a condition is met (this different than the `for` statement which executes n times): # ``` # while expression: # statement # ``` # Let's generate the Fibonacci series using a `while` loop: # In[ ]: # Fibonacci series: the sum of two elements defines the next a, b = 0, 1 while b < 1000: print(b, end=' ') a, b = b, a+b # ## Function definition # # A function in a programming language is a piece of code that performs a specific task. Functions are used to reduce duplication of code making easier to reuse it and to decompose complex problems into simpler parts. The use of functions contribute to the clarity of the code. # # A function is created with the `def` keyword and the statements in the block of the function must be indented: # In[ ]: def function(): pass # As per construction, this function does nothing when called: # In[ ]: function() # The general syntax of a function definition is: # ``` # def function_name( parameters ): # """Function docstring. # # The help for the function # # """ # # function body # # return variables # ``` # A more useful function: # In[ ]: def fibo(N): """Fibonacci series: the sum of two elements defines the next. The series is calculated till the input parameter N and returned as an ouput variable. """ a, b, c = 0, 1, [] while b < N: c.append(b) a, b = b, a + b return c # In[ ]: fibo(100) # In[ ]: if 3 > 2: print('teste') # Let's implemment the body mass index calculus and categorization as a function: # In[ ]: def bmi(weight, height): """Body mass index calculus and categorization. Enter the weight in kg and the height in m. See http://en.wikipedia.org/wiki/Body_mass_index """ bmi = weight / height**2 if bmi < 15: c = 'very severely underweight' elif 15 <= bmi < 16: c = 'severely underweight' elif 16 <= bmi < 18.5: c = 'underweight' elif 18.5 <= bmi < 25: c = 'normal' elif 25 <= bmi < 30: c = 'overweight' elif 30 <= bmi < 35: c = 'moderately obese' elif 35 <= bmi < 40: c = 'severely obese' else: c = 'very severely obese' s = 'For a weight of {0:.1f} kg and a height of {1:.2f} m,\ the body mass index (bmi) is {2:.1f} kg/m2,\ which is considered {3:s}.'\ .format(weight, height, bmi, c) print(s) # In[ ]: bmi(73, 1.70) # ## Numeric data manipulation with Numpy # # Numpy is the fundamental package for scientific computing in Python and has a N-dimensional array package convenient to work with numerical data. With Numpy it's much easier and faster to work with numbers grouped as 1-D arrays (a vector), 2-D arrays (like a table or matrix), or higher dimensions. Let's create 1-D and 2-D arrays in Numpy: # In[ ]: import numpy as np # In[ ]: x1d = np.array([1, 2, 3, 4, 5, 6]) print(type(x1d)) x1d # In[ ]: x2d = np.array([[1, 2, 3], [4, 5, 6]]) x2d # len() and the Numpy functions size() and shape() give information aboout the number of elements and the structure of the Numpy array: # In[ ]: print('1-d array:') print(x1d) print('len(x1d) = ', len(x1d)) print('np.size(x1d) = ', np.size(x1d)) print('np.shape(x1d) = ', np.shape(x1d)) print('np.ndim(x1d) = ', np.ndim(x1d)) print('\n2-d array:') print(x2d) print('len(x2d) = ', len(x2d)) print('np.size(x2d) = ', np.size(x2d)) print('np.shape(x2d) = ', np.shape(x2d)) print('np.ndim(x2d) = ', np.ndim(x2d)) # Create random data # In[ ]: x = np.random.randn(4,3) x # Joining (stacking together) arrays # In[ ]: x = np.random.randint(0, 5, size=(2, 3)) print(x) y = np.random.randint(5, 10, size=(2, 3)) print(y) # In[ ]: np.vstack((x,y)) # In[ ]: np.hstack((x,y)) # Create equally spaced data # In[ ]: np.arange(start = 1, stop = 10, step = 2) # In[ ]: np.linspace(start = 0, stop = 1, num = 11) # ### Interpolation # # Consider the following data: # In[ ]: y = [5, 4, 10, 8, 1, 10, 2, 7, 1, 3] # Suppose we want to create data in between the given data points (interpolation); for instance, let's try to double the resolution of the data by generating twice as many data: # In[ ]: t = np.linspace(0, len(y), len(y)) # time vector for the original data tn = np.linspace(0, len(y), 2 * len(y)) # new time vector for the new time-normalized data yn = np.interp(tn, t, y) # new time-normalized data yn # The key is the Numpy `interp` function, from its help: # # interp(x, xp, fp, left=None, right=None) # One-dimensional linear interpolation. # Returns the one-dimensional piecewise linear interpolant to a function with given values at discrete data-points. # # A plot of the data will show what we have done: # In[ ]: get_ipython().run_line_magic('matplotlib', 'inline') import matplotlib.pyplot as plt plt.figure(figsize=(10,5)) plt.plot(t, y, 'bo-', lw=2, label='original data') plt.plot(tn, yn, '.-', color=[1, 0, 0, .5], lw=2, label='interpolated') plt.legend(loc='best', framealpha=.5) plt.show() # For more about Numpy, see [http://www.numpy.org/](http://www.numpy.org/). # ## Read and save files # # There are two kinds of computer files: text files and binary files: # > Text file: computer file where the content is structured as a sequence of lines of electronic text. Text files can contain plain text (letters, numbers, and symbols) but they are not limited to such. The type of content in the text file is defined by the Unicode encoding (a computing industry standard for the consistent encoding, representation and handling of text expressed in most of the world's writing systems). # > # > Binary file: computer file where the content is encoded in binary form, a sequence of integers representing byte values. # # Let's see how to save and read numeric data stored in a text file: # # **Using plain Python** # In[ ]: f = open("newfile.txt", "w") # open file for writing f.write("This is a test\n") # save to file f.write("And here is another line\n") # save to file f.close() f = open('newfile.txt', 'r') # open file for reading f = f.read() # read from file print(f) # In[ ]: help(open) # **Using Numpy** # In[ ]: import numpy as np data = np.random.randn(3,3) np.savetxt('myfile.txt', data, fmt="%12.6G") # save to file data = np.genfromtxt('myfile.txt', unpack=True) # read from file data # ## Ploting with matplotlib # # Matplotlib is the most-widely used packge for plotting data in Python. Let's see some examples of it. # In[ ]: import matplotlib.pyplot as plt # Use the IPython magic `%matplotlib inline` to plot a figure inline in the notebook with the rest of the text: # In[ ]: get_ipython().run_line_magic('matplotlib', 'inline') # In[ ]: import numpy as np # In[ ]: t = np.linspace(0, 0.99, 100) x = np.sin(2 * np.pi * 2 * t) n = np.random.randn(100) / 5 plt.Figure(figsize=(12,8)) plt.plot(t, x, label='sine', linewidth=2) plt.plot(t, x + n, label='noisy sine', linewidth=2) plt.annotate(text='$sin(4 \pi t)$', xy=(.2, 1), fontsize=20, color=[0, 0, 1]) plt.legend(loc='best', framealpha=.5) plt.xlabel('Time [s]') plt.ylabel('Amplitude') plt.title('Data plotting using matplotlib') plt.show() # Use the IPython magic `%matplotlib qt` to plot a figure in a separate window (from where you will be able to change some of the figure proprerties): # In[ ]: get_ipython().run_line_magic('matplotlib', 'qt') # In[ ]: mu, sigma = 10, 2 x = mu + sigma * np.random.randn(1000) fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4)) ax1.plot(x, 'ro') ax1.set_title('Data') ax1.grid() n, bins, patches = ax2.hist(x, 25, density=True, facecolor='r') # histogram ax2.set_xlabel('Bins') ax2.set_ylabel('Probability') ax2.set_title('Histogram') fig.suptitle('Another example using matplotlib', fontsize=18, y=1) ax2.grid() plt.tight_layout() plt.show() # And a window with the following figure should appear: # In[ ]: from IPython.display import Image Image(url="./../images/plot.png") # You can switch back and forth between inline and separate figure using the `%matplotlib` magic commands used above. There are plenty more examples with the source code in the [matplotlib gallery](http://matplotlib.org/gallery.html). # In[ ]: # get back the inline plot get_ipython().run_line_magic('matplotlib', 'inline') # ## Signal processing with Scipy # # The Scipy package has a lot of functions for signal processing, among them: Integration (scipy.integrate), Optimization (scipy.optimize), Interpolation (scipy.interpolate), Fourier Transforms (scipy.fftpack), Signal Processing (scipy.signal), Linear Algebra (scipy.linalg), and Statistics (scipy.stats). As an example, let's see how to use a low-pass Butterworth filter to attenuate high-frequency noise and how the differentiation process of a signal affects the signal-to-noise content. We will also calculate the Fourier transform of these data to look at their frequencies content. # In[ ]: from scipy.signal import butter, filtfilt import scipy.fftpack freq = 100. t = np.arange(0,1,.01); w = 2*np.pi*1 # 1 Hz y = np.sin(w*t)+0.1*np.sin(10*w*t) # Butterworth filter b, a = butter(4, (5/(freq/2)), btype = 'low') y2 = filtfilt(b, a, y) # 2nd derivative of the data ydd = np.diff(y,2)*freq*freq # raw data y2dd = np.diff(y2,2)*freq*freq # filtered data # frequency content yfft = np.abs(scipy.fftpack.fft(y))/(y.size/2); # raw data y2fft = np.abs(scipy.fftpack.fft(y2))/(y.size/2); # filtered data freqs = scipy.fftpack.fftfreq(y.size, 1./freq) yddfft = np.abs(scipy.fftpack.fft(ydd))/(ydd.size/2); y2ddfft = np.abs(scipy.fftpack.fft(y2dd))/(ydd.size/2); freqs2 = scipy.fftpack.fftfreq(ydd.size, 1./freq) # And the plots: # In[ ]: fig, ((ax1,ax2),(ax3,ax4)) = plt.subplots(2, 2, figsize=(12, 6)) ax1.set_title('Temporal domain', fontsize=14) ax1.plot(t, y, 'r', linewidth=2, label = 'raw data') ax1.plot(t, y2, 'b', linewidth=2, label = 'filtered @ 5 Hz') ax1.set_ylabel('f') ax1.legend(frameon=False, fontsize=12) ax2.set_title('Frequency domain', fontsize=14) ax2.plot(freqs[:int(yfft.size/4)], yfft[:int(yfft.size/4)],'r', lw=2,label='raw data') ax2.plot(freqs[:int(yfft.size/4)],y2fft[:int(yfft.size/4)],'b--',lw=2,label='filtered @ 5 Hz') ax2.set_ylabel('FFT(f)') ax2.legend(frameon=False, fontsize=12) ax3.plot(t[:-2], ydd, 'r', linewidth=2, label = 'raw') ax3.plot(t[:-2], y2dd, 'b', linewidth=2, label = 'filtered @ 5 Hz') ax3.set_xlabel('Time [s]'); ax3.set_ylabel("f ''") ax4.plot(freqs[:int(yddfft.size/4)], yddfft[:int(yddfft.size/4)], 'r', lw=2, label = 'raw') ax4.plot(freqs[:int(yddfft.size/4)],y2ddfft[:int(yddfft.size/4)],'b--',lw=2, label='filtered @ 5 Hz') ax4.set_xlabel('Frequency [Hz]'); ax4.set_ylabel("FFT(f '')") plt.show() # For more about Scipy, see [https://docs.scipy.org/doc/scipy/reference/tutorial/](https://docs.scipy.org/doc/scipy/reference/tutorial/). # ## Symbolic mathematics with Sympy # # Sympy is a package to perform symbolic mathematics in Python. Let's see some of its features: # In[ ]: from IPython.display import display import sympy as sym from sympy.interactive import printing printing.init_printing() # Define some symbols and the create a second-order polynomial function (a.k.a., parabola): # In[ ]: x, y = sym.symbols('x y') y = x**2 - 2*x - 3 y # Plot the parabola at some given range: # In[ ]: from sympy.plotting import plot get_ipython().run_line_magic('matplotlib', 'inline') plot(y, (x, -3, 5)); # And the roots of the parabola are given by: # In[ ]: sym.solve(y, x) # We can also do symbolic differentiation and integration: # In[ ]: dy = sym.diff(y, x) dy # In[ ]: sym.integrate(dy, x) # For example, let's use Sympy to represent three-dimensional rotations. Consider the problem of a coordinate system xyz rotated in relation to other coordinate system XYZ. The single rotations around each axis are illustrated by: # In[ ]: from IPython.display import Image Image(url="./../images/rotations.png") # The single 3D rotation matrices around Z, Y, and X axes can be expressed in Sympy: # In[ ]: from IPython.core.display import Math from sympy import symbols, cos, sin, Matrix, latex a, b, g = symbols('alpha beta gamma') RX = Matrix([[1, 0, 0], [0, cos(a), -sin(a)], [0, sin(a), cos(a)]]) display(Math(r'\mathbf{R_{X}}=' + latex(RX, mat_str = 'matrix'))) RY = Matrix([[cos(b), 0, sin(b)], [0, 1, 0], [-sin(b), 0, cos(b)]]) display(Math(r'\mathbf{R_{Y}}=' + latex(RY, mat_str = 'matrix'))) RZ = Matrix([[cos(g), -sin(g), 0], [sin(g), cos(g), 0], [0, 0, 1]]) display(Math(r'\mathbf{R_{Z}}=' + latex(RZ, mat_str = 'matrix'))) # And using Sympy, a sequence of elementary rotations around X, Y, Z axes is given by: # In[ ]: RXYZ = RZ*RY*RX display(Math(r'\mathbf{R_{XYZ}}=' + latex(RXYZ, mat_str = 'matrix'))) # Suppose there is a rotation only around X ($\alpha$) by $\pi/2$; we can get the numerical value of the rotation matrix by substituing the angle values: # In[ ]: r = RXYZ.subs({a: np.pi/2, b: 0, g: 0}) r # And we can prettify this result: # In[ ]: display(Math(r'\mathbf{R_{(\alpha=\pi/2)}}=' + latex(r.n(3, chop=True), mat_str = 'matrix'))) # For more about Sympy, see [http://docs.sympy.org/latest/tutorial/](http://docs.sympy.org/latest/tutorial/). # ## Data analysis with pandas # # > "[pandas](http://pandas.pydata.org/) is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python." # # To work with labellled data, pandas has a type called DataFrame (basically, a matrix where columns and rows have may names and may be of different types) and it is also the main type of the software [R](http://www.r-project.org/). Fo ezample: # In[ ]: import pandas as pd # In[ ]: x = 5*['A'] + 5*['B'] x # In[ ]: df = pd.DataFrame(np.random.rand(10,2), columns=['Level 1', 'Level 2'] ) df['Group'] = pd.Series(['A']*5 + ['B']*5) plot = df.boxplot(by='Group') # In[ ]: from pandas.plotting import scatter_matrix df = pd.DataFrame(np.random.randn(100, 3), columns=['A', 'B', 'C']) plot = scatter_matrix(df, alpha=0.5, figsize=(8, 6), diagonal='kde') # pandas is aware the data is structured and give you basic statistics considerint that and nicely formatted: # In[ ]: df.describe() # For more on pandas, see this tutorial: [http://pandas.pydata.org/pandas-docs/stable/10min.html](http://pandas.pydata.org/pandas-docs/stable/10min.html). # ## More about Python # # There is a lot of good material in the internet about Python for scientific computing, here is a small list of interesting stuff: # # - [How To Think Like A Computer Scientist](http://www.openbookproject.net/thinkcs/python/english2e/) or [the interactive edition](http://interactivepython.org/courselib/static/thinkcspy/index.html) (book) # - [Python Scientific Lecture Notes](http://scipy-lectures.github.io/) (lecture notes) # - [A Whirlwind Tour of Python](https://github.com/jakevdp/WhirlwindTourOfPython) (tutorial/book) # - [Python Data Science Handbook](https://jakevdp.github.io/PythonDataScienceHandbook/) (tutorial/book) # - [Lectures on scientific computing with Python](https://github.com/jrjohansson/scientific-python-lectures#lectures-on-scientific-computing-with-python) (lecture notes) # - [A gallery of interesting IPython Notebooks](https://github.com/jupyter/jupyter/wiki)