Tutorial on Python for scientific computing¶

Marcos Duarte, Renato Naville Watanabe
Laboratory of Biomechanics and Motor Control
Federal University of ABC, Brazil

1 Scope of this tutorial
2 Python as a calculator
3 The import function
4 Object-oriented programming
5 Python and IPython help
6 Assignment and expressions
7 Variables and types
8 Built-in Constants
9 Logical (Boolean) operators
- 9.1 and, or, not
- 9.2 Comparisons
10 Indentation and whitespace
11 Control of flow
12 Function definition
13 Numeric data manipulation with Numpy
- 13.1 Interpolation
14 Read and save files
15 Ploting with matplotlib
16 Signal processing with Scipy
17 Symbolic mathematics with Sympy
18 Data analysis with pandas
19 More about Python

Scope of this tutorial¶

This will be a very brief tutorial on Python.
For a more complete tutorial about Python see A Whirlwind Tour of Python and Python Data Science Handbook for a specific tutorial about Python for scientific computing.

To use Python for scientific computing we need the Python program itself with its main modules and specific packages for scientific computing. See this notebook on how to install Python for scientific computing.
Once you get Python and the necessary packages for scientific computing ready to work, there are different ways to run Python, the main ones are:

open a terminal window in your computer and type python or ipython that the Python interpreter will start
run the Jupyter notebook and start working with Python in a browser
run Spyder, an interactive development environment (IDE)
run the Jupyter qtconsole, a more featured terminal
run Python online in a website such as https://www.pythonanywhere.com/ or Colaboratory
run Python using any other Python editor or IDE

We will use the Jupyter Notebook for this tutorial but you can run almost all the things we will see here using the other forms listed above.

Python as a calculator¶

Once in the Jupyter notebook, if you type a simple mathematical expression and press Shift+Enter it will give the result of the expression:

In [ ]:

1 + 2 - 25

In [ ]:

4/7

Using the print function, let's explore the mathematical operations available in Python:

In [ ]:

print('1+2 = ', 1+2, '\n', '4*5 = ', 4*5, '\n', '6/7 = ', 6/7, '\n', '8**2 = ', 8**2, sep='')

And if we want the square-root of a number:

In [ ]:

sqrt(9)

We get an error message saying that the sqrt function if not defined. This is because sqrt and other mathematical functions are available with the math module:

In [ ]:

import math

In [ ]:

math.sqrt(9)

In [ ]:

from math import sqrt

In [ ]:

sqrt(9)

The import function¶

We used the command 'import' to be able to call certain functions. In Python functions are organized in modules and packages and they have to be imported in order to be used.

A module is a file containing Python definitions (e.g., functions) and statements. Packages are a way of structuring Python’s module namespace by using “dotted module names”. For example, the module name A.B designates a submodule named B in a package named A. To be used, modules and packages have to be imported in Python with the import function.

Namespace is a container for a set of identifiers (names), and allows the disambiguation of homonym identifiers residing in different namespaces. For example, with the command import math, we will have all the functions and statements defined in this module in the namespace 'math.', for example, 'math.pi' is the π constant and 'math.cos()', the cosine function.

By the way, to know which Python version you are running, we can use one of the following modules:

In [ ]:

import sys
sys.version

And if you are in an IPython session:

In [ ]:

from IPython import sys_info
print(sys_info())

The first option gives information about the Python version; the latter also includes the IPython version, operating system, etc.

Object-oriented programming¶

Python is designed as an object-oriented programming (OOP) language. OOP is a paradigm that represents concepts as "objects" that have data fields (attributes that describe the object) and associated procedures known as methods.

This means that all elements in Python are objects and they have attributes which can be acessed with the dot (.) operator after the name of the object. We already experimented with that when we imported the module sys, it became an object, and we acessed one of its attribute: sys.version.

OOP as a paradigm is much more than defining objects, attributes, and methods, but for now this is enough to get going with Python.

Python and IPython help¶

To get help about any Python command, use help():

In [ ]:

help(math.degrees)

Or if you are in the IPython environment, simply add '?' to the function that a window will open at the bottom of your browser with the same help content:

In [ ]:

math.degrees?

And if you add a second '?' to the statement you get access to the original script file of the function (an advantage of an open source language), unless that function is a built-in function that does not have a script file, which is the case of the standard modules in Python (but you can access the Python source code if you want; it just does not come with the standard program for installation).

So, let's see this feature with another function:

In [ ]:

import scipy.fftpack
scipy.fftpack.fft??

To know all the attributes of an object, for example all the functions available in math, we can use the function dir:

In [ ]:

print(dir(math))

Tab completion in IPython¶

IPython has tab completion: start typing the name of the command (object) and press tab to see the names of objects available with these initials letters. When the name of the object is typed followed by a dot (math.), pressing tab will show all available attribites, scroll down to the desired attribute and press Enter to select it.

The four most helpful commands in IPython¶

These are the most helpful commands in IPython (from IPython tutorial):

? : Introduction and overview of IPython’s features.
%quickref : Quick reference.
help : Python’s own help system.
object? : Details about ‘object’, use ‘object??’ for extra details.

Comments¶

Comments in Python start with the hash character, #, and extend to the end of the physical line:

In [ ]:

# Import the math library to access more math stuff
import math
math.pi  # this is the pi constant; a useless comment since this is obvious

To insert comments spanning more than one line, use a multi-line string with a pair of matching triple-quotes: """ or ''' (we will see the string data type later). A typical use of a multi-line comment is as documentation strings and are meant for anyone reading the code:

In [ ]:

"""Documentation strings are typically written like that.

A docstring is a string literal that occurs as the first statement
in a module, function, class, or method definition.

"""

A docstring like above is useless and its output as a standalone statement looks uggly in IPython Notebook, but you will see its real importance when reading and writting codes.

Commenting a programming code is an important step to make the code more readable, which Python cares a lot.
There is a style guide for writting Python code (PEP 8) with a session about how to write comments.

Magic functions¶

IPython has a set of predefined ‘magic functions’ that you can call with a command line style syntax.
There are two kinds of magics, line-oriented and cell-oriented.
Line magics are prefixed with the % character and work much like OS command-line calls: they get as an argument the rest of the line, where arguments are passed without parentheses or quotes.
Cell magics are prefixed with a double %%, and they are functions that get as an argument not only the rest of the line, but also the lines below it in a separate argument.

Assignment and expressions¶

The equal sign ('=') is used to assign a value to a variable. Afterwards, no result is displayed before the next interactive prompt:

In [ ]:

x = 1

Spaces between the statements are optional but it helps for readability.

To see the value of the variable, call it again or use the print function:

In [ ]:

print(x)

Of course, the last assignment is that holds:

In [ ]:

x = 2
x = 3
x

In mathematics '=' is the symbol for identity, but in computer programming '=' is used for assignment, it means that the right part of the expresssion is assigned to its left part.
For example, 'x=x+1' does not make sense in mathematics but it does in computer programming:

In [ ]:

x = x + 1
print(x)

A value can be assigned to several variables simultaneously:

In [ ]:

x = y = 4
print(x)
print(y)

Several values can be assigned to several variables at once:

In [ ]:

x, y = 5, 6
print(x)
print(y)

And with that, you can do (!):

In [ ]:

x, y = y, x
print(x)
print(y)

Variables must be “defined” (assigned a value) before they can be used, or an error will occur:

In [ ]:

x = z

Variables and types¶

There are different types of built-in objects in Python (and remember that everything in Python is an object):

In [ ]:

import types
print(dir(types))

Let's see some of them now.

Numbers: int, float, complex¶

Numbers can an integer (int), float, and complex (with imaginary part).
Let's use the function type to show the type of number (and later for any other object):

In [ ]:

type(6)

A float is a non-integer number:

In [ ]:

math.pi

In [ ]:

type(math.pi)

Python (IPython) is showing math.pi with only 15 decimal cases, but internally a float is represented with higher precision.
Floating point numbers in Python are implemented using a double (eight bytes) word; the precison and internal representation of floating point numbers are machine specific and are available in:

In [ ]:

sys.float_info

Be aware that floating-point numbers can be trick in computers:

In [ ]:

0.1 + 0.2

In [ ]:

0.1 + 0.2 - 0.3

These results are not correct (and the problem is not due to Python). The error arises from the fact that floating-point numbers are represented in computer hardware as base 2 (binary) fractions and most decimal fractions cannot be represented exactly as binary fractions. As consequence, decimal floating-point numbers are only approximated by the binary floating-point numbers actually stored in the machine. See here for more on this issue.

A complex number has real and imaginary parts:

In [ ]:

1 + 2j

In [ ]:

print(type(1+2j))

Each part of a complex number is represented as a floating-point number. We can see them using the attributes .real and .imag:

In [ ]:

print((1 + 2j).real)
print((1 + 2j).imag)

Strings¶

Strings can be enclosed in single quotes or double quotes:

In [ ]:

s = 'string (str) is a built-in type in Python'
s

In [ ]:

type(s)

String enclosed with single and double quotes are equal, but it may be easier to use one instead of the other:

In [ ]:

'string (str) is a Python's built-in type'

In [ ]:

"string (str) is a Python's built-in type"

But you could have done that using the Python escape character '':

In [ ]:

'string (str) is a Python\'s built-in type'

Strings can be concatenated (glued together) with the + operator, and repeated with *:

In [ ]:

s = 'P' + 'y' + 't' + 'h' + 'o' + 'n'
print(s)
print(s*5)

Strings can be subscripted (indexed); like in C, the first character of a string has subscript (index) 0:

In [ ]:

print('s[0] = ', s[0], '  (s[index], start at 0)')
print('s[5] = ', s[5])
print('s[-1] = ', s[-1], '  (last element)')
print('s[:] = ', s[:], '  (all elements)')
print('s[1:] = ', s[1:], '  (from this index (inclusive) till the last (inclusive))')
print('s[2:4] = ', s[2:4], '  (from first index (inclusive) till second index (exclusive))')
print('s[:2] = ', s[:2], '  (till this index, exclusive)')
print('s[:10] = ', s[:10], '  (Python handles the index if it is larger than the string length)')
print('s[-10:] = ', s[-10:])
print('s[0:5:2] = ', s[0:5:2], '  (s[ini:end:step])')
print('s[::2] = ', s[::2], '  (s[::step], initial and final indexes can be omitted)')
print('s[0:5:-1] = ', s[::-1], '  (s[::-step] reverses the string)')
print('s[:2] + s[2:] = ', s[:2] + s[2:], '  (because of Python indexing, this sounds natural)')

len()¶

Python has a built-in functon to get the number of itens of a sequence:

In [ ]:

help(len)

In [ ]:

s = 'Python'
len(s)

The function len() helps to understand how the backward indexing works in Python.
The index s[-i] should be understood as s[len(s) - i] rather than accessing directly the i-th element from back to front. This is why the last element of a string is s[-1]:

In [ ]:

print('s = ', s)
print('len(s) = ', len(s))
print('len(s)-1 = ',len(s) - 1)
print('s[-1] = ', s[-1])
print('s[len(s) - 1] = ', s[len(s) - 1])

Or, strings can be surrounded in a pair of matching triple-quotes: """ or '''. End of lines do not need to be escaped when using triple-quotes, but they will be included in the string. This is how we created a multi-line comment earlier:

In [ ]:

"""Strings can be surrounded in a pair of matching triple-quotes: \""" or '''.

End of lines do not need to be escaped when using triple-quotes,
but they will be included in the string.

"""

Lists¶

Values can be grouped together using different types, one of them is list, which can be written as a list of comma-separated values between square brackets. List items need not all have the same type:

In [ ]:

x = ['spam', 'eggs', 100, 1234]
x

Lists can be indexed and the same indexing rules we saw for strings are applied:

In [ ]:

x[0]

The function len() works for lists:

In [ ]:

len(x)

Tuples¶

A tuple consists of a number of values separated by commas, for instance:

In [ ]:

t = ('spam', 'eggs', 100, 1234)
t

The type tuple is why multiple assignments in a single line works; elements separated by commas (with or without surrounding parentheses) are a tuple and in an expression with an '=', the right-side tuple is attributed to the left-side tuple:

In [ ]:

a, b = 1, 2
print('a = ', a, '\nb = ', b)

Is the same as:

In [ ]:

(a, b) = (1, 2)
print('a = ', a, '\nb = ', b)

Sets¶

Python also includes a data type for sets. A set is an unordered collection with no duplicate elements.

In [ ]:

basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']
fruit = set(basket)  # create a set without duplicates
fruit

As set is an unordered collection, it can not be indexed as lists and tuples.

In [ ]:

set(['orange', 'pear', 'apple', 'banana'])
'orange' in fruit  # fast membership testing

Dictionaries¶

Dictionary is a collection of elements organized keys and values. Unlike lists and tuples, which are indexed by a range of numbers, dictionaries are indexed by their keys:

In [ ]:

tel = {'jack': 4098, 'sape': 4139}
tel

In [ ]:

tel['guido'] = 4127
tel

In [ ]:

tel['jack']

In [ ]:

del tel['sape']
tel['irv'] = 4127
tel

In [ ]:

tel.keys()

In [ ]:

'guido' in tel

The dict() constructor builds dictionaries directly from sequences of key-value pairs:

In [ ]:

tel = dict([('sape', 4139), ('guido', 4127), ('jack', 4098)])
tel

Built-in Constants¶

False : false value of the bool type
True : true value of the bool type
None : sole value of types.NoneType. None is frequently used to represent the absence of a value.

In computer science, the Boolean or logical data type is composed by two values, true and false, intended to represent the values of logic and Boolean algebra. In Python, 1 and 0 can also be used in most situations as equivalent to the Boolean values.

Logical (Boolean) operators¶

and, or, not¶

and : logical AND operator. If both the operands are true then condition becomes true. (a and b) is true.
or : logical OR Operator. If any of the two operands are non zero then condition becomes true. (a or b) is true.
not : logical NOT Operator. Reverses the logical state of its operand. If a condition is true then logical NOT operator will make false.

Comparisons¶

The following comparison operations are supported by objects in Python:

== : equal
!= : not equal
< : strictly less than
<= : less than or equal
> : strictly greater than
>= : greater than or equal
is : object identity
is not : negated object identity

In [ ]:

True == False

In [ ]:

not True == False

In [ ]:

1 < 2 > 1

In [ ]:

True != (False or True)

In [ ]:

True != False or True

Indentation and whitespace¶

In Python, statement grouping is done by indentation (this is mandatory), which are done by inserting whitespaces, not tabs. Indentation is also recommended for alignment of function calling that span more than one line for better clarity.
We will see examples of indentation in the next session.

Control of flow¶

`if`...`elif`...`else`¶

Conditional statements (to peform something if another thing is True or False) can be implemmented using the if statement:

if expression:
   statement
elif:
   statement     
else:
   statement

elif (one or more) and else are optionals.
The indentation is obligatory.
For example:

In [ ]:

if True:
    pass

Which does nothing useful.

Let's use the if...elif...else statements to categorize the body mass index of a person:

In [ ]:

# body mass index
weight = 100  # kg
height = 1.70  # m
bmi = weight / height**2

In [ ]:

if bmi < 15:
    c = 'very severely underweight'
elif 15 <= bmi < 16:
    c = 'severely underweight'
elif 16 <= bmi < 18.5:
    c = 'underweight'
elif 18.5 <= bmi < 25:
    c = 'normal'
elif 25 <= bmi < 30:
    c = 'overweight'
elif 30 <= bmi < 35:
    c = 'moderately obese'
elif 35 <= bmi < 40:
    c = 'severely obese'
else:
    c = 'very severely obese'

print('For a weight of {0:.1f} kg and a height of {1:.2f} m,\n\
the body mass index (bmi) is {2:.1f} kg/m2,\nwhich is considered {3:s}.'\
      .format(weight, height, bmi, c))

for¶

The for statement iterates over a sequence to perform operations (a loop event).

for iterating_var in sequence:
    statements

In [ ]:

for i in [3, 2, 1, 'go!']:
    print(i, end=', ')

In [ ]:

for letter in 'Python':
    print(letter),

The `range()` function¶

The built-in function range() is useful if we need to create a sequence of numbers, for example, to iterate over this list. It generates lists containing arithmetic progressions:

In [ ]:

help(range)

In [ ]:

range(10)

In [ ]:

range(1, 10, 2)

In [ ]:

for i in range(10):
    n2 = i**2
    print(n2),

while¶

The while statement is used for repeating sections of code in a loop until a condition is met (this different than the for statement which executes n times):

while expression:
    statement

Let's generate the Fibonacci series using a while loop:

In [ ]:

# Fibonacci series: the sum of two elements defines the next
a, b = 0, 1
while b < 1000:
    print(b, end='   ')
    a, b = b, a+b

Function definition¶

A function in a programming language is a piece of code that performs a specific task. Functions are used to reduce duplication of code making easier to reuse it and to decompose complex problems into simpler parts. The use of functions contribute to the clarity of the code.

A function is created with the def keyword and the statements in the block of the function must be indented:

In [ ]:

def function():
    pass

As per construction, this function does nothing when called:

In [ ]:

function()

The general syntax of a function definition is:

def function_name( parameters ):
   """Function docstring.

   The help for the function

   """

   function body

   return variables

A more useful function:

In [ ]:

def fibo(N):
    """Fibonacci series: the sum of two elements defines the next.

    The series is calculated till the input parameter N and
    returned as an ouput variable.

    """

    a, b, c = 0, 1, []
    while b < N:
        c.append(b)
        a, b = b, a + b

    return c

In [ ]:

fibo(100)

In [ ]:

if 3 > 2:
       print('teste')

Let's implemment the body mass index calculus and categorization as a function:

In [ ]:

def bmi(weight, height):
    """Body mass index calculus and categorization.

    Enter the weight in kg and the height in m.
    See http://en.wikipedia.org/wiki/Body_mass_index

    """

    bmi = weight / height**2

    if bmi < 15:
     c = 'very severely underweight'
    elif 15 <= bmi < 16:
        c = 'severely underweight'
    elif 16 <= bmi < 18.5:
        c = 'underweight'
    elif 18.5 <= bmi < 25:
        c = 'normal'
    elif 25 <= bmi < 30:
      c = 'overweight'
    elif 30 <= bmi < 35:
        c = 'moderately obese'
    elif 35 <= bmi < 40:
        c = 'severely obese'
    else:
        c = 'very severely obese'

    s = 'For a weight of {0:.1f} kg and a height of {1:.2f} m,\
    the body mass index (bmi) is {2:.1f} kg/m2,\
    which is considered {3:s}.'\
    .format(weight, height, bmi, c)

    print(s)

In [ ]:

bmi(73, 1.70)

Numeric data manipulation with Numpy¶

Numpy is the fundamental package for scientific computing in Python and has a N-dimensional array package convenient to work with numerical data. With Numpy it's much easier and faster to work with numbers grouped as 1-D arrays (a vector), 2-D arrays (like a table or matrix), or higher dimensions. Let's create 1-D and 2-D arrays in Numpy:

In [ ]:

import numpy as np

In [ ]:

x1d = np.array([1, 2, 3, 4, 5, 6])
print(type(x1d))
x1d

In [ ]:

x2d = np.array([[1, 2, 3], [4, 5, 6]])
x2d

len() and the Numpy functions size() and shape() give information aboout the number of elements and the structure of the Numpy array:

In [ ]:

print('1-d array:')
print(x1d)
print('len(x1d) = ', len(x1d))
print('np.size(x1d) = ', np.size(x1d))
print('np.shape(x1d) = ', np.shape(x1d))
print('np.ndim(x1d) = ', np.ndim(x1d))
print('\n2-d array:')
print(x2d)
print('len(x2d) = ', len(x2d))
print('np.size(x2d) = ', np.size(x2d))
print('np.shape(x2d) = ', np.shape(x2d))
print('np.ndim(x2d) = ', np.ndim(x2d))

Create random data

In [ ]:

x = np.random.randn(4,3)
x

Joining (stacking together) arrays

In [ ]:

x = np.random.randint(0, 5, size=(2, 3))
print(x)
y = np.random.randint(5, 10, size=(2, 3))
print(y)

In [ ]:

np.vstack((x,y))

In [ ]:

np.hstack((x,y))

Create equally spaced data

In [ ]:

np.arange(start = 1, stop = 10, step = 2)

In [ ]:

np.linspace(start = 0, stop = 1, num = 11)

Interpolation¶

Consider the following data:

In [ ]:

y = [5,  4, 10,  8,  1, 10,  2,  7,  1,  3]

Suppose we want to create data in between the given data points (interpolation); for instance, let's try to double the resolution of the data by generating twice as many data:

In [ ]:

t = np.linspace(0, len(y), len(y))       # time vector for the original data
tn = np.linspace(0, len(y), 2 * len(y))  # new time vector for the new time-normalized data
yn = np.interp(tn, t, y)                 # new time-normalized data
yn

The key is the Numpy interp function, from its help:

interp(x, xp, fp, left=None, right=None)       
One-dimensional linear interpolation.   
Returns the one-dimensional piecewise linear interpolant to a function with given values at discrete data-points.

A plot of the data will show what we have done:

In [ ]:

%matplotlib inline
import matplotlib.pyplot as plt
plt.figure(figsize=(10,5))
plt.plot(t, y, 'bo-', lw=2, label='original data')
plt.plot(tn, yn, '.-', color=[1, 0, 0, .5], lw=2, label='interpolated')
plt.legend(loc='best', framealpha=.5)
plt.show()

For more about Numpy, see http://www.numpy.org/.

Read and save files¶

There are two kinds of computer files: text files and binary files:

Text file: computer file where the content is structured as a sequence of lines of electronic text. Text files can contain plain text (letters, numbers, and symbols) but they are not limited to such. The type of content in the text file is defined by the Unicode encoding (a computing industry standard for the consistent encoding, representation and handling of text expressed in most of the world's writing systems).

Binary file: computer file where the content is encoded in binary form, a sequence of integers representing byte values.

Let's see how to save and read numeric data stored in a text file:

Using plain Python

In [ ]:

f = open("newfile.txt", "w")           # open file for writing
f.write("This is a test\n")            # save to file
f.write("And here is another line\n")  # save to file
f.close()
f = open('newfile.txt', 'r')           # open file for reading
f = f.read()                           # read from file
print(f)

In [ ]:

help(open)

Using Numpy

In [ ]:

import numpy as np
data = np.random.randn(3,3)
np.savetxt('myfile.txt', data, fmt="%12.6G")    # save to file
data = np.genfromtxt('myfile.txt', unpack=True) # read from file
data

Ploting with matplotlib¶

Matplotlib is the most-widely used packge for plotting data in Python. Let's see some examples of it.

In [ ]:

import matplotlib.pyplot as plt

Use the IPython magic %matplotlib inline to plot a figure inline in the notebook with the rest of the text (no longer needed if in Jupyter):

In [ ]:

%matplotlib inline

In [ ]:

import numpy as np

In [ ]:

t = np.linspace(0, 0.99, 100)
x = np.sin(2 * np.pi * 2 * t)
n = np.random.randn(100) / 5
plt.Figure(figsize=(12,8))
plt.plot(t, x, label='sine', linewidth=2)
plt.plot(t, x + n, label='noisy sine', linewidth=2)
plt.annotate(text='$sin(4 \pi t)$', xy=(.2, 1), fontsize=20, color=[0, 0, 1])
plt.legend(loc='best', framealpha=.5)
plt.xlabel('Time [s]')
plt.ylabel('Amplitude')
plt.title('Data plotting using matplotlib')
plt.show()

Use the IPython magic %matplotlib qt to plot a figure in a separate window (from where you will be able to change some of the figure proprerties), but it doesn't work in Google Colab:

In [ ]:

%matplotlib qt

In [ ]:

mu, sigma = 10, 2
x = mu + sigma * np.random.randn(1000)
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
ax1.plot(x, 'ro')
ax1.set_title('Data')
ax1.grid()

n, bins, patches = ax2.hist(x, 25, density=True, facecolor='r') # histogram
ax2.set_xlabel('Bins')
ax2.set_ylabel('Probability')
ax2.set_title('Histogram')
fig.suptitle('Another example using matplotlib', fontsize=18, y=1)
ax2.grid()

plt.tight_layout()
plt.show()

And a window with the following figure should appear (it doesn't work in Google Colab):

In [ ]:

from IPython.display import Image
Image(url="./../images/plot.png")

You can switch back and forth between inline and separate figure using the %matplotlib magic commands used above. There are plenty more examples with the source code in the matplotlib gallery.

In [ ]:

# get back the inline plot
%matplotlib inline

Signal processing with Scipy¶

The Scipy package has a lot of functions for signal processing, among them: Integration (scipy.integrate), Optimization (scipy.optimize), Interpolation (scipy.interpolate), Fourier Transforms (scipy.fftpack), Signal Processing (scipy.signal), Linear Algebra (scipy.linalg), and Statistics (scipy.stats). As an example, let's see how to use a low-pass Butterworth filter to attenuate high-frequency noise and how the differentiation process of a signal affects the signal-to-noise content. We will also calculate the Fourier transform of these data to look at their frequencies content.

In [ ]:

from scipy.signal import butter, filtfilt
import scipy.fftpack
freq = 100.
t = np.arange(0,1,.01);
w = 2*np.pi*1 # 1 Hz
y = np.sin(w*t)+0.1*np.sin(10*w*t)
# Butterworth filter
b, a = butter(4, (5/(freq/2)), btype = 'low')
y2 = filtfilt(b, a, y)
# 2nd derivative of the data
ydd = np.diff(y,2)*freq*freq   # raw data
y2dd = np.diff(y2,2)*freq*freq # filtered data
# frequency content
yfft = np.abs(scipy.fftpack.fft(y))/(y.size/2);   # raw data
y2fft = np.abs(scipy.fftpack.fft(y2))/(y.size/2); # filtered data
freqs = scipy.fftpack.fftfreq(y.size, 1./freq)
yddfft = np.abs(scipy.fftpack.fft(ydd))/(ydd.size/2);
y2ddfft = np.abs(scipy.fftpack.fft(y2dd))/(ydd.size/2);
freqs2 = scipy.fftpack.fftfreq(ydd.size, 1./freq)

And the plots:

In [ ]:

fig, ((ax1,ax2),(ax3,ax4)) = plt.subplots(2, 2, figsize=(12, 6))

ax1.set_title('Temporal domain', fontsize=14)
ax1.plot(t, y, 'r', linewidth=2, label = 'raw data')
ax1.plot(t, y2, 'b', linewidth=2, label = 'filtered @ 5 Hz')
ax1.set_ylabel('f')
ax1.legend(frameon=False, fontsize=12)

ax2.set_title('Frequency domain', fontsize=14)
ax2.plot(freqs[:int(yfft.size/4)], yfft[:int(yfft.size/4)],'r',  lw=2,label='raw data')
ax2.plot(freqs[:int(yfft.size/4)],y2fft[:int(yfft.size/4)],'b--',lw=2,label='filtered @ 5 Hz')
ax2.set_ylabel('FFT(f)')
ax2.legend(frameon=False, fontsize=12)

ax3.plot(t[:-2], ydd, 'r', linewidth=2, label = 'raw')
ax3.plot(t[:-2], y2dd, 'b', linewidth=2, label = 'filtered @ 5 Hz')
ax3.set_xlabel('Time [s]'); ax3.set_ylabel("f ''")

ax4.plot(freqs[:int(yddfft.size/4)], yddfft[:int(yddfft.size/4)], 'r', lw=2, label = 'raw')
ax4.plot(freqs[:int(yddfft.size/4)],y2ddfft[:int(yddfft.size/4)],'b--',lw=2, label='filtered @ 5 Hz')
ax4.set_xlabel('Frequency [Hz]'); ax4.set_ylabel("FFT(f '')")
plt.show()

For more about Scipy, see https://docs.scipy.org/doc/scipy/reference/tutorial/.

Symbolic mathematics with Sympy¶

Sympy is a package to perform symbolic mathematics in Python. Let's see some of its features:

In [ ]:

from IPython.display import display
import sympy as sym
from sympy.interactive import printing
printing.init_printing()

Define some symbols and the create a second-order polynomial function (a.k.a., parabola):

In [ ]:

x, y = sym.symbols('x y')
y = x**2 - 2*x - 3
y

Plot the parabola at some given range:

In [ ]:

from sympy.plotting import plot
%matplotlib inline
plot(y, (x, -3, 5));

And the roots of the parabola are given by:

In [ ]:

sym.solve(y, x)

We can also do symbolic differentiation and integration:

In [ ]:

dy = sym.diff(y, x)
dy

In [ ]:

sym.integrate(dy, x)

For example, let's use Sympy to represent three-dimensional rotations. Consider the problem of a coordinate system xyz rotated in relation to other coordinate system XYZ. The single rotations around each axis are illustrated by:

In [ ]:

from IPython.display import Image
Image(url="./../images/rotations.png")

The single 3D rotation matrices around Z, Y, and X axes can be expressed in Sympy:

In [ ]:

from IPython.core.display import Math
from sympy import symbols, cos, sin, Matrix, latex
a, b, g = symbols('alpha beta gamma')

RX = Matrix([[1, 0, 0], [0, cos(a), -sin(a)], [0, sin(a), cos(a)]])
display(Math(r'\mathbf{R_{X}}=' + latex(RX, mat_str = 'matrix')))

RY = Matrix([[cos(b), 0, sin(b)], [0, 1, 0], [-sin(b), 0, cos(b)]])
display(Math(r'\mathbf{R_{Y}}=' + latex(RY, mat_str = 'matrix')))

RZ = Matrix([[cos(g), -sin(g), 0], [sin(g), cos(g), 0], [0, 0, 1]])
display(Math(r'\mathbf{R_{Z}}=' + latex(RZ, mat_str = 'matrix')))

And using Sympy, a sequence of elementary rotations around X, Y, Z axes is given by:

In [ ]:

RXYZ = RZ*RY*RX
display(Math(r'\mathbf{R_{XYZ}}=' + latex(RXYZ, mat_str = 'matrix')))

Suppose there is a rotation only around X ($\alpha$) by $\pi/2$; we can get the numerical value of the rotation matrix by substituing the angle values:

In [ ]:

r = RXYZ.subs({a: np.pi/2, b: 0, g: 0})
r

And we can prettify this result:

In [ ]:

display(Math(r'\mathbf{R_{(\alpha=\pi/2)}}=' + latex(r.n(3, chop=True), mat_str = 'matrix')))

For more about Sympy, see http://docs.sympy.org/latest/tutorial/.

Data analysis with pandas¶

"pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python."

To work with labellled data, pandas has a type called DataFrame (basically, a matrix where columns and rows have may names and may be of different types) and it is also the main type of the software R. Fo ezample:

In [ ]:

import pandas as pd

In [ ]:

x = 5*['A'] + 5*['B']
x

In [ ]:

df = pd.DataFrame(np.random.rand(10,2), columns=['Level 1', 'Level 2'] )
df['Group'] = pd.Series(['A']*5 + ['B']*5)
plot = df.boxplot(by='Group')

In [ ]:

from pandas.plotting import scatter_matrix
df = pd.DataFrame(np.random.randn(100, 3), columns=['A', 'B', 'C'])
plot = scatter_matrix(df, alpha=0.5, figsize=(8, 6), diagonal='kde')

pandas is aware the data is structured and give you basic statistics considerint that and nicely formatted:

In [ ]:

df.describe()

For more on pandas, see this tutorial: http://pandas.pydata.org/pandas-docs/stable/10min.html.

More about Python¶

There is a lot of good material in the internet about Python for scientific computing, here is a small list of interesting stuff:

How To Think Like A Computer Scientist or the interactive edition (book)
Python Scientific Lecture Notes (lecture notes)
Python Data Science Handbook (tutorial/book)
A gallery of interesting Jupyter Notebooks

Tutorial on Python for scientific computing¶

Contents

Scope of this tutorial¶

Python as a calculator¶

The import function¶

Object-oriented programming¶

Python and IPython help¶

Tab completion in IPython¶

The four most helpful commands in IPython¶

Comments¶

Magic functions¶

Assignment and expressions¶

Variables and types¶

Numbers: int, float, complex¶

Strings¶

len()¶

Lists¶

Tuples¶

Sets¶

Dictionaries¶

Built-in Constants¶

Logical (Boolean) operators¶

and, or, not¶

Comparisons¶

Indentation and whitespace¶

Control of flow¶

if...elif...else¶

for¶

The range() function¶

while¶

Function definition¶

Numeric data manipulation with Numpy¶

Interpolation¶

Read and save files¶

Ploting with matplotlib¶

Signal processing with Scipy¶

Symbolic mathematics with Sympy¶

Data analysis with pandas¶

More about Python¶

`if`...`elif`...`else`¶

The `range()` function¶