Marcos Duarte

Laboratory of Biomechanics and Motor Control http://demotu.org

Federal University of ABC, Brazil

*The Python programming language with its ecosystem for scientific programming has features, maturity, and a community of developers and users that makes it the ideal environment for the scientific community.*

*This talk will show some of these features and usage examples.*

*If you are viewing this notebook online (served by http://nbviewer.ipython.org), you can click the button 'View as Slides' on the toolbar above to start the slide show.*

In [1]:

```
from IPython.display import Image
Image(filename='../images/lifecycle_FPerez.png') # From F. Perez
```

Out[1]:

*Python is a programming language that lets you work more quickly and integrate your systems more effectively. You can learn to use Python and see almost immediate gains in productivity and lower maintenance costs* [python.org].

*Python is an interpreted, object-oriented, high-level programming language with dynamic semantics. Its high-level built in data structures, combined with dynamic typing and dynamic binding, well suited for Rapid Application Development and for scripting or glue language to connect existing components. Python's simple, easy to learn syntax emphasizes readability and therefore reduces the cost of program maintenance. Python supports modules and packages, which encourages program modularity and code reuse. The Python interpreter and standard libraries are available without charge for all major platforms, and can be freely distributed* [Python documentation].

As a scientist, what I do it's similar to this other fellow:

In [2]:

```
from IPython.display import YouTubeVideo
YouTubeVideo('9ZlBUglE6Hc', width=480, height=360, rel=0)
```

Out[2]:

- Numpy: fundamental package for scientific computing with a N-dimensional array package.
- Scipy: numerical routines for scientific computing.
- Matplotlib: comprehensive 2D Plotting.
- Sympy: symbolic mathematics.
- Pandas: data structures and data analysis tools.
- Jupyter Notebook: web application for creating and sharing documents with live code, equations, visualizations and text.
- Statsmodels: to explore data, estimate statistical models, and perform statistical tests.
- Scikit-learn: tools for data mining and data analysis (including machine learning).
- Pillow: Python Imaging Library.
- Spyder: interactive development environment.

Python is not the best programming language for all needs and for all people. There is no such language. But, if you are doing scientific computing, chances are that Python is perfect for you because:

- Python is free, open source, and cross-platform.
- Python is easy to learn, with readable code, well documented, and with a huge and friendly user community.
- Python is a real programming language, able to handle a variety of problems, easy to scale from small to huge problems, and easy to integrate with other systems (including other programming languages).
- Python code is not the fastest but Python is one the fastest languages for programming. It is not uncommon in science to care more about the time we spend programming than the time the program took to run. But if code speed is important, one can easily integrate in different ways a code written in other languages (such as C and Fortran) with Python.
- The Jupyter Notebook is a versatile tool for programming, data visualization, plotting, simulation, numeric and symbolic mathematics, and writing for daily use.

In [3]:

```
from IPython.display import IFrame
IFrame('http://cacm.acm.org/blogs/blog-cacm/176450-python-is-now-the-most-popular-' +
'introductory-teaching-language-at-top-us-universities/fulltext',
width='100%', height=450)
```

Out[3]:

The Jupyter Notebook App is a server-client application that allows editing and running notebook documents via a web browser. The Jupyter Notebook App can be executed on a local desktop requiring no internet access (as described in this document) or installed on a remote server and accessed through the internet.

Notebook documents (or “notebooks”, all lower case) are documents produced by the Jupyter Notebook App which contain both computer code (e.g. python) and rich text elements (paragraph, equations, figures, links, etc...). Notebook documents are both human-readable documents containing the analysis description and the results (figures, tables, etc..) as well as executable documents which can be run to perform data analysis.

In [4]:

```
from IPython.display import IFrame
IFrame('https://jupyter.org/', width='100%', height=450)
```

Out[4]:

**The easy way**

The easiest way to get Python and the most popular packages for scientific programming is to install them with a Python distribution such as Anaconda. In fact, you don't even need to install Python in your computer, you can run Python for scientific programming in the cloud using python.org, SageMathCloud, Wakari, pythonanywhere, or repl.it.

**The hard way**

You can download Python and all individual packages you need and install them one by one. In general, it's not that difficult, but it can become challenging and painful for certain big packages heavily dependent on math, image visualization, and your operating system (i.e., Microsoft Windows).

Go to the *Anaconda* website and download the appropriate version for your computer (but download Anaconda3! for Python 3.x). The file is big (about 350 MB). From their website:

**Linux Install**

In your terminal window type and follow the instructions:

`bash Anaconda3-4.1.1-Linux-x86_64.sh`

**OS X Install**

For the graphical installer, double-click the downloaded .pkg file and follow the instructions

For the command-line installer, in your terminal window type and follow the instructions:

`bash Anaconda3-4.1.1-MacOSX-x86_64.sh`

**Windows**

Double-click the .exe file to install Anaconda and follow the instructions on the screen

In [5]:

```
# pip install version_information
%load_ext version_information
%version_information numpy, scipy, matplotlib, sympy, pandas, ipython, jupyter
```

Out[5]:

There is a lot of good material in the internet about Python for scientific computing, some of them are:

- How To Think Like A Computer Scientist or the interactive edition (book)
- Python Scientific Lecture Notes (lecture notes)
- Lectures on scientific computing with Python (lecture notes)
- A gallery of interesting IPython Notebooks

Once in the IPython notebook, if you type a simple mathematical expression and press Shift+Enter it will give the result of the expression:

In [6]:

```
1 + 2 - 5
```

Out[6]:

In [7]:

```
import math # use the import function to import the math library
math.sqrt(12)
```

Out[7]:

In [8]:

```
x = 1
y = 1 + math.pi
y
```

Out[8]:

- Bolleans: True, False
- NoneType: None
- Numbers: int, float, complex
- Sequences: list, tuple, range
- Text sequence: str
- Binary sequence: bytes, bytearray, memoryview
- Mapping: dict
- Set: set, frozenset
- Boolean operations: and, or, not
- Comparisons: <, <=, >, >=, ==, !=, is, is not
- Math operations: +, -, *, /, //, %, **
- Bitwise operations: |, ^, &, <<, >>, ~

In [9]:

```
s = 'P' + 'y' + 't' + 'h' + 'o' + 'n'
print(s)
print(s*5)
```

Strings can be subscripted (indexed); like in C, the first character of a string has subscript (index) 0:

In [10]:

```
print('s[0] = ', s[0], ' (s[index], start at 0)')
print('s[5] = ', s[5])
print('s[-1] = ', s[-1], ' (last element)')
print('s[:] = ', s[:], ' (all elements)')
print('s[1:] = ', s[1:], ' (from this index (inclusive) till the last (inclusive))')
print('s[2:4] = ', s[2:4], ' (from 1st index (inclusive) till 2nd index (exclusive))')
print('s[:2] = ', s[:2], ' (till this index, exclusive)')
print('s[:10] = ', s[:10], ' (Python handles the index if it''s larger than length)')
print('s[-10:] = ', s[-10:])
print('s[0:5:2] = ', s[0:5:2], ' (s[ini:end:step])')
print('s[::2] = ', s[::2], ' (s[::step], initial and final indexes can be omitted)')
print('s[0:5:-1] = ', s[::-1], ' (s[::-step] reverses the string)')
print('s[:2] + s[2:] = ', s[:2] + s[2:], ' (this sounds natural with Python indexing)')
```

In [11]:

```
def fibo(N):
"""Fibonacci series: the sum of two elements defines the next.
The series is calculated till the input parameter N and
returned as an ouput variable.
"""
a, b, c = 0, 1, []
while b < N:
c.append(b)
a, b = b, a + b
return c
```

In [12]:

```
fibo(9)
```

Out[12]:

In [13]:

```
def bmi(weight, height):
"""Body mass index calculus and categorization.
Enter the weight in kg and the height in m.
See http://en.wikipedia.org/wiki/Body_mass_index
"""
bmi = weight / height**2
if bmi < 15:
c = 'very severely underweight'
elif 15 <= bmi < 16:
c = 'severely underweight'
elif 16 <= bmi < 18.5:
c = 'underweight'
elif 18.5 <= bmi < 25:
c = 'normal'
elif 25 <= bmi < 30:
c = 'overweight'
elif 30 <= bmi < 35:
c = 'moderately obese'
elif 35 <= bmi < 40:
c = 'severely obese'
else:
c = 'very severely obese'
s = 'For a weight of {0:.1f} kg and a height of {1:.2f} m,\n\
the body mass index (bmi) is {2:.1f} kg/m2,\n\
which is considered {3:s}.'\
.format(weight, height, bmi, c)
print(s)
```

In [14]:

```
bmi(70, 1.90);
```

Numpy is the fundamental package for scientific computing in Python and has a N-dimensional array package convenient to work with numerical data. With Numpy it's much easier and faster to work with numbers grouped as 1-D arrays (a vector), 2-D arrays (like a table or matrix), or higher dimensions.

In [15]:

```
import numpy as np
x = np.array([1, 2, 3, 4, 5, 6])
print(x)
x = np.random.randn(2,4)
print(x)
```

*A moving-average filter has the general formula:*

Here are two different versions of a function to implement the moving-average filter:

In [16]:

```
import numpy as np
def mav1(x, window):
"""Moving average of 'x' with window size 'window'."""
y = np.empty(len(x)-window+1)
for i in range(len(y)):
y[i] = np.sum(x[i:i+window])/window
return y
def mav2(x, window):
"""Moving average of 'x' with window size 'window'."""
xsum = np.cumsum(x)
xsum[window:] = xsum[window:] - xsum[:-window]
return xsum[window-1:]/window
```

In [17]:

```
x = np.random.randn(300)/10
x[100:200] += 1
window = 10
print('Performance of mav1:')
%timeit mav1(x, window)
print('Performance of mav2:')
%timeit mav2(x, window)
```

Matplotlib is the most-widely used packge for plotting data in Python. Let's see some examples of it.

In [18]:

```
import matplotlib.pyplot as plt
#%matplotlib notebook
%matplotlib inline
import numpy as np
```

In [19]:

```
y1 = mav1(x, window)
y2 = mav2(x, window)
# plot
fig, ax = plt.subplots(1, 1, figsize=(8, 4))
ax.plot(x, 'b-', linewidth=1, label = 'raw data')
ax.plot(y1, 'y-', linewidth=2, label = 'moving average 1')
ax.plot(y2, 'g--', linewidth=2, label = 'moving average 2')
ax.legend(frameon=False, loc='upper right', fontsize=12)
ax.set_xlabel("Data #")
ax.set_ylabel("Amplitude")
ax.grid();
```

In [20]:

```
plt.figure(figsize=(8, 4))
plt.plot(x, 'b-', linewidth=1, label = 'raw data')
plt.plot(y1, 'y-', linewidth=2, label = 'moving average 1')
plt.plot(y2, 'g--', linewidth=2, label = 'moving average 2')
plt.legend(frameon=False, loc='upper right', fontsize=12)
plt.xlabel("Data #")
plt.ylabel("Amplitude")
plt.grid()
plt.show()
```

Plot figure in an external window (outside the ipython notebook area):

In [21]:

```
#%matplotlib qt
```

In [22]:

```
mu, sigma = 10, 2
x = mu + sigma * np.random.randn(1000)
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 4))
ax1.plot(x, 'ro')
ax1.set_title('Data')
ax1.grid()
n, bins, patches = ax2.hist(x, 25, normed=True, facecolor='r') # histogram
ax2.set_xlabel('Bins')
ax2.set_ylabel('Probability')
ax2.set_title('Histogram')
fig.suptitle('Another example using matplotlib', fontsize=18, y=1.02)
ax2.grid()
plt.tight_layout()
plt.show()
```

In [23]:

```
# get back the inline plot
#%matplotlib inline
#%matplotlib notebook
```

Instead of "`%matplotlib inline`

" you can use "`%matplotlib notebook`

" which gives you a nice toolbar for zooming, panning, etc. The caveat is that once "`%matplotlib notebook`

" is used you can't alternate between matplotlib backends as we just did.

Sympy is a package to perform symbolic mathematics in Python. Let's see some of its features:

In [24]:

```
from IPython.display import display
import sympy as sym
from sympy.interactive import printing
printing.init_printing()
```

Define some symbols and the create a second-order polynomial function (a.k.a., parabola), plot, and find the roots:

In [25]:

```
x, y = sym.symbols('x y')
y = -x**3 + 4*x
y
```

Out[25]:

In [26]:

```
from sympy.plotting import plot
%matplotlib inline
plot(y, (x, -3, 3));
```

In [27]:

```
sym.solve(y, x)
```

Out[27]:

Let's run stuff from:

This entire document was written in the Jupyter Notebook (which can be statically viewed here or downloaded here). If you are watching my presentation right now, these slides are just a visualization of the same notebook (probably using the RISE: "Live" Reveal.js Jupyter/IPython Slideshow Extension).