The goal of this tutorial is to introduce the MKS while advancing our knowledge of scientific Python.
The tutorial consists of 6 notebooks.
Introduce scientific computing with Python
Introduce MKS
Simple 1D filter
Cross-validation
Scaling coefficients
Improve performance, optimize the MKS to do a full Cahn-Hilliard simulation
numexper
, pyfftw
, parallel IPythonTo start:
$ git clone https://github.com/wd15/pymks
$ cd ~/pymks/notebooks
$ ipython notebook
Check your version of the pymks
repository. Should be the same as mine
!git log -1
commit bb64ae063b79a73093e77eac624101f3a364c499
Author: Daniel Wheeler <daniel.wheeler2@gmail.com>
Date: Sun Jan 12 21:26:10 2014 -0500
Finshed exercises and solutions.
Fix #31
if not, do the following
!git pull origin master
From github.com:wd15/pymks * branch master -> FETCH_HEAD Already up-to-date.
You can now branch the working copy to make your own changes
!git checkout -b my_branch
!git branch
Use the issue tracker
%matplotlib inline
%load_ext autoreload
%autoreload 2
import numpy as np
import matplotlib.pyplot as plt
import fipy as fp
import pymks
import sklearn
import numexpr
import line_profiler
import pyfftw
pymks.test()
plt.plot(np.sin(np.linspace(0, 2 * np.pi, 1000)))
The autoreload extension is already loaded. To reload it, use: %reload_ext autoreload
[<matplotlib.lines.Line2D at 0x1942dd90>]
A gentle introduction to scientific computing with Python.
Python has 4 important container types, lists, tuples, dictionaries and sets.
my_list = ['a', 1, 2, 1.1, 'a']
type(my_list)
list
my_tuple = ('a', 1, 2, 1.1, 'a')
type(my_tuple)
tuple
my_dict = {'a': 'a', 'b' : 1, 1 : 2, 2.1 : 1.1, 'e' : 'a'}
type(my_dict)
dict
my_set = set(my_list)
print my_set
type(my_set)
set(['a', 1, 2, 1.1])
set
Lists are mutable
my_list[1] = 2
my_list
['a', 2, 2, 1.1, 'a']
and can be generated with "list comprehensions"
[i**2 for i in range(4)]
[0, 1, 4, 9]
Tuples are immutable (unchangable)
my_tuple[1] = 2
Dictionaries map keys to values and support item assignment. They are hash tables. Hash table look up is independent of the number of key:value
pairs. key
must be something hashable.
print my_dict
{'a': 'a', 1: 2, 2.1: 1.1, 'b': 1, 'e': 'a'}
only immutable objects can be hashed
{[1] : 'a'}
--------------------------------------------------------------------------- TypeError Traceback (most recent call last) <ipython-input-98-a5f062075012> in <module>() ----> 1 {[1] : 'a'} TypeError: unhashable type: 'list'
{(1, 2) : 'b'}
{(1, 2): 'b'}
my_dict['key'] = 'item'
del my_dict['a']
print my_dict
{1: 2, 'b': 1, 'e': 'a', 2.1: 1.1, 'key': 'item'}
x = [1, 2, 3]
y = x
print x
print y
[1, 2, 3] [1, 2, 3]
x[2] = 4
print x
print y
[1, 2, 4] [1, 2, 4]
class A:
pass
x = A()
y = x
print x
print y
<__main__.A instance at 0x191fd3b0> <__main__.A instance at 0x191fd3b0>
x = A()
print x
print y
<__main__.A instance at 0x191fdb90> <__main__.A instance at 0x191fd3b0>
Why do we need Numpy if we already have lists and tuples?
np.array((0., 1, 2, 3, 1.1))
array([ 0. , 1. , 2. , 3. , 1.1])
np.array((0, 1, 2, 2, 1)).dtype
dtype('int64')
N = 100000
a = [1., 0.2] * N
b = [0.5, 0.4] * N
print a[:10]
print b[:10]
[1.0, 0.2, 1.0, 0.2, 1.0, 0.2, 1.0, 0.2, 1.0, 0.2] [0.5, 0.4, 0.5, 0.4, 0.5, 0.4, 0.5, 0.4, 0.5, 0.4]
def multiply(a, b):
for i in xrange(len(a)):
c = a[i] * b[i]
%timeit multiply(a, b)
100 loops, best of 3: 16 ms per loop
import numpy as np
a = np.array(a)
b = np.array(b)
%timeit c = a * b
1000 loops, best of 3: 224 µs per loop
Vector operations are more succinct
c = a * b
beats
for i in xrange(len(c)):
c[i] = a[i] * b[i]
Numpy indexing tricks
a = np.arange(10)**3
print a
print a[2]
print a[2:5]
[ 0 1 8 27 64 125 216 343 512 729] 8 [ 8 27 64]
Fancy assignment
a[:6:2] = 1000.
print a
[1000 1 1000 27 1000 125 216 343 512 729]
Reverse an array
print a[::-1]
[ 729 512 343 216 125 1000 27 1000 1 1000]
See http://wiki.scipy.org/Tentative_NumPy_Tutorial for more indexing and reshaping tricks.
web-based interactive computational environment
exectute code in other interpreters and port output into python
parallel and distributed computing
rich media, maths
Execute code from other interpreters
%%bash --out cpuinfo
cat /proc/cpuinfo | grep processor
print cpuinfo
Nproc = len(cpuinfo.splitlines())
print Nproc
processor : 0 processor : 1 processor : 2 processor : 3 processor : 4 processor : 5 processor : 6 processor : 7 8
%%bash --out notebooks
ls -al | grep [^\ ].ipynb
print notebooks
-rw-r--r-- 1 wd15 machine 2961 Jan 13 08:47 00 - Workshop Overview.ipynb -rw-r--r-- 1 wd15 machine 38458 Jan 13 14:35 01 - Python Intro.ipynb -rw-r--r-- 1 wd15 machine 20758 Jan 12 17:07 02 - MKS Intro.ipynb -rw-r--r-- 1 wd15 machine 9922 Jan 12 22:25 03 - Simple Filter.ipynb -rw-r--r-- 1 wd15 machine 14630 Jan 12 21:25 04 - Cross Validation.ipynb -rw-r--r-- 1 wd15 machine 7641 Jan 12 17:07 05 - Scaling Coefficients.ipynb -rw-r--r-- 1 wd15 machine 22817 Jan 12 21:24 06 - Improving Performance.ipynb -rw-r--r-- 1 wd15 machine 1707 Jan 13 08:51 reproducible_research_talk.ipynb -rw-r--r-- 1 wd15 machine 28177972 Jan 3 17:06 sandbox.ipynb -rw-r--r-- 1 wd15 machine 6746 Jan 12 21:25 Solutions.ipynb
sorted(notebooks.splitlines(), key=lambda k: int(k.split()[4]))[::-1]
['-rw-r--r-- 1 wd15 machine 28177972 Jan 3 17:06 sandbox.ipynb', '-rw-r--r-- 1 wd15 machine 38458 Jan 13 14:35 01 - Python Intro.ipynb', '-rw-r--r-- 1 wd15 machine 22817 Jan 12 21:24 06 - Improving Performance.ipynb', '-rw-r--r-- 1 wd15 machine 20758 Jan 12 17:07 02 - MKS Intro.ipynb', '-rw-r--r-- 1 wd15 machine 14630 Jan 12 21:25 04 - Cross Validation.ipynb', '-rw-r--r-- 1 wd15 machine 9922 Jan 12 22:25 03 - Simple Filter.ipynb', '-rw-r--r-- 1 wd15 machine 7641 Jan 12 17:07 05 - Scaling Coefficients.ipynb', '-rw-r--r-- 1 wd15 machine 6746 Jan 12 21:25 Solutions.ipynb', '-rw-r--r-- 1 wd15 machine 2961 Jan 13 08:47 00 - Workshop Overview.ipynb', '-rw-r--r-- 1 wd15 machine 1707 Jan 13 08:51 reproducible_research_talk.ipynb']
The %%script
cell magic lets you run a cell in a subprocess of any interpreter on your system, such as: bash, ruby, perl, zsh, R, etc.
Using the IPython Notebook rather than the command line for common tasks encourages better record keeping.
from IPython.parallel import Client
engines = Client()
engines
<IPython.parallel.client.client.Client at 0x1702c490>
Make a "direct view" of the engines
dview = engines[:]
print "engines IDs",engines.ids
dview.block = True
dview.activate()
engines IDs [0, 1, 2, 3, 4, 5, 6, 7]
Calculate eigenvalues for lots of small arrays.
N = 1000
M = 4
np.random.seed(201)
A = np.random.random((N, M, M))
Evec = np.zeros((N, M), dtype=complex)
print type(A)
print A.dtype
print A.shape
<type 'numpy.ndarray'> float64 (1000, 4, 4)
?np.linalg.eig
eig
calculates eigenvalues and eigenvectors, $A v_i = \lambda_i v_i$
np.linalg.eig([[1, 0], [0, 0.5]])
(array([ 1. , 0.5]), array([[ 1., 0.], [ 0., 1.]]))
for i in xrange(N):
Evec[i] = np.linalg.eig(A[i])[0]
This can be written in one line of code.
Evec_zip = np.array(zip(*map(np.linalg.eig, A))[0])
print np.allclose(Evec_zip, Evec)
True
Let's try this in parallel. First, place parts of A
on each of the engines.
dview.scatter('A', A)
How has A
been partitioned?
for a in dview['A']:
print a.shape
(125, 4, 4) (125, 4, 4) (125, 4, 4) (125, 4, 4) (125, 4, 4) (125, 4, 4) (125, 4, 4) (125, 4, 4)
Do the parallel calculation.
?px
%px import numpy as np
%px Evec = np.array(zip(*map(np.linalg.eig, A))[0])
%px print Evec.shape
[stdout:0] (125, 4) [stdout:1] (125, 4) [stdout:2] (125, 4) [stdout:3] (125, 4) [stdout:4] (125, 4) [stdout:5] (125, 4) [stdout:6] (125, 4) [stdout:7] (125, 4)
Gather the result.
Evec_parallel = dview.gather('Evec')
Evec_parallel.shape
(1000, 4)
Check that it worked.
np.allclose(Evec, Evec_parallel)
True
Use an alternative method for the np.array(zip(*map(np.linalg.eig, A))[0])
code fragment. Try using a list comprehension [i for i in range(4)]
along with np.concatenate
and np.reshape
.
Use %timeit
to compare the calculation speeds in serial, parallel and for different numbers of engines.
dview = engines[:Neng]
creates views on different numbers of enginesdview.block = True
. What happens if dview.block = False
?%px
knows which view to use (dview.activate()
).if done, capture the output with %%capture out
, parse the output, and then plot the time versus number of engines.