Taking a step back: say you want to do something as simple as iterating over a range of values. The standard way of doing this in Python is
for i in range(3):
print(i)
0 1 2
This produces a list
l = range(3)
l
[0, 1, 2]
Python lists
have a method __iter__
help(l.__iter__)
Help on method-wrapper object:

__iter__ = class method-wrapper(object)
 |  Methods defined here:
 |
 |  __call__(...)
 |      x.__call__(...) <==> x(...)
 |
 |  __cmp__(...)
 |      x.__cmp__(y) <==> cmp(x,y)
 |
 |  __getattribute__(...)
 |      x.__getattribute__('name') <==> x.name
 |
 |  __hash__(...)
 |      x.__hash__() <==> hash(x)
 |
 |  __repr__(...)
 |      x.__repr__() <==> repr(x)
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |
 |  __objclass__
 |
 |  __self__
When the for
statement is used on a target l
, it will first run
li = iter(l)
li
<listiterator at 0x103dfdbd0>
dir(li)
['__class__', '__delattr__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__iter__', '__length_hint__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'next']
Here li
is an iterator, and any object on which iter()
can be called is called an iterable.
What defines an iterator is the existence of a next()
method.
li.next()
0
After an iterator for the list
has been created in the for
statement, li.next()
is going to be called over and over until we get a StopIteration
exception.
li.next()
1
li.next()
2
li.next()
--------------------------------------------------------------------------- StopIteration Traceback (most recent call last) <ipython-input-9-2933b5c23264> in <module>() ----> 1 li.next() StopIteration:
This works by the object keeping track of where it is in the count.
An illustrative implementation of the for
statement as a function:
from __future__ import print_function

def my_for(iterable, iter_func):
    it = iter(iterable)
    while True:
        try:
            item = it.next()
        except StopIteration:
            break
        # Iteration code goes here
        iter_func(item)

my_for(range(3), print)
0 1 2
So what happened was that we called range
, which gave a list
, which was converted to a listiterator
, which we then called until it stopped us.
Imagine we want to iterate over a VERY long range, say
l = range(1000000);
Then we are spending time building this list
, followed by spending memory storing this list
. Only to convert it to a listiterator
when we iterate over it!
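To get a feel for the cost, compare the size of a materialized million-element list with the size of an iterator over the same values (figures are interpreter dependent, and in Python 3 terms: list(range(n)) plays the role that Python 2's range(n) does here):

```python
import sys

n = 1000000
big_list = list(range(n))  # a million integers held in memory at once
lazy = iter(range(n))      # essentially just a current position and a bound

print(sys.getsizeof(big_list))  # on the order of megabytes
print(sys.getsizeof(lazy))      # a few dozen bytes
```

Note that sys.getsizeof only counts the list's own pointer array, not the integer objects it refers to, so the real footprint of the list is even larger.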
But counting isn't that hard, so let us instead implement an iterator directly
class iter_range(object):
    def __init__(self, stop):
        self.current = 0
        self.stop = stop

    def next(self):
        if self.current >= self.stop:
            raise StopIteration
        value = self.current
        self.current += 1
        return value

    def __iter__(self):
        return self
This pattern of __iter__
returning self
and next()
returning the next value is referred to in Python as the iterator protocol.
L = [1,2,3,4]
iter(L)
<listiterator at 0x10396a0d0>
range_4 = iter(iter_range(4))
for i in range_4:
    print(i)
0 1 2 3
range_4
<__main__.iter_range at 0x10396a310>
dir(range_4)
['__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__iter__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'current', 'next', 'stop']
Now we only need to keep track of two numbers rather than a million like above; all the intermediate numbers are calculated on the fly.
As you might know, this functionality is already available in Python 2 as the much more memory-efficient xrange
object.
li = xrange(4)
iter(li)
<rangeiterator at 0x10396f1e0>
So what happens with your iterator once it has thrown the StopIteration
exception?
range_4.next()
--------------------------------------------------------------------------- StopIteration Traceback (most recent call last) <ipython-input-22-f904677c93d7> in <module>() ----> 1 range_4.next() <ipython-input-14-6affdce826b3> in next(self) 6 def next(self): 7 if self.current >= self.stop: ----> 8 raise StopIteration 9 value = self.current 10 self.current += 1 StopIteration:
In Python terminology we call the iterator exhausted.
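To recap, the whole protocol fits in one small self-contained class. This countdown iterator is a hypothetical example (not from the notebook); aliasing next to __next__ lets the same class run under both Python 2 and Python 3, since Python 3 renamed the method:

```python
class countdown(object):
    """Iterator yielding n, n-1, ..., 1."""
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # An iterator is its own iterable, per the protocol.
        return self

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        value = self.n
        self.n -= 1
        return value

    next = __next__  # Python 2 spelling of the same method

print(list(countdown(3)))  # [3, 2, 1]
```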
Defining an iterator like this is very common as soon as you have some sort of data which you only need to touch once. This being Python, there's a quicker and more convenient way of implicitly satisfying the iterator protocol.
def iter_range(stop):
current = 0
while current < stop:
yield current
current += 1
The yield
keyword can foremost be thought of as a replacement to return
.
range_4 = iter_range(4)
range_4
<generator object iter_range at 0x1039683c0>
When a function containing a yield
statement is called, it does not run the body of the function; instead it returns a generator.
The first time the .next()
method of the generator is called, the body executes up to and including the first yield
statement. There it pauses until the generator's .next()
method is called again.
def iter_range(stop):
    current = 0
    print("Before loop")
    while current < stop:
        print("Before yield")
        yield current
        current += 1
        print("After yield")
    print("After loop")
range_4 = iter_range(4)
range_4.next()
Before loop Before yield
0
range_4.next()
After yield Before yield
1
When the body of the function runs to completion, in this case meaning we exit the while
loop and execute the final print
call, the generator will raise StopIteration
.
range_4.next()
After yield Before yield
2
range_4.next()
After yield Before yield
3
range_4.next()
--------------------------------------------------------------------------- StopIteration Traceback (most recent call last) <ipython-input-32-f904677c93d7> in <module>() ----> 1 range_4.next() StopIteration:
After yield After loop
When generators were introduced they became very popular, but many generators are so short and simple that even a small function definition feels cumbersome. Later versions of Python therefore support generator expressions, analogous to list comprehensions.
These are great for chaining and filtering generators:
range_5 = iter_range(5)
range_5_even = (i for i in range_5 if i % 2 == 0)
range_5_even
<generator object <genexpr> at 0x103968410>
for i in range_5_even:
print(i)
Before loop Before yield 0 After yield Before yield After yield Before yield 2 After yield Before yield After yield Before yield 4 After yield After loop
The lazy nature of iterators makes them well suited for chaining: you can build small logical building blocks out of generators, and delay executing them until you have a chain with all the processing you need.
Large data input -> generator -> generator -> generator -> for x in s:
The for
statement will finally pull the data through all the generators.
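A toy version of such a chain, built from generator expressions (the stage names are illustrative); no value is computed until the final list comprehension pulls data through all three stages:

```python
data = range(10)                            # the "large data input"
squares = (x * x for x in data)             # stage 1: square everything
evens = (x for x in squares if x % 2 == 0)  # stage 2: keep even squares
small = (x for x in evens if x < 50)        # stage 3: drop large values
result = [x for x in small]                 # the pull at the end
print(result)  # [0, 4, 16, 36]
```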
An in-depth slideshow about this is available here: http://www.dabeaz.com/generators/Generators.pdf
itertools
http://docs.python.org/2/library/itertools.html
The built-in module itertools
contains many common iterators, iterator versions of built-in list functions, and some combinatoric generators.
Some of the more self-explanatory ones are chain(), count(), cycle(), ifilter(), islice(), izip(), and takewhile().
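A few of these in action (note that ifilter() and izip() are the Python 2 names; in Python 3 the plain builtins filter() and zip() are already lazy):

```python
from itertools import chain, count, islice, takewhile

# chain concatenates iterables lazily
print(list(chain([1, 2], [3, 4])))                     # [1, 2, 3, 4]

# count() counts forever; islice() takes a lazy slice of it
print(list(islice(count(10), 3)))                      # [10, 11, 12]

# takewhile() yields elements until the predicate first fails
print(list(takewhile(lambda x: x < 4, [1, 3, 5, 2])))  # [1, 3]
```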
tee(it, n)
Takes an iterator it
and copies it into n
independent iterators.
from itertools import tee
range_5 = iter_range(5)
r1, r2 = tee(range_5, 2)
r1.next()
Before loop Before yield
0
r2.next()
0
This enables you to work around the limitation of only being able to consume an iterator once. But it can be quite memory intensive, so you might consider using a list
instead if you feel like you need this.
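The memory cost comes from tee() buffering every value that one copy has consumed but another has not. A small demonstration (Python 3 spelling, next(a) rather than a.next()):

```python
from itertools import tee

a, b = tee(iter(range(5)), 2)

# Advancing only the first copy forces tee() to buffer the values
# 0, 1, 2 internally, so that b can still produce them later.
print(next(a), next(a), next(a))  # 0 1 2
print(next(b))                    # 0
```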
Example use: say we want to get pairs of consecutive elements from an iterator
from itertools import izip
def pairs(it):
a, b = tee(it, 2)
b.next() # Advance one step in the second copy
return izip(a, b)
range_4 = iter_range(4)
pairs_4 = pairs(range_4)
for i in pairs_4:
print(i)
Before loop Before yield After yield Before yield (0, 1) After yield Before yield (1, 2) After yield Before yield (2, 3) After yield After loop
Note that evaluating a generator expression likewise runs nothing; it just creates a generator object:
(2*i for i in range(3))
<generator object <genexpr> at 0x103968460>
range_4 = iter_range(4)
dir(range_4)
['__class__', '__delattr__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__iter__', '__name__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'close', 'gi_code', 'gi_frame', 'gi_running', 'next', 'send', 'throw']
One can send values to a generator using the .send()
method.
This is a way of using a generator "in reverse".
The syntax is a bit unintuitive, so let's illustrate with a simple example
def cr():
while True:
n = (yield)
yield 1
c = cr()
c.next()
c.next()
1
c.send(2)
c.next()
1
c.send()
--------------------------------------------------------------------------- TypeError Traceback (most recent call last) <ipython-input-67-5623d1f10680> in <module>() ----> 1 c.send() TypeError: send() takes exactly one argument (0 given)
def doubler():
while True:
n = (yield)
print(2*n)
d = doubler()
d.next()
for i in range_4:
d.send(i)
Before loop Before yield 0 After yield Before yield 2 After yield Before yield 4 After yield Before yield 6 After yield After loop
d.send(7)
14
d.send(10)
20
The doubler
function is a coroutine. In contrast to a generator, which is a producer, a coroutine can be seen as a consumer: it sits in memory, doing nothing, until a value is sent to it for consumption.
Some notes about the above code:
- Before any value can be sent, the coroutine must first be primed by calling .next(), since that is how we get to the point of the code that has the yield statement.
- When .send() is called, its argument becomes the value of the (yield) expression.
def doubler(target):
while True:
n = (yield)
target.send(2 * n)
def halfer():
while True:
n = (yield)
print(n / 2)
h = halfer()
h.next()
doubler_halfer = doubler(h)
doubler_halfer.next()
for i in range(3):
doubler_halfer.send(i)
0 1 2
Because calling .next()
by hand gets annoying after a while, let us quickly define a decorator for taking care of this
def coroutine(func):
def start(*args, **kwargs):
cr = func(*args, **kwargs)
cr.next()
return cr
return start
@coroutine
def doubler(target):
while True:
n = (yield)
target.send(2. * n)
@coroutine
def halfer(target):
while True:
n = (yield)
target.send(n / 2.)
@coroutine
def printer():
while True:
n = (yield)
print(str(n))
doubler_halfer_printer = doubler(halfer(printer()))
for i in range(4):
doubler_halfer_printer.send(i)
0.0 1.0 2.0 3.0
By combining coroutines we can make pipes which we are pushing data through:
source -> coroutine -> coroutine -> coroutine -> sink
The final coroutine in the chain, which doesn't pass data along to another coroutine, is referred to as a sink.
Conceptually this is very different from using generators, but we get the same result in the end (lazy data processing), so why would anyone want to use coroutines?
Unlike generators, coroutines can be branched
@coroutine
def broadcaster(targets):
while True:
value = (yield)
for target in targets:
target.send(value)
stuff = broadcaster([halfer(printer()), doubler(printer())])
for i in range(5):
stuff.send(i)
0.0 0.0 0.5 2.0 1.0 4.0 1.5 6.0 2.0 8.0
To illustrate how the execution in a generator is suspended until it's told to move on, consider this simple generator:
def gen():
yield 1
yield 2
g = gen()
g.next()
1
g.next()
2
The next time we run .next()
, the function defining the generator will finish, and implicitly return None
. This will cause the StopIteration
exception to be thrown.
g.next()
--------------------------------------------------------------------------- StopIteration Traceback (most recent call last) <ipython-input-5-d7e53364a9a7> in <module>() ----> 1 g.next() StopIteration:
And compare that to this coroutine (which technically also is a generator)
@coroutine
def cr():
a = (yield)
print(a)
c = cr()
Once the coroutine is initiated it will just wait until it's told to take control of the execution. When we call .send()
on it, it will take control of execution and run everything past the yield
. This will again cause the coroutine (generator) to implicitly return None
c.send(34)
--------------------------------------------------------------------------- StopIteration Traceback (most recent call last) <ipython-input-12-3c946281d63b> in <module>() ----> 1 c.send(34) StopIteration:
34
Note that the type of c
is a generator.
type(c)
generator
from termcolor import colored
for i in range(1, 100):
if not i % 7:
print(colored(i, "red"), end=" ")
else:
print(colored(i, "cyan"), end=" ")
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
@coroutine
def colorer(color, target):
while True:
text = str((yield))
target.send(colored(text, color))
stuff = broadcaster([halfer(colorer("red", printer())), \
doubler(colorer("cyan", printer()))])
for i in range(5):
stuff.send(i)
0.0 0.0 0.5 2.0 1.0 4.0 1.5 6.0 2.0 8.0
stuff.close()
Of course, just like with generators, the internal state of the coroutine is preserved until it returns, meaning the input we send to them can be used to change the state.
Coroutines have a method .close()
, which causes the coroutine to throw a GeneratorExit
exception at the yield
point.
@coroutine
def summer(target):
s = 0
try:
while True:
s += (yield)
except GeneratorExit:
target.send(s)
return
a = summer(printer())
for i in range(4):
a.send(i)
a.close()
6
@coroutine
def router(func, true_target, false_target):
while True:
v = (yield)
if func(v):
true_target.send(v)
else:
false_target.send(v)
@coroutine
def averager(target):
count = 0
s = 0.
while True:
s += (yield)
count += 1
target.send(s / count)
p = printer()
pipeline = router(lambda i: i % 2, colorer("red", p), averager(colorer("cyan", p)))
for i in range(10):
pipeline.send(i)
0.0 1 1.0 3 2.0 5 3.0 7 4.0 9
As an example, let's assume we want to check some letter frequencies when four-digit numbers are spelled out in different languages
We produce the data with this generator:
from random import randint
def number_producer(n=4):
fstr = '{0:0' + str(n) + 'd}'
n_max = 10 ** (n) - 1
while True:
yield fstr.format(randint(0, n_max))
producer = number_producer(10)
producer.next()
'3070970711'
And we map the digits to languages by these dictionaries
english = {0: "zero", 1: "one", 2: "two", 3: "three", 4: "four", \
5: "five", 6: "six", 7: "seven", 8: "eight", 9: "nine"}
svenska = {0: "noll", 1: "ett", 2: "tva", 3: "tre", 4: "fyra", \
5: "fem", 6: "sex", 7: "sju", 8: "atta", 9: "nio"}
The subtasks we wish to perform are these:
@coroutine
def digit_splitter(target):
try:
while True:
number = (yield)
for digit in number:
target.send(int(digit))
except GeneratorExit:
target.close()
@coroutine
def digit_mapper(map, target):
try:
while True:
digit = (yield)
target.send(map[digit])
except GeneratorExit:
target.close()
from collections import Counter
@coroutine
def letter_counter(target):
c = Counter()
try:
while True:
c.update((yield))
except GeneratorExit:
target.send(c)
target.close()
@coroutine
def count_formatter(target, n=5):
while True:
counts = (yield)
for letter, count in counts.most_common(n):
s = letter + " - " + "|" * count
target.send(s)
@coroutine
def multiline_printer():
s = ""
try:
while True:
s += (yield) + "\n"
except GeneratorExit:
print(s)
@coroutine
def broadcaster(targets):
try:
while True:
value = (yield)
for target in targets:
target.send(value)
except GeneratorExit:
for target in targets:
target.close()
eng_result_interpreter = count_formatter(multiline_printer(), 10)
swe_result_interpreter = count_formatter(colorer("magenta", multiline_printer()))
eng_counter = digit_mapper(english, letter_counter(eng_result_interpreter))
swe_counter = digit_mapper(svenska, letter_counter(swe_result_interpreter))
number_analyser = digit_splitter(broadcaster([eng_counter, swe_counter]))
P = number_producer(4)
for i in xrange(10):
number_analyser.send(P.next())
number_analyser.close()
e - ||||||||||||||||||||||||||||||||||||||| r - ||||||||||||||||| o - |||||||||||||||| i - ||||||||||||||| t - |||||||||||||| n - ||||||||||||| h - ||||||||||| z - ||||||| f - |||||| s - |||||| t - |||||||||||||||||||||||| e - ||||||||||||||||| a - |||||||||||||| l - |||||||||||||| o - |||||||||||
Built-in: cProfile
%%file hanoi.py
"""Solve 'Towers of Hanoi'
"""
import sys
PRINT = False
@profile
def solve(g,n):
X = [sum(g[0])]
Y = [sum(g[1])]
Z = [sum(g[2])]
moved = 0
for i in range(2**n - 1):
tops = [a[0] for a in g]
movable = False
j = 0
while not movable:
max_legal = max([t for t in tops if g[j][0] % 2 != t % 2])
if g[j][0] != moved and g[j][0] > 0:
num_zeros = len([z for z in tops if z == 0])
if g[j][0] < max_legal or num_zeros > 0:
movable = True
if not movable:
j += 1
moved = g[j][0]
legal = False
k = 2
while not legal:
if (tops[k] % 2 != g[j][0] % 2 and g[j][0] < tops[k]) or tops[k] == 0:
legal = True
if not legal:
k -= 1
g[k] = [g[j][0]] + g[k]
g[j] = g[j][1::]
if PRINT:
print g
S = [sum(s) for s in g]
X += [S[0]]
Y += [S[1]]
Z += [S[2]]
return (X,Y,Z)
if __name__ == "__main__":
n = int(sys.argv[-1])
game = [range(1,n+1)+[0], [0], [0]]
X, Y, Z = solve(game,n)
Overwriting hanoi.py
Note that the @profile decorator is only defined when the script is run under kernprof (line_profiler) or memory_profiler, which inject a profile builtin; for a plain cProfile run you would need to remove the decorator or define a no-op profile function.
%%bash
python -m cProfile hanoi.py 20
6756455 function calls in 14.319 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   14.319   14.319 hanoi.py:2(<module>)
        1   12.494   12.494   14.319   14.319 hanoi.py:8(solve)
  1513582    0.203    0.000    0.203    0.000 {len}
  2097140    0.575    0.000    0.575    0.000 {max}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        2    0.031    0.015    0.031    0.015 {range}
  3145728    1.017    0.000    1.017    0.000 {sum}
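cProfile can also be used from within a program rather than from the command line. This generic sketch (Python 3, not tied to hanoi.py; the work function is just a placeholder) profiles a single call and prints the top entries by cumulative time:

```python
import cProfile
import io
import pstats

def work():
    # Some arbitrary computation to profile.
    return sum(i * i for i in range(100000))

profiler = cProfile.Profile()
profiler.enable()
work()
profiler.disable()

buf = io.StringIO()
stats = pstats.Stats(profiler, stream=buf)
stats.sort_stats("cumulative").print_stats(5)  # five costliest entries
print(buf.getvalue())
```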
When one has a lot of functions to compare, just reading a text table can be unclear. There are various parsers for these profiling tables; one which is particularly nice and browser based is snakeviz.
pip install snakeviz
We profile again, but specify an output file
%%bash
python -m cProfile -o profile_name hanoi.py 18
Then we just call snakeviz with that file as an argument; this will start a web server which displays the results.
%%script --bg bash
snakeviz profile_name
Starting job # 2 in a separate thread.
(These cells with %%bash
and %%script
are just like typing in a terminal window)
Instrumenting profilers save time points every time an action happens, e.g. executing a function, and calculate timings from them.
Statistical profilers instead sample the call stack at short intervals, and use that data to figure out where most time is spent.
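A toy statistical profiler can be sketched in a few lines: a background thread periodically grabs every thread's current frame via sys._current_frames() and counts which function is on top. This is purely illustrative; real samplers like plop use timer signals and record whole call stacks:

```python
import collections
import sys
import threading
import time

def sampler(counts, interval, stop):
    # Sample the top frame of every running thread until told to stop.
    while not stop.is_set():
        for frame in sys._current_frames().values():
            counts[frame.f_code.co_name] += 1
        time.sleep(interval)

def busy():
    # A deliberately slow function the sampler should catch in the act.
    total = 0
    for i in range(3 * 10 ** 6):
        total += i * i
    return total

counts = collections.Counter()
stop = threading.Event()
thread = threading.Thread(target=sampler, args=(counts, 0.001, stop))
thread.start()
busy()
stop.set()
thread.join()

# 'busy' should dominate the sample counts.
print(counts.most_common(3))
```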
plop (Python Low Overhead Profiler)
%%bash
python -m plop.collector hanoi.py 21
profile output saved to /tmp/plop.out overhead was 1.6438374391e-05 per sample (0.0016438374391%)
!cat /tmp/plop.out
{(('<string>', 8, 'solve'), ('<string>', 2, '<module>'), ('/Users/valentinesvensson/.virtualenvs/devel/lib/python2.7/site-packages/plop/collector.py', 74, 'main'), ('/Users/valentinesvensson/.virtualenvs/devel/lib/python2.7/site-packages/plop/collector.py', 1, '<module>'), ('/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py', 62, '_run_code'), ('/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py', 136, '_run_module_as_main')): 2154}
%%script --bg bash
python -m plop.viewer --port=9876 --datadir=/tmp
Starting job # 3 in a separate thread.
%killbgscripts
All background processes were killed.
A more well-established alternative to plop is statprof
Sometimes one might want to know, on a line-by-line basis, what is taking up time.
pip install line_profiler
This requires you to put a @profile
decorator on the function(s) you wish to profile
%%bash
kernprof.py --line-by-line hanoi.py 12
Wrote profile results to hanoi.py.lprof
%%bash
python -m line_profiler hanoi.py.lprof
Timer unit: 1e-06 s

File: hanoi.py
Function: solve at line 7
Total time: 0.332164 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     7                                           @profile
     8                                           def solve(g,n):
     9         1            6      6.0      0.0      X = [sum(g[0])]
    10         1            2      2.0      0.0      Y = [sum(g[1])]
    11         1            2      2.0      0.0      Z = [sum(g[2])]
    12
    13         1            1      1.0      0.0      moved = 0
    14      4096         5217      1.3      1.6      for i in range(2**n - 1):
    15     16380        23871      1.5      7.2          tops = [a[0] for a in g]
    16      4095         5216      1.3      1.6          movable = False
    17      4095         5005      1.2      1.5          j = 0
    18     12279        15471      1.3      4.7          while not movable:
    19     32736        58374      1.8     17.6              max_legal = max([t for t in tops if g[j][0] % 2 != t % 2])
    20      8184        12901      1.6      3.9              if g[j][0] != moved and g[j][0] > 0:
    21     23400        31866      1.4      9.6                  num_zeros = len([z for z in tops if z == 0])
    22      5850         8690      1.5      2.6                  if g[j][0] < max_legal or num_zeros > 0:
    23      4095         5343      1.3      1.6                      movable = True
    24      8184        10186      1.2      3.1              if not movable:
    25      4089         5318      1.3      1.6                  j += 1
    26
    27      4095         5414      1.3      1.6          moved = g[j][0]
    28
    29      4095         5204      1.3      1.6          legal = False
    30      4095         5195      1.3      1.6          k = 2
    31     12279        15767      1.3      4.7          while not legal:
    32      8184        16257      2.0      4.9              if (tops[k] % 2 != g[j][0] % 2 and g[j][0] < tops[k]) or tops[k] == 0:
    33      4095         5386      1.3      1.6                  legal = True
    34      8184        10304      1.3      3.1              if not legal:
    35      4089         5568      1.4      1.7                  k -= 1
    36
    37      4095         9705      2.4      2.9          g[k] = [g[j][0]] + g[k]
    38      4095         8777      2.1      2.6          g[j] = g[j][1::]
    39
    40      4095         5705      1.4      1.7          if PRINT:
    41                                                       print g
    42
    43     16380        28153      1.7      8.5          S = [sum(s) for s in g]
    44      4095         8937      2.2      2.7          X += [S[0]]
    45      4095         7506      1.8      2.3          Y += [S[1]]
    46      4095         6815      1.7      2.1          Z += [S[2]]
    47
    48         1            2      2.0      0.0      return (X,Y,Z)
We spent quite a while talking about generators, and the primary gain from those is reduced memory usage. Commonly we would like to know where we can avoid unnecessary memory growth.
pip install memory_profiler
pip install psutil
Then one must decorate the functions one wishes to memory profile with a @profile
decorator, just like with line_profiler
.
%%bash
python -m memory_profiler hanoi.py 6
Filename: hanoi.py

Line #    Mem usage    Increment   Line Contents
================================================
     7                             @profile
     8     8.145 MB     0.000 MB   def solve(g,n):
     9     8.145 MB     0.000 MB       X = [sum(g[0])]
    10     8.145 MB     0.000 MB       Y = [sum(g[1])]
    11     8.145 MB     0.000 MB       Z = [sum(g[2])]
    12
    13     8.145 MB     0.000 MB       moved = 0
    14     8.160 MB     0.016 MB       for i in range(2**n - 1):
    15     8.246 MB     0.086 MB           tops = [a[0] for a in g]
    16     8.246 MB     0.000 MB           movable = False
    17     8.168 MB    -0.078 MB           j = 0
    18     8.172 MB     0.004 MB           while not movable:
    19     8.246 MB     0.074 MB               max_legal = max([t for t in tops if g[j][0] % 2 != t % 2])
    20     8.168 MB    -0.078 MB               if g[j][0] != moved and g[j][0] > 0:
    21     8.246 MB     0.078 MB                   num_zeros = len([z for z in tops if z == 0])
    22     8.246 MB     0.000 MB                   if g[j][0] < max_legal or num_zeros > 0:
    23     8.188 MB    -0.059 MB                       movable = True
    24     8.246 MB     0.059 MB               if not movable:
    25     8.242 MB    -0.004 MB                   j += 1
    26
    27     8.246 MB     0.004 MB           moved = g[j][0]
    28
    29     8.246 MB     0.000 MB           legal = False
    30     8.168 MB    -0.078 MB           k = 2
    31     8.246 MB     0.078 MB           while not legal:
    32     8.250 MB     0.004 MB               if (tops[k] % 2 != g[j][0] % 2 and g[j][0] < tops[k]) or tops[k] == 0:
    33     8.188 MB    -0.062 MB                   legal = True
    34     8.246 MB     0.059 MB               if not legal:
    35     8.242 MB    -0.004 MB                   k -= 1
    36
    37     8.250 MB     0.008 MB           g[k] = [g[j][0]] + g[k]
    38     8.250 MB     0.000 MB           g[j] = g[j][1::]
    39
    40     8.160 MB    -0.090 MB           if PRINT:
    41                                         print g
    42
    43     8.250 MB     0.090 MB           S = [sum(s) for s in g]
    44     8.250 MB     0.000 MB           X += [S[0]]
    45     8.250 MB     0.000 MB           Y += [S[1]]
    46     8.250 MB     0.000 MB           Z += [S[2]]
    47
    48     8.250 MB     0.000 MB       return (X,Y,Z)
/Users/valentinesvensson/.virtualenvs/devel/lib/python2.7/site-packages/memory_profiler.py:34: UserWarning: psutil module not found. memory_profiler will be slow warnings.warn("psutil module not found. memory_profiler will be slow")
Do the about_generators.py
koans in python_koans
Run cProfile, plop, line_profiler and memory_profiler for the two scripts you created for the previous exercises.
With "the two scripts", I mean check_repo.py
and getting_data.py
.
Make a new directory in the root of the repository called "profile_results"
In there, put the results of the profiling, in the format {script name}.{profiler name}
. So the output from line_profiler
of getting_data.py
should be in a file called getting_data.py.line_profiler
.
Special note about plop
: the output is hardcoded to end up in /tmp/plop.out
. Therefore, when you have run plop
for e.g. getting_data.py
, you should move /tmp/plop.out
to a file named getting_data.py.plop
in the specified directory.
(It should be noted that storing profile results in a repository like you should do here is not a standard way of doing things, but it gives us a way to assess you for this assignment.)