Third and final session on OOP. In this session we'll cover the concept of functors and how to build custom iterators.
A functor is an object that can be called as though it were a function.
Any instance of a class that defines the __call__ special method is a functor.
So when is it useful to have an object that behaves like a function? When it needs to remember something between calls. A plain function does not, by itself, carry any memory from one call to the next. To keep state it would have to return its findings and take them back in as an argument on the next call, or modify a variable outside itself in place. Neither is a good design pattern: both complicate how the function is used and lower the level of abstraction.
The example below cleans a text string polluted with various special characters.
# string to be stripped of special characters
noisy_str = 't#hi?s i"s m¤y n%ois=y& str/in(g'
# list of special characters to strip out
special_chars = ['!', "'", '"', '#', '¤', '%', '&', '/', '(', ')', '=', '?']
First, let's start by getting the task done using a classic function definition:
def strip_string(string, special_chars):
    clean_str = ''.join([c for c in string if c not in special_chars])
    return clean_str
clean_str = strip_string(noisy_str, special_chars)
print(f'Cleaned string is: "{clean_str}"')
Cleaned string is: "this is my noisy string"
Now let's instead use a functor for the exact same task:
class StripStringOf():
    def __init__(self, special_chars):
        self.special_chars = special_chars

    def __call__(self, string):
        clean_str = ''.join([c for c in string if c not in self.special_chars])
        return clean_str
strip_functor = StripStringOf(special_chars) # initialize functor
clean_str = strip_functor(noisy_str) # use functor as a function
print(f'Cleaned string is: "{clean_str}"')
Cleaned string is: "this is my noisy string"
Because of the __call__ special method, we can now use the class instance just as if it were a function. Notice also how we save special_chars to self in the class constructor __init__, so they're remembered for later use.
To underline the functor's ability to remember, we'll extend the class with a memory of how many characters it has stripped:
class StripStringOf():
    def __init__(self, special_chars):
        self.special_chars = special_chars
        self.strip_nos = 0  # number of stripped characters

    def __call__(self, string):
        clean_str = ''.join([c for c in string if c not in self.special_chars])
        self.strip_nos += len(string) - len(clean_str)
        return clean_str
strip_functor = StripStringOf(special_chars) # initialize functor
clean_str = strip_functor(noisy_str)
print(f'Cleaned string is: "{clean_str}"')
print(f"So far I've stripped {strip_functor.strip_nos} special characters")
Cleaned string is: "this is my noisy string"
So far I've stripped 9 special characters
clean_str = strip_functor(noisy_str)
print(f'Cleaned string is: "{clean_str}"')
print(f"So far I've stripped {strip_functor.strip_nos} special characters")
Cleaned string is: "this is my noisy string"
So far I've stripped 18 special characters
So besides calling methods of a class instance, you now also know how to make the instance itself callable. This is a useful concept when you need a slightly more intelligent function with memory capabilities.
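Since a functor is callable, it can be handed to anything that expects a plain function. The sketch below passes one to map(); the word list is made up for illustration, and the class is a condensed variant of StripStringOf from above that stores the characters in a set.

```python
class StripStringOf:
    """Callable object that removes a fixed set of characters from strings."""

    def __init__(self, special_chars):
        self.special_chars = set(special_chars)  # a set gives fast membership tests
        self.strip_nos = 0  # running count of stripped characters

    def __call__(self, string):
        clean_str = ''.join(c for c in string if c not in self.special_chars)
        self.strip_nos += len(string) - len(clean_str)
        return clean_str


strip_functor = StripStringOf('#?!')
print(callable(strip_functor))  # True: the instance passes the built-in callable() check

# hypothetical sample data, made up for this illustration
noisy_words = ['he#llo', 'wor?ld', 'fu!nc#tor']
clean_words = list(map(strip_functor, noisy_words))  # functor used in place of a function

print(clean_words)              # ['hello', 'world', 'functor']
print(strip_functor.strip_nos)  # 4, accumulated across all three calls
```

The counter keeps growing across the calls made by map(), which a stateless function could not do without help from the caller.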
Iterators are a very memory-efficient way of looping over something you would rather not load into memory all at once. They're referred to as “lazy” because they only compute the next item when you need it. This also makes them ideal when you do not expect to complete a for loop (e.g. because you break out once the item you searched for is found).
Note that an iterable is something you can iterate over, while an iterator is the object that does the actual iterating.
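The distinction can be checked with the abstract base classes in collections.abc. This is a small sketch; the variable names are chosen for illustration only.

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]           # a list is an iterable ...
numbers_iter = iter(numbers)  # ... and iter() hands out an iterator over it

print(isinstance(numbers, Iterable), isinstance(numbers, Iterator))            # True False
print(isinstance(numbers_iter, Iterable), isinstance(numbers_iter, Iterator))  # True True

# each iter() call on the list gives a fresh, independent iterator
first, second = iter(numbers), iter(numbers)
a, b, c = next(first), next(first), next(second)
print(a, b, c)  # 1 2 1
```

An iterator is itself iterable, but a list is not an iterator: it keeps no track of a current position.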
In Python 3, built-in functions such as filter, map, enumerate, zip and reversed (partly covered in session 3) return iterator objects instead of lists. This is good for performance, but it sometimes requires a list() call to force the lazy object to do its computations.
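A minimal sketch of that behaviour, using map() with a made-up input list. Note that a lazy object can only be consumed once.

```python
squares = map(lambda x: x**2, [1, 2, 3])  # nothing is computed yet
print(type(squares))                      # <class 'map'>

first_pass = list(squares)   # list() forces the computation
second_pass = list(squares)  # the iterator is now exhausted

print(first_pass)   # [1, 4, 9]
print(second_pass)  # []
```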
# without iterator
long_list = list(range(1000))
print('Type =', type(long_list))
print(f'Memory size = {long_list.__sizeof__()} bytes')
print('Sum =', sum(long_list))
Type = <class 'list'>
Memory size = 9088 bytes
Sum = 499500
# with iterator
long_list_iter = iter(range(1000))
print('Type =', type(long_list_iter))
print(f'Memory size = {long_list_iter.__sizeof__()} bytes')
print('Sum =', sum(long_list_iter))
Type = <class 'range_iterator'>
Memory size = 32 bytes
Sum = 499500
All of the built-in iterable data structures can be converted into iterators via iter():
iter([1,2,3])
<list_iterator at 0x263cfefeac8>
iter((1,2,3))
<tuple_iterator at 0x263cfef3da0>
iter('string')
<str_iterator at 0x263cfef3b70>
iter({'1':2, '2':3})
<dict_keyiterator at 0x263cff0d3b8>
This feature is used in, for example, for loops: an iterator object is created automatically, and next() is then called on it once per iteration until the iterator is exhausted.
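Roughly speaking, a for loop can be written out by hand with iter(), next() and a StopIteration handler. The sketch below mimics `for item in letters: collected.append(item)`; the names are illustrative.

```python
letters = ['a', 'b', 'c']
collected = []

iterator = iter(letters)       # what the for loop does first
while True:
    try:
        item = next(iterator)  # fetch the next item
    except StopIteration:      # signals that the iterator is exhausted
        break
    collected.append(item)     # the loop body

print(collected)  # ['a', 'b', 'c']
```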
A special type of iterator is the generator (along with coroutines), previously presented in session 3. A generator is built by a function that contains one or more yield expressions, but it can also be defined via the more compact generator expression.
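As a quick reminder of the generator expression syntax, here is a small sketch with illustrative names. It looks like a list comprehension in parentheses, but yields its items lazily.

```python
squares_gen = (n**2 for n in range(5))  # generator expression, nothing computed yet
print(type(squares_gen))                # <class 'generator'>

first = next(squares_gen)  # computes only the first item
rest = list(squares_gen)   # consumes the remaining items

print(first)  # 0
print(rest)   # [1, 4, 9, 16]
```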
# defines a generator that keeps squaring a number
def squares(num):
    while True:
        num = num**2
        yield num
# initialize generator
generator = squares(2)
print('Type =', type(generator))
Type = <class 'generator'>
next(generator)
4
for x in generator:
    print(x)
    if x > 1000000: break
16
256
65536
4294967296
Generators provide a very convenient way to implement the iterator protocol with very limited code.
We can confirm that the Generator class is a subclass of the Iterator class by:
from collections.abc import Iterator, Generator # imports the respective abstract base classes
issubclass(Generator, Iterator)
True
issubclass(Iterator, Generator)
False
We can also confirm that the generator instance above is an instance of both the Generator and the Iterator class, while long_list_iter is only an instance of the Iterator class:
isinstance(generator, Generator)
True
isinstance(generator, Iterator)
True
isinstance(long_list_iter, Generator)
False
isinstance(long_list_iter, Iterator)
True
So in most cases we can either convert one of the typical data structures into an iterator or use the very convenient generator.
However, what if we need to build our own custom iterator that is not limited by the simple, convenient implementations above?
For this we'll have to define it as a class with the special methods __iter__ and __next__.
The squaring generator example from before is reproduced below as a proper iterator:
class Squares():
    def __init__(self, num):
        self.num = num

    def __iter__(self):
        return self  # an iterator returns itself when asked for an iterator

    def __next__(self):
        self.num = self.num**2
        return self.num
# initialize instance
iterator = Squares(2)
next(iterator)
4
for x in iterator:
    print(x)
    if x > 1000000: break
16
256
65536
4294967296
This is useful, for example, when you loop over content on a remote service: you don't want to download everything up front, and you need an intelligent object capable of dealing with connection issues. Maybe you would like it to try to reconnect 3 times before it raises a ConnectionLost exception.
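The reconnecting idea could be sketched roughly as follows. Everything here is hypothetical: ConnectionLost is a custom exception defined for the example, RemoteItems and FlakyFetch are made-up names, and the "remote service" is simulated by a functor that fails a fixed number of times before answering.

```python
class ConnectionLost(Exception):
    """Hypothetical exception raised when all reconnection attempts fail."""


class FlakyFetch:
    """Stand-in for a remote service: fails `failures` times before each answer."""

    def __init__(self, data, failures):
        self.data = data
        self.failures = failures
        self.remaining = failures

    def __call__(self, index):
        if self.remaining > 0:
            self.remaining -= 1
            raise ConnectionError('simulated connection drop')
        self.remaining = self.failures  # reset for the next request
        return self.data[index] if index < len(self.data) else None


class RemoteItems:
    """Iterator that fetches one item at a time and retries on connection errors."""

    def __init__(self, fetch, max_retries=3):
        self.fetch = fetch  # callable: index -> item, or None when nothing is left
        self.max_retries = max_retries
        self.index = 0

    def __iter__(self):
        return self

    def __next__(self):
        for _ in range(1 + self.max_retries):  # first attempt plus the retries
            try:
                item = self.fetch(self.index)
                break
            except ConnectionError:
                continue  # try to reconnect
        else:
            # the loop ran out of attempts without a successful fetch
            raise ConnectionLost(f'gave up on item {self.index}')
        if item is None:
            raise StopIteration
        self.index += 1
        return item


items = list(RemoteItems(FlakyFetch(['a', 'b', 'c'], failures=2)))
print(items)  # ['a', 'b', 'c'], despite two simulated drops per request
```

With failures=4 the first attempt and all three retries fail, so the iterator raises ConnectionLost instead of returning an item.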
Remember to keep using git for version control of your code. Preferably this should become a new good habit of yours.
To get some hands-on experience with functors, you will now implement your own Accumulator class. Instances of this class should continuously accumulate all incoming function call arguments. The current sum should be stored in a private attribute and made available as a property.
The following illustrates how your functor should behave:
from solution_module import Accumulator
# initializing two instances with different starting values
adder_A = Accumulator(start=2)
adder_B = Accumulator(start=3)
adder_A(3)
I've now added 3 to my sum
adder_B(4)
I've now added 4 to my sum
adder_A(5)
I've now added 5 to my sum
adder_B(6)
I've now added 6 to my sum
adder_A.current_sum
10
adder_B.current_sum
13
Remember that a property (also known as a getter method) is defined by adding the @property decorator to a method definition. The method name becomes the property name.
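As a reminder of the syntax, here is a sketch on an unrelated, made-up Thermometer class (not part of the exercise solution):

```python
class Thermometer:
    def __init__(self, celsius):
        self._celsius = celsius  # leading underscore: private by convention

    @property
    def celsius(self):
        return self._celsius

    @property
    def fahrenheit(self):
        # computed on access from the private attribute
        return self._celsius * 9 / 5 + 32


thermo = Thermometer(20)
print(thermo.celsius)     # 20, accessed without parentheses
print(thermo.fahrenheit)  # 68.0
```

Because no setter is defined, assigning to thermo.celsius raises an AttributeError, which keeps the value read-only from the outside.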
Re-implement the "count_to_10" counting generator from the session 3 material as a custom iterator class. The constructor (__init__) should use the default arguments start=0 and step=1. Besides the __iter__ and __next__ special methods, the object should also have a set_step method so the user can adjust the step size along the way.
The following illustrates how your iterator should behave:
from solution_module import Count_to_10
counter = Count_to_10(start=0, step=1)
next(counter)
1
next(counter)
2
counter.set_step(2)
for count in counter:
    print(count)
4
6
8
10
I can only count to 10! :(
Remember that an iterator raises a StopIteration exception when it is exhausted. This is done with the statement raise StopIteration.
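The pattern can be sketched on a made-up Countdown iterator. It is deliberately different from the exercise, and the class name and behaviour are illustrative only.

```python
class Countdown:
    """Finite iterator counting down from `start` to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # tells for loops, list() etc. that we are done
        value = self.current
        self.current -= 1
        return value


countdown = Countdown(3)
values = list(countdown)
print(values)  # [3, 2, 1]

# once exhausted, the iterator keeps raising StopIteration;
# next() with a default value returns the default instead
after_end = next(countdown, 'done')
print(after_end)  # done
```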