Note: Click on "Kernel" > "Restart Kernel and Clear All Outputs" in JupyterLab before reading this notebook to reset its output. If you cannot run this file on your machine, you may want to open it in the cloud .

Chapter 4: Recursion & Looping (continued)¶

After learning about the concept of recursion in the first part of this chapter, we look at other ways of running code repeatedly, namely looping with the for and while statements. We start with the latter as it is more generic. Throughout this second part of the chapter, we revisit the same examples from the first part to show how recursion and looping are really two sides of the same coin.

The `while` Statement¶

Whereas functions combined with if statements suffice to model any repetitive logic, Python comes with a compound while statement (cf., reference ) that often makes it easier to implement iterative ideas.

It consists of a header line with a boolean expression followed by an indented code block. Before the first and after every execution of the code block, the boolean expression is evaluated, and if it is (still) equal to True, the code block runs (again). Eventually, some variable referenced in the boolean expression is changed in the code block such that the condition becomes False.

If the condition is False before the first iteration, the entire code block is never executed. As the flow of control keeps "looping" (i.e., more formally, iterating) back to the beginning of the code block, this concept is also called a while-loop and each pass through the loop an iteration.

Trivial Example: Countdown (revisited)¶

Let's rewrite the countdown() example in an iterative style. We also build in input validation by allowing the function only to be called with strictly positive integers. As any positive integer hits $0$ at some point when iteratively decremented by $1$ , countdown() is guaranteed to terminate. Also, the base case is now handled at the end of the function, which commonly happens with iterative solutions to problems.

In [1]:

def countdown(n):
    """Print a countdown until the party starts.

    Args:
        n (int): seconds until the party begins; must be positive
    """
    while n != 0:
        print(n)
        n -= 1

    print("Happy new Year!")

In [2]:

countdown(3)

3
2
1
Happy new Year!

As PythonTutor shows, there is a subtle but essential difference in the way a while statement is treated in memory: In short, while statements can not run into a RecursionError as only one frame is needed to manage the names. After all, there is only one function call to be made. For typical day-to-day applications, this difference is, however, not so important unless a problem instance becomes so big that a large (i.e., $> 3.000$ ) number of recursive calls must be made.

"Still involved" Example: Euclid's Algorithm (revisited)

Finding the greatest common divisor of two numbers is still not so obvious when using a while-loop instead of a recursive formulation.

The iterative implementation of gcd() below accepts any two strictly positive integers. As in any iteration through the loop, the smaller number is subtracted from the larger one, the two decremented values of a and b eventually become equal. Thus, this algorithm is also guaranteed to terminate. If one of the two numbers were negative or $0$ in the first place, gcd() would run forever, and not even Python could detect this. Try this out by removing the input validation and running the function with negative arguments!

In [3]:

def gcd(a, b):
    """Calculate the greatest common divisor of two numbers.

    Args:
        a (int): first number; must be positive
        b (int): second number; must be positive

    Returns:
        gcd (int)
    """
    while a != b:
        if a > b:
            a -= b
        else:
            b -= a

    return a

In [4]:

gcd(12, 4)

Out[4]:

In [5]:

gcd(7, 7919)

Out[5]:

Efficiency of Algorithms (continued)¶

We also see that this implementation is a lot less efficient than its recursive counterpart which solves gcd() for the same two numbers 112233445566778899 and 987654321 within microseconds.

In [6]:

%%timeit -n 1 -r 1
gcd(112233445566778899, 987654321)

5.32 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)

Infinite Loops¶

As with recursion, we must ensure that the iteration ends. For the above countdown() and gcd() examples, we could "prove" (i.e., at least argue in favor) that some pre-defined termination criterion is reached eventually. However, this cannot be done in all cases, as the following example shows.

"Mystery" Example: Collatz Conjecture

Let's play the following game:

Think of any positive integer $n$ .
If $n$ is even, the next $n$ is half the old $n$ .
If $n$ is odd, multiply the old $n$ by $3$ and add $1$ to obtain the next $n$ .
Repeat these steps until you reach $1$ .

Do we always reach the final $1$ ?

The function below implements this game. Does it always reach $1$ ? No one has proven it so far! We include some input validation as before because collatz() would for sure not terminate if we called it with a negative number. Further, the Collatz sequence also works for real numbers, but then we would have to study fractals (cf., this ). So we restrict our example to integers only.

In [7]:

def collatz(n):
    """Print a Collatz sequence in descending order.

    Given a positive integer n, modify it according to these rules:
        - if n is even, the next n is half the previous one
        - if n is odd, the next n is 3 times the previous one plus 1
        - if n is 1, stop the iteration

    Args:
        n (int): a positive number to start the Collatz sequence at
    """
    while n != 1:
        print(n, end=" ")
        if n % 2 == 0:
            n //= 2  # //= to preserve the int type
        else:
            n = 3 * n + 1

    print(1)

Collatz sequences do not necessarily become longer with a larger initial n.

In [8]:

collatz(100)

100 50 25 76 38 19 58 29 88 44 22 11 34 17 52 26 13 40 20 10 5 16 8 4 2 1

In [9]:

collatz(1000)

1000 500 250 125 376 188 94 47 142 71 214 107 322 161 484 242 121 364 182 91 274 137 412 206 103 310 155 466 233 700 350 175 526 263 790 395 1186 593 1780 890 445 1336 668 334 167 502 251 754 377 1132 566 283 850 425 1276 638 319 958 479 1438 719 2158 1079 3238 1619 4858 2429 7288 3644 1822 911 2734 1367 4102 2051 6154 3077 9232 4616 2308 1154 577 1732 866 433 1300 650 325 976 488 244 122 61 184 92 46 23 70 35 106 53 160 80 40 20 10 5 16 8 4 2 1

In [10]:

collatz(10000)

10000 5000 2500 1250 625 1876 938 469 1408 704 352 176 88 44 22 11 34 17 52 26 13 40 20 10 5 16 8 4 2 1

In [11]:

collatz(100000)

100000 50000 25000 12500 6250 3125 9376 4688 2344 1172 586 293 880 440 220 110 55 166 83 250 125 376 188 94 47 142 71 214 107 322 161 484 242 121 364 182 91 274 137 412 206 103 310 155 466 233 700 350 175 526 263 790 395 1186 593 1780 890 445 1336 668 334 167 502 251 754 377 1132 566 283 850 425 1276 638 319 958 479 1438 719 2158 1079 3238 1619 4858 2429 7288 3644 1822 911 2734 1367 4102 2051 6154 3077 9232 4616 2308 1154 577 1732 866 433 1300 650 325 976 488 244 122 61 184 92 46 23 70 35 106 53 160 80 40 20 10 5 16 8 4 2 1

The `for` Statement¶

Recursion and the while statement are two sides of the same coin. Disregarding that in the case of recursion Python internally faces some additional burden for managing the stack of frames in memory, both approaches lead to the same computational steps in memory. More importantly, we can formulate any recursive implementation in an iterative way and vice versa despite one of the two ways often "feeling" a lot more natural given a particular problem.

So how does the compound for statement (cf., reference ) in this book's very first example fit into this picture? It is a redundant language construct to provide a shorter and more convenient syntax for common applications of the while statement. In programming, such additions to a language are called syntactic sugar. A cup of tea tastes better with sugar, but we may drink tea without sugar too.

Consider elements below. Without the for statement, we must manage a temporary index variable, index, to loop over all the elements and also obtain the individual elements with the [] operator in each iteration of the loop.

In [12]:

elements = [0, 1, 2, 3, 4]

In [13]:

index = 0

while index < len(elements):
    element = elements[index]
    print(element, end=" ")
    index += 1

del index

0 1 2 3 4

The for statement, on the contrary, makes the actual business logic more apparent by stripping all the boilerplate code away. The variable that is automatically set by Python in each iteration of the loop (i.e., element in the example) is called the target variable.

In [14]:

for element in elements:
    print(element, end=" ")

0 1 2 3 4

For sequences of integers, the range() built-in makes the for statement even more convenient: It creates a list-like object of type range that generates integers "on the fly," and we look closely at the underlying effects in memory in Chapter 8 .

In [15]:

for element in range(5):
    print(element, end=" ")

0 1 2 3 4

In [16]:

type(range(5))

Out[16]:

range

range() takes optional start and step arguments that we use to customize the sequence of integers even more.

In [17]:

for element in [1, 3, 5, 7, 9]:
    print(element, end=" ")

1 3 5 7 9

In [18]:

for element in range(1, 10, 2):
    print(element, end=" ")

1 3 5 7 9

Containers vs. Iterables¶

The essential difference between the above list objects, [0, 1, 2, 3, 4] and [1, 3, 5, 7, 9], and the range objects, range(5) and range(1, 10, 2), is that in the former case six objects are created in memory before the for statement starts running, one list holding references to five int objects, whereas in the latter case only one range object is created that generates int objects one at a time while the for-loop runs.

However, we can loop over both of them. So a natural question to ask is why Python treats objects of different types in the same way when used with a for statement.

So far, the overarching storyline in this book goes like this: In Python, everything is an object. Besides its identity and value, every object is characterized by "belonging" to one data type that determines how the object behaves and what we may do with it.

Now, just as we classify objects by data type, we also classify these data types (e.g., int, float, str, or list) into abstract concepts.

We did this already in Chapter 1 when we described a list object as "some sort of container that holds [...] references to other objects". So, abstractly speaking, containers are any objects that are "composed" of other objects and also "manage" how these objects are organized. list objects, for example, have the property that they model an order associated with their elements. There exist, however, other container types, many of which do not come with an order. So, containers primarily "contain" other objects and have nothing to do with looping.

On the contrary, the abstract concept of iterables is all about looping: Any object that we can loop over is, by definition, an iterable. So, range objects, for example, are iterables, even though they hold no references to other objects. Moreover, looping does not have to occur in a predictable order, although this is the case for both list and range objects.

Typically, containers are iterables, and iterables are containers. Yet, only because these two concepts coincide often, we must not think of them as the same. In Chapter 7 , we formalize these two concepts and introduce many more. Finally, Chapter 11 gives an explanation how abstract concepts are implemented and play together.

Let's continue with first_names below as an example an illustrate what iterable containers are.

In [19]:

first_names = ["Achim", "Berthold", "Carl", "Diedrich", "Eckardt"]

The characteristic operator associated with container types is the in operator: It checks if a given object evaluates equal to at least one of the objects in the container. Colloquially, it checks if an object is "contained" in the container. Formally, this operation is called membership testing.

In [20]:

"Achim" in first_names

Out[20]:

True

In [21]:

"Alexander" in first_names

Out[21]:

False

The cell below shows the exact workings of the in operator: Although 3.0 is not contained in elements, it evaluates equal to the 3 that is, which is why the following expression evaluates to True. So, while we could colloquially say that elements "contains" 3.0, it actually does not.

In [22]:

elements

Out[22]:

[0, 1, 2, 3, 4]

In [23]:

3.0 in elements

Out[23]:

True

Similarly, the characteristic operation of an iterable type is that it supports being looped over, for example, with the for statement.

In [24]:

for name in first_names:
    print(name, end="   ")

Achim   Berthold   Carl   Diedrich   Eckardt

If we must have an index variable in the loop's body, we use the enumerate() built-in that takes an iterable as its argument and then generates a "stream" of "pairs" of an index variable, i below, and an object provided by the iterable, name, separated by a ,. There is no need to ever revert to the while statement with an explicitly managed index variable to loop over an iterable object.

In [25]:

for i, name in enumerate(first_names, start=1):
    print(i, ">", name, end="   ")

1 > Achim   2 > Berthold   3 > Carl   4 > Diedrich   5 > Eckardt

enumerate() takes an optional start argument.

In [26]:

for i, name in enumerate(first_names, start=1):
    print(i, ">", name, end="   ")

1 > Achim   2 > Berthold   3 > Carl   4 > Diedrich   5 > Eckardt

The zip() built-in allows us to combine the elements of two or more iterables in a pairwise fashion: It conceptually works like a zipper for a jacket.

In [27]:

last_names = ["Müller", "Meyer", "Mayer", "Schmitt", "Schmidt"]

In [28]:

for first_name, last_name in zip(first_names, last_names):
    print(first_name, last_name, end="   ")

Achim Müller   Berthold Meyer   Carl Mayer   Diedrich Schmitt   Eckardt Schmidt

"Hard at first Glance" Example: Fibonacci Numbers (revisited)

In contrast to its recursive counterpart, the iterative fibonacci() function below is somewhat harder to read. For example, it is not so obvious as to how many iterations through the for-loop we need to make when implementing it. There is an increased risk of making an off-by-one error. Moreover, we need to track a temp variable along.

However, one advantage of calculating Fibonacci numbers in a forward fashion with a for statement is that we could list the entire sequence in ascending order as we calculate the desired number. To show this, we added print() statements in fibonacci() below.

We do not need to store the index variable in the for-loop's header line: That is what the underscore variable _ indicates; we "throw it away."

In [29]:

def fibonacci(i):
    """Calculate the ith Fibonacci number.

    Args:
        i (int): index of the Fibonacci number to calculate

    Returns:
        ith_fibonacci (int)
    """
    a = 0
    b = 1
    print(a, b, sep=" ", end=" ")  # added for didactical purposes
    for _ in range(i - 1):
        temp = a + b
        a = b
        b = temp
        print(b, end=" ")  # added for didactical purposes

    return b

In [30]:

fibonacci(12)  # = 13th number

0 1 1 2 3 5 8 13 21 34 55 89 144

Out[30]:

Efficiency of Algorithms (continued)¶

Another more important advantage is that now we may calculate even big Fibonacci numbers efficiently.

In [31]:

fibonacci(99)  # = 100th number

0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765 10946 17711 28657 46368 75025 121393 196418 317811 514229 832040 1346269 2178309 3524578 5702887 9227465 14930352 24157817 39088169 63245986 102334155 165580141 267914296 433494437 701408733 1134903170 1836311903 2971215073 4807526976 7778742049 12586269025 20365011074 32951280099 53316291173 86267571272 139583862445 225851433717 365435296162 591286729879 956722026041 1548008755920 2504730781961 4052739537881 6557470319842 10610209857723 17167680177565 27777890035288 44945570212853 72723460248141 117669030460994 190392490709135 308061521170129 498454011879264 806515533049393 1304969544928657 2111485077978050 3416454622906707 5527939700884757 8944394323791464 14472334024676221 23416728348467685 37889062373143906 61305790721611591 99194853094755497 160500643816367088 259695496911122585 420196140727489673 679891637638612258 1100087778366101931 1779979416004714189 2880067194370816120 4660046610375530309 7540113804746346429 12200160415121876738 19740274219868223167 31940434634990099905 51680708854858323072 83621143489848422977 135301852344706746049 218922995834555169026

Out[31]:

218922995834555169026

Easy Example: Factorial (revisited)

The iterative factorial() implementation is comparable to its recursive counterpart when it comes to readability. One advantage of calculating the factorial in a forward fashion is that we could track the intermediate product as it grows.

In [32]:

def factorial(n):
    """Calculate the factorial of a number.

    Args:
        n (int): number to calculate the factorial for, must be positive

    Returns:
        factorial (int)
    """
    product = 1  # because 0! = 1
    for i in range(1, n + 1):
        product *= i
        print(product, end=" ")  # added for didactical purposes

    return product

In [33]:

factorial(3)

1 2 6

Out[33]:

In [34]:

factorial(10)

1 2 6 24 120 720 5040 40320 362880 3628800

Out[34]:

Chapter 4: Recursion & Looping (continued)¶

The while Statement¶

Trivial Example: Countdown (revisited)¶

"Still involved" Example: Euclid's Algorithm (revisited)

Efficiency of Algorithms (continued)¶

Infinite Loops¶

"Mystery" Example: Collatz Conjecture

The for Statement¶

Containers vs. Iterables¶

"Hard at first Glance" Example: Fibonacci Numbers (revisited)

Efficiency of Algorithms (continued)¶

Easy Example: Factorial (revisited)

The `while` Statement¶

The `for` Statement¶