Note: Click on "Kernel" > "Restart Kernel and Clear All Outputs" in JupyterLab before reading this notebook to reset its output. If you cannot run this file on your machine, you may want to open it in the cloud.
While what we learned about the for and while statements in the second part of this chapter suffices to translate any iterative algorithm into code, both come with some syntactic sugar to make life easier for the developer. This last part of the chapter shows how we can further customize the looping logic and introduces a "trick" for situations where we cannot come up with a stopping criterion in a while-loop.
This section introduces additional syntax to customize for and while statements in our code even further. They are mostly syntactic sugar in that they do not change how a program runs but make its code more readable. We illustrate them for the for statement only. However, everything presented in this section also works for the while statement.
Is the square of a number in [7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4] greater than 100?
Let's say we have a list of numbers and want to check if the square of at least one of its elements is greater than 100. So, conceptually, we are asking whether a list of numbers as a whole satisfies a certain condition.
numbers = [7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]
A first naive implementation could look like this: We loop over every element in numbers and set an indicator variable is_above, initialized as False, to True once we encounter an element satisfying the condition.
This implementation is inefficient: Even if the first element in numbers has a square greater than 100, we still loop until the last element, which could take a long time for a big list.
Moreover, we must initialize is_above before the for-loop and write some if-else-logic after it to check for the result. The actual business logic is not conveyed in a clear way.
is_above = False
for number in numbers:
    print(number, end=" ")  # added for didactical purposes
    if number ** 2 > 100:
        is_above = True
if is_above:
    print("=> at least one number satisfies the condition")
else:
    print("=> no number satisfies the condition")
7 11 8 5 3 12 2 6 9 10 1 4 => at least one number satisfies the condition
The break Statement
Python provides the break statement (cf., reference) that lets us stop a loop prematurely at any iteration. It is yet another means of controlling the flow of execution, and we say that we "break out of a loop."
is_above = False
for number in numbers:
    print(number, end=" ")  # added for didactical purposes
    if number ** 2 > 100:
        is_above = True
        break
if is_above:
    print("=> at least one number satisfies the condition")
else:
    print("=> no number satisfies the condition")
7 11 => at least one number satisfies the condition
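To make the saving concrete, the following sketch counts how many iterations the search needs with and without break. The helper count_iterations() and its stop_early flag are illustrative names that do not appear elsewhere in this chapter.

```python
numbers = [7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]


def count_iterations(numbers, threshold, stop_early):
    """Return (found, iterations) for the search; hypothetical helper."""
    found = False
    iterations = 0
    for number in numbers:
        iterations += 1
        if number ** 2 > threshold:
            found = True
            if stop_early:
                break  # leave the loop at the first match
    return found, iterations


print(count_iterations(numbers, 100, stop_early=False))  # (True, 12)
print(count_iterations(numbers, 100, stop_early=True))   # (True, 2)
```

Without break, all twelve elements are visited; with break, the loop ends at 11, the second element.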
This is a computational improvement. However, the code still consists of three sections: Some initialization before the for-loop, the loop itself, and some finalizing logic. We prefer to convey the program's idea in one compound statement instead.
The else-clause
To express the logic in a prettier way, we add an else-clause at the end of the for-loop (cf., reference). The else-clause is executed only if the for-loop is not stopped with a break statement prematurely (i.e., before reaching the last iteration in the loop). The word "else" is somewhat unintuitive here, and the clause might better have been named a then-clause. In most use cases, however, the else-clause logically goes together with some if statement in the loop's body.
Overall, the code's expressive power increases. Not many programming languages support an optional else-branching for the for and while statements, which turns out to be very useful in practice.
for number in numbers:
    print(number, end=" ")  # added for didactical purposes
    if number ** 2 > 100:
        is_above = True
        break
else:
    is_above = False
if is_above:
    print("=> at least one number satisfies the condition")
else:
    print("=> no number satisfies the condition")
7 11 => at least one number satisfies the condition
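The rule for when the else-clause runs can be checked in isolation. The sketch below uses a hypothetical helper, describe(), that reports how its loop ended; note that a loop over an empty list never breaks, so its else-clause runs as well.

```python
def describe(numbers, threshold):
    """Report how the loop ended; hypothetical helper for illustration."""
    for number in numbers:
        if number ** 2 > threshold:
            outcome = f"break at {number}"
            break
    else:
        # reached only if the loop ran to completion without a break
        outcome = "else-clause ran"
    return outcome


print(describe([7, 11, 8], 100))  # break at 11
print(describe([7, 8, 9], 100))   # else-clause ran
print(describe([], 100))          # else-clause ran
```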
Lastly, we incorporate the finalizing if-else logic into the for-loop, avoiding the is_above variable altogether.
for number in numbers:
    print(number, end=" ")  # added for didactical purposes
    if number ** 2 > 100:
        print("=> at least one number satisfies the condition")
        break
else:
    print("=> no number satisfies the condition")
7 11 => at least one number satisfies the condition
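The same construct carries over to the while statement. The following is a minimal sketch of the search rewritten with a manually managed index variable; the name index is an assumption made for this illustration.

```python
numbers = [7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]

index = 0
while index < len(numbers):
    number = numbers[index]
    print(number, end=" ")
    if number ** 2 > 100:
        print("=> at least one number satisfies the condition")
        break
    index += 1  # advance manually; a while-loop has no implicit iteration
else:
    # runs only if the loop condition became False without a break
    print("=> no number satisfies the condition")
```

The for-loop version is shorter because it does not need the index bookkeeping, which is one reason to prefer it whenever we iterate over a known collection.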
Of course, if we raise the threshold that an element's square has to exceed, for example, to 200, we have to loop over all elements in numbers. There is no way to optimize this linear search further.
for number in numbers:
    print(number, end=" ")  # added for didactical purposes
    if number ** 2 > 200:
        print("=> at least one number satisfies the condition")
        break
else:
    print("=> no number satisfies the condition")
7 11 8 5 3 12 2 6 9 10 1 4 => no number satisfies the condition
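As an aside, Python's built-in any() function answers the same kind of question in one expression. The sketch below relies on a generator expression, a construct beyond the scope of this section, so treat it as a preview rather than as part of the chapter's approach. The variable names above_100 and above_200 are chosen for this illustration only.

```python
numbers = [7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]

# any() stops consuming the generator at the first True value,
# much like the break statement in the loops above
above_100 = any(number ** 2 > 100 for number in numbers)
above_200 = any(number ** 2 > 200 for number in numbers)

print(above_100)  # True
print(above_200)  # False
```

This is still a linear search: in the worst case, every element is inspected once.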
Often, we process some iterable with numeric data, for example, a list of numbers as in this book's introductory example in Chapter 1 or, more realistically, data from a CSV file with many rows and columns.
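Processing numeric data usually comes down to operations that may be grouped into one of the following three categories: mapping (transforming each element into a new value), filtering (keeping only the elements that satisfy a condition), and reducing (combining the elements into a single summary value).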
We study this map-filter-reduce paradigm extensively in Chapter 8 after introducing more advanced data types that are needed to work with "big" data.
Here, we focus on filtering out some numbers in a for-loop.
Calculate the sum of all even numbers in [7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4] after squaring them and adding 1 to the squares:
numbers = [7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]
total = 0
for number in numbers:
    if number % 2 == 0:  # only keep even numbers
        square = (number ** 2) + 1
        print(number, "->", square, end=" ")  # added for didactical purposes
        total += square
total
8 -> 65 12 -> 145 2 -> 5 6 -> 37 10 -> 101 4 -> 17
370
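For comparison, the same calculation can be written as a single expression with the built-in sum() function. This is only a sketch that uses a generator expression, which this section does not cover; the comments mark which part plays the mapping, filtering, and reducing role.

```python
numbers = [7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]

total = sum(
    (number ** 2) + 1      # map: square and add 1
    for number in numbers
    if number % 2 == 0     # filter: keep even numbers
)                          # reduce: sum() adds everything up

print(total)  # 370
```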
The above code is easy to read as it involves only two levels of indentation.
In general, code gets harder to comprehend the more horizontal space it occupies. It is commonly considered good practice to grow a program vertically rather than horizontally. PEP 8 accordingly limits lines to at most 79 characters!
Consider the next example, whose implementation in code already starts to look unbalanced.
Calculate the sum of every third and even number in [7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4] after squaring them and adding 1 to the squares:
total = 0
for i, number in enumerate(numbers, start=1):
    if i % 3 == 0:  # only keep every third number
        if number % 2 == 0:  # only keep even numbers
            square = (number ** 2) + 1
            print(number, "->", square, end=" ")  # added for didactical purposes
            total += square
total
8 -> 65 12 -> 145 4 -> 17
227
With already three levels of indentation, less horizontal space is available for the actual code block. Of course, one could flatten the two if statements with the logical and operator, as shown in Chapter 3. Then, however, we trade off horizontal space against a more "complex" if logic, and this is not a real improvement.
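For reference, the flattened variant mentioned above might look as follows. It saves one level of indentation, but the single condition grows with every additional filter.

```python
numbers = [7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]

total = 0
for i, number in enumerate(numbers, start=1):
    # both filters combined into one condition
    if i % 3 == 0 and number % 2 == 0:
        total += (number ** 2) + 1

print(total)  # 227
```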
The continue Statement
A Pythonista would instead make use of the continue statement (cf., reference) that causes a loop to jump into the next iteration, skipping the rest of the code block.
The revised code fragment below occupies more vertical space and less horizontal space: A good trade-off.
One caveat is that we need to negate the conditions in the if statements. Conceptually, we are now filtering "out" and not "in."
total = 0
for i, number in enumerate(numbers, start=1):
    if i % 3 != 0:  # only keep every third number
        continue
    elif number % 2 != 0:  # only keep even numbers
        continue
    square = (number ** 2) + 1
    print(number, "->", square, end=" ")  # added for didactical purposes
    total += square
total
8 -> 65 12 -> 145 4 -> 17
227
This is yet another illustration of why programming is an art. The two preceding code cells do the same with identical time complexity. However, the latter is arguably easier to read for a human, even more so when the business logic grows beyond two filters.
Sometimes we find ourselves in situations where we cannot know ahead of time how often or until which point in time a code block is to be executed.
Let's consider a game where we randomly choose a variable to be either "Heads" or "Tails" and the user of our program has to guess it.
Python provides the built-in input() function that prints a message to the user, called the prompt, and reads in what was typed in response as a str object. We use it to process a user's "unreliable" input to our program (i.e., a user might type in some invalid response). Further, we use the random() function in the random module to model the coin toss.
A popular pattern to approach such indefinite loops is to go with a while True statement, which on its own would cause Python to enter into an infinite loop. Then, once a particular event occurs, we break out of the loop.
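Stripped of any user interaction, the pattern can be sketched as follows: we toss a simulated coin until it comes up heads for the first time. The separate, seeded generator rng and the counter tosses are assumptions made so that this illustration is repeatable and does not touch the global random state.

```python
import random

rng = random.Random(42)  # separate, seeded generator so the run is repeatable

tosses = 0
while True:
    tosses += 1
    if rng.random() < 0.5:  # treat values below 0.5 as "heads"
        break  # the event occurred: leave the otherwise infinite loop

print("first heads after", tosses, "toss(es)")
```

We cannot know in advance how many iterations are needed, which is exactly the situation in which a stopping criterion in the while statement's header is hard to formulate.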
Let's look at a first and naive implementation.
import random
random.seed(42)
while True:
    guess = input("Guess if the coin comes up as heads or tails: ")
    if random.random() < 0.5:
        if guess == "heads":
            print("Yes, it was heads")
            break
        else:
            print("Ooops, it was heads")
    else:
        if guess == "tails":
            print("Yes, it was tails")
            break
        else:
            print("Ooops, it was tails")
Ooops, it was tails
Yes, it was heads
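This version exhibits two severe issues that we should improve on:
1. If the user enters anything other than "heads" or "tails", for example, "Heads" or "Tails", the program keeps running without the user knowing about the mistake!
2. The code intermingles the coin toss with the processing of the user's input, which makes the program harder to read and to maintain.
Let's refactor the code and make it modular.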
First, we divide the business logic into two functions get_guess() and toss_coin() that are controlled from within a while-loop.
get_guess() not only reads in the user's input but also implements a simple input validation pattern: The .strip() and .lower() methods remove leading and trailing whitespace and convert the input to lower case, so that the user may type the answer with any capitalization (e.g., all upper or lower case). Also, get_guess() checks if the user entered one of the two valid options. If so, it returns either "heads" or "tails"; if not, it returns None.
def get_guess():
    """Process the user's input.

    Returns:
        guess (str / NoneType): either "heads" or "tails"
            if the input can be parsed and None otherwise
    """
    guess = input("Guess if the coin comes up as heads or tails: ")
    # handle frequent cases of "misspelled" user input
    guess = guess.strip().lower()
    if guess in ["heads", "tails"]:
        return guess
    return None
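Because get_guess() calls input(), its validation logic cannot be checked without someone typing. One possible refinement, sketched here under the assumption that we split off a helper named parse_guess() (a hypothetical name, not part of the chapter's code), is to keep the validation in a function that takes the raw text as an argument.

```python
def parse_guess(raw):
    """Normalize raw text to "heads" or "tails"; return None if invalid."""
    guess = raw.strip().lower()
    if guess in ["heads", "tails"]:
        return guess
    return None


print(parse_guess("  Heads "))  # heads
print(parse_guess("TAILS"))     # tails
print(parse_guess("head"))      # None
```

With such a helper, get_guess() would only have to pass the result of input() on to parse_guess().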
toss_coin() models a fair coin toss when called with default arguments.
def toss_coin(p_heads=0.5):
    """Simulate the tossing of a coin.

    Args:
        p_heads (optional, float): probability that the coin comes up "heads";
            defaults to 0.5 resembling a fair coin

    Returns:
        side_on_top (str): "heads" or "tails"
    """
    if random.random() < p_heads:
        return "heads"
    return "tails"
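A quick simulation can confirm that the function behaves as described. The sketch below repeats the function's logic so that it runs on its own, tosses the coin 10,000 times, and also checks the two boundary values of p_heads. The variable names share_heads, always_heads, and never_heads are chosen for this illustration.

```python
import random


def toss_coin(p_heads=0.5):
    """Same logic as above, repeated to keep this sketch self-contained."""
    if random.random() < p_heads:
        return "heads"
    return "tails"


random.seed(42)
tosses = [toss_coin() for _ in range(10_000)]
share_heads = tosses.count("heads") / len(tosses)
print(round(share_heads, 2))  # close to 0.5 for a fair coin

# boundary cases: random.random() returns values in [0.0, 1.0)
always_heads = all(toss_coin(p_heads=1.0) == "heads" for _ in range(100))
never_heads = all(toss_coin(p_heads=0.0) == "tails" for _ in range(100))
print(always_heads, never_heads)
```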
Second, we rewrite the if-else-logic to handle the case where get_guess() returns None explicitly: Whenever the user enters something invalid, a warning is shown, and another try is granted. We use the is operator and not the == operator as None is a singleton object.
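The difference between the two operators shows up as soon as an object customizes its notion of equality. The class AlwaysEqual below is a contrived, hypothetical example written only to make this point.

```python
class AlwaysEqual:
    """Hypothetical class whose instances claim to be equal to anything."""

    def __eq__(self, other):
        return True


obj = AlwaysEqual()

print(obj == None)  # True: == asks the object, which may answer anything
print(obj is None)  # False: is compares identity, and there is only one None
```

Since is cannot be overridden, a check with is None is always reliable.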
The while-loop takes on the role of glue code that manages how other parts of the program interact with each other.
random.seed(42)
while True:
    guess = get_guess()
    result = toss_coin()
    if guess is None:
        print("Make sure to enter your guess correctly!")
    elif guess == result:
        print("Yes, it was", result)
        break
    else:
        print("Ooops, it was", result)
Make sure to enter your guess correctly!
Yes, it was heads
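One benefit of keeping the loop as pure glue code is that it can be exercised without a keyboard. The sketch below wraps the loop in a hypothetical function play() that receives the two collaborators as arguments and then runs it with scripted stand-ins; the names play, guesses, results, and rounds are assumptions made for this illustration, and the use of iter(), next(), and lambda goes beyond what this chapter covers.

```python
def play(get_guess, toss_coin):
    """Run the guessing loop; return the number of rounds played."""
    rounds = 0
    while True:
        rounds += 1
        guess = get_guess()
        result = toss_coin()
        if guess is None:
            print("Make sure to enter your guess correctly!")
        elif guess == result:
            print("Yes, it was", result)
            break
        else:
            print("Ooops, it was", result)
    return rounds


# scripted stand-ins for the interactive functions
guesses = iter([None, "tails", "heads"])
results = iter(["tails", "heads", "heads"])

rounds = play(lambda: next(guesses), lambda: next(results))
print(rounds)  # 3
```

The first round is an invalid input, the second a wrong guess, and the third a correct one, so the loop ends after three rounds.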
Now, the program's business logic is expressed in a clearer way. More importantly, we can now change it more easily. For example, we could make the toss_coin() function base the tossing on a probability distribution other than the uniform (i.e., replace the random.random() function with another one). In general, a modular architecture makes software easier to maintain.
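As a sketch of such a change, the variant of toss_coin() below accepts the function that draws the random number as an extra argument. The draw parameter and the helper draw_triangular() are hypothetical additions; random.triangular() is part of the standard library.

```python
import random


def toss_coin(p_heads=0.5, draw=random.random):
    """Variant of toss_coin(); draw is a hypothetical extra parameter."""
    if draw() < p_heads:
        return "heads"
    return "tails"


def draw_triangular():
    """Draw from a triangular distribution on [0, 1] that peaks at 0.2."""
    return random.triangular(0.0, 1.0, 0.2)


random.seed(42)
fair = [toss_coin() for _ in range(10_000)]
skewed = [toss_coin(draw=draw_triangular) for _ in range(10_000)]

print(fair.count("heads") / len(fair))      # roughly one half
print(skewed.count("heads") / len(skewed))  # noticeably more than one half
```

The loop that calls toss_coin() does not have to change at all, which is the maintenance benefit of the modular design.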