Note: Click on "Kernel" > "Restart Kernel and Clear All Outputs" in JupyterLab before reading this notebook to reset its output. If you cannot run this file on your machine, you may want to open it in the cloud.
In this third part of the chapter, we first look at a major implication of the list type's mutability. Then, we see how its close relative, the tuple type, can mitigate this. Lastly, we see how Python's syntax assumes sequential data at various places: for example, when unpacking iterables during a for-loop or an assignment, or when working with function objects.
As list objects are mutable, the caller of a function can see the changes made to a list object passed to the function as an argument. That is often a surprising side effect and should be avoided.
As an example, consider the add_xyz() function.
letters = ["a", "b", "c"]
def add_xyz(arg):
    """Append letters to a list."""
    arg.extend(["x", "y", "z"])
    return arg
While this function is being executed, two variables, namely letters in the global scope and arg inside the function's local scope, reference the same list object in memory. Furthermore, the passed-in arg is also the return value.
So, after the function call, letters_with_xyz and letters are aliases as well, referencing the same object. We can also visualize that with PythonTutor.
letters_with_xyz = add_xyz(letters)
letters_with_xyz
['a', 'b', 'c', 'x', 'y', 'z']
letters
['a', 'b', 'c', 'x', 'y', 'z']
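The aliasing can also be confirmed without a visualization tool by comparing object identities with the is operator. The following is only a minimal sketch: it repeats the function under the hypothetical name add_xyz_inplace() so that it runs on its own.

```python
def add_xyz_inplace(arg):
    """Append letters to the list passed in and return that same list."""
    arg.extend(["x", "y", "z"])
    return arg


letters = ["a", "b", "c"]
letters_with_xyz = add_xyz_inplace(letters)

# Both names reference one and the same object in memory.
same_object = letters_with_xyz is letters
print(same_object)  # True
print(id(letters_with_xyz) == id(letters))  # True
```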
A better practice is to first create a copy of arg within the function that is then modified and returned. If we are sure that arg contains immutable elements only, we get away with a shallow copy. The downside of this approach is that it needs more memory, as the elements' references are stored twice.
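Why the elements' immutability matters for a shallow copy can be illustrated with a nested list. The sketch below assumes the copy module from the standard library; the names nested, shallow, and deep are made up for illustration.

```python
import copy

nested = [[1, 2], [3, 4]]

shallow = nested[:]            # new outer list, same inner lists
deep = copy.deepcopy(nested)   # new outer list and new inner lists

nested[0].append(99)           # mutate an inner list in place

print(shallow[0])  # [1, 2, 99]: the shallow copy sees the change
print(deep[0])     # [1, 2]: the deep copy does not
```

With immutable elements only, such as str or int objects, this situation cannot arise, which is why a shallow copy suffices there.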
The revised add_xyz() function below is more natural to reason about as it does not modify the passed-in arg internally. PythonTutor shows that as well. This approach follows the functional programming paradigm that is currently going through a "renaissance." Two essential characteristics of functional programming are that a function never changes its inputs and always returns the same output given the same inputs.
For a beginner, it is probably better to stick to this idea and not change any arguments the way the original add_xyz() above does. However, functions that modify and return the argument passed in are an important aspect of object-oriented programming, as explained in Chapter 11.
letters = ["a", "b", "c"]
def add_xyz(arg):
    """Create a new list from an existing one."""
    new_arg = arg[:]
    new_arg.extend(["x", "y", "z"])
    return new_arg
letters_with_xyz = add_xyz(letters)
letters_with_xyz
['a', 'b', 'c', 'x', 'y', 'z']
letters
['a', 'b', 'c']
If we want to modify the argument passed in, it is best to return None and not arg, as does the final version of add_xyz() below. Then, the user of our function cannot accidentally create two aliases to the same object. That is also why most list methods that modify the object in place, such as .append(), .extend(), or .sort(), return None. PythonTutor shows how there is only one reference to letters after the function call.
letters = ["a", "b", "c"]
def add_xyz(arg):
    """Append letters to a list."""
    arg.extend(["x", "y", "z"])
    return  # None
add_xyz(letters)
letters
['a', 'b', 'c', 'x', 'y', 'z']
If we call add_xyz() with letters as the argument again, we end up with an even longer list object.
add_xyz(letters)
letters
['a', 'b', 'c', 'x', 'y', 'z', 'x', 'y', 'z']
Functions that only work on the argument passed in are called modifiers. Their primary purpose is to change the state of the argument. On the contrary, functions that have no side effects on the arguments are said to be pure.
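Python's own tools illustrate the distinction: the sorted() built-in behaves like a pure function, whereas the .sort() method on list objects is a modifier. The sketch below uses made-up variable names.

```python
numbers_list = [3, 1, 2]

# sorted() behaves like a pure function: it returns a new list.
ordered = sorted(numbers_list)
unchanged = numbers_list[:]  # snapshot of the original: still [3, 1, 2]

# .sort() is a modifier: it sorts in place and returns None.
result = numbers_list.sort()

print(ordered)       # [1, 2, 3]
print(unchanged)     # [3, 1, 2]
print(result)        # None
print(numbers_list)  # [1, 2, 3]
```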
The tuple Type

To create a tuple object, we can use the same literal notation as for list objects, just without the brackets: We list all elements separated by commas.
numbers = 7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4
numbers
(7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4)
However, to be clearer, many Pythonistas write out the optional parentheses ( and ).
numbers = (7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4)
numbers
(7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4)
As before, numbers is an object on its own.
id(numbers)
140248673535456
type(numbers)
tuple
While we could use empty parentheses () to create an empty tuple object ...
empty_tuple = ()
empty_tuple
()
type(empty_tuple)
tuple
... we must use a trailing comma to create a tuple object holding one element. If we forget the comma, the parentheses are interpreted as the grouping operator and are effectively useless!
one_tuple = (1,)  # we could omit the parentheses but not the comma
one_tuple
(1,)
type(one_tuple)
tuple
no_tuple = (1)
no_tuple
1
type(no_tuple)
int
Alternatively, we may use the tuple() built-in that takes any iterable as its argument and creates a new tuple from its elements.
tuple([1])
(1,)
tuple("iterable")
('i', 't', 'e', 'r', 'a', 'b', 'l', 'e')
Most operations involving tuple objects work in the same way as with list objects. The main difference is that tuple objects are immutable. So, if our program does not depend on mutability, we may and should use tuple and not list objects to model sequential data. That way, we avoid the pitfalls seen above.
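To see how a tuple object guards against the accidental side effect from the first section, consider the sketch below. The names add_xyz_modifier() and add_xyz_pure() are hypothetical variants written for this illustration only.

```python
def add_xyz_modifier(arg):
    """Modifier: change the argument in place and return None."""
    arg.extend(["x", "y", "z"])


def add_xyz_pure(arg):
    """Pure: build and return a new tuple, leaving arg untouched."""
    return tuple(arg) + ("x", "y", "z")


letters = ("a", "b", "c")

try:
    add_xyz_modifier(letters)
except AttributeError as error:
    message = str(error)  # tuple objects have no .extend() method
    print(message)

letters_with_xyz = add_xyz_pure(letters)
print(letters)           # ('a', 'b', 'c')
print(letters_with_xyz)  # ('a', 'b', 'c', 'x', 'y', 'z')
```

The modifier fails loudly instead of silently changing the caller's data, while the pure variant works with tuple and list arguments alike.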
tuple objects are sequences exhibiting the familiar four behaviors. So, numbers holds a finite number of elements ...
len(numbers)
12
... that we can obtain individually by looping over it in a predictable forward or reverse order.
for number in numbers:
    print(number, end=" ")
7 11 8 5 3 12 2 6 9 10 1 4
for number in reversed(numbers):
    print(number, end=" ")
4 1 10 9 6 2 12 3 5 8 11 7
To check if a given object is contained in numbers, we use the in operator and conduct a linear search.
0 in numbers
False
1 in numbers
True
1.0 in numbers # in relies on == behind the scenes
True
We may index and slice with the [] operator. The latter returns new tuple objects.
numbers[0]
7
numbers[-1]
4
numbers[6:]
(2, 6, 9, 10, 1, 4)
Index assignment does not work as tuples are immutable and results in a TypeError.
numbers[-1] = 99
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[43], line 1
----> 1 numbers[-1] = 99

TypeError: 'tuple' object does not support item assignment
The + and * operators work with tuple objects as well: They always create new tuple objects.
numbers + (99,)
(7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4, 99)
2 * numbers
(7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4, 7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4)
Being immutable, tuple objects only provide the .count() and .index() methods of Sequence types. The .append(), .extend(), .insert(), .reverse(), .pop(), and .remove() methods of MutableSequence types are not available. The same holds for the list-specific .sort(), .copy(), and .clear() methods.
numbers.count(0)
0
numbers.index(1)
10
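The missing methods can also be checked programmatically. The sketch below assumes that Sequence and MutableSequence refer to the abstract base classes in the collections.abc module of the standard library.

```python
from collections.abc import MutableSequence, Sequence

numbers = (7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4)

tuple_is_sequence = isinstance(numbers, Sequence)             # True
tuple_is_mutable = isinstance(numbers, MutableSequence)       # False
list_is_mutable = isinstance(list(numbers), MutableSequence)  # True

# Mutating methods are simply missing on tuple objects.
has_append = hasattr(numbers, "append")  # False
has_count = hasattr(numbers, "count")    # True

print(tuple_is_sequence, tuple_is_mutable, list_is_mutable)
print(has_append, has_count)
```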
The relational operators work in the same way as for list objects.
numbers
(7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4)
numbers == (7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4)
True
numbers != (99, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4)
True
numbers < (99, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4)
True
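As with list objects, the comparison proceeds element by element from the left, and the first differing pair decides. A few further cases, sketched with made-up variable names:

```python
numbers = (7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4)

# The first differing pair of elements decides: 7 < 99.
first_differs = numbers < (99,)         # True

# A tuple that is a proper prefix of another one is "smaller."
prefix_is_smaller = (7, 11) < numbers   # True

# A tuple never equals a list, even with equal elements.
equals_list = numbers == list(numbers)  # False

print(first_differs, prefix_is_smaller, equals_list)
```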
While tuple objects are immutable, this only relates to the references they hold. If a tuple object contains references to mutable objects, the entire nested structure is not immutable as a whole!
Consider the following stylized example not_immutable: It contains three elements, 1, [2, ..., 11], and 12, and the elements of the nested list object may be changed. While it is not practical to mix data types in a tuple object that is used as an "immutable list," we want to make the point that the mere usage of the tuple type does not guarantee a nested object to be immutable as a whole.
not_immutable = (1, [2, 3, 4, 5, 6, 7, 8, 9, 10, 11], 12)
not_immutable
(1, [2, 3, 4, 5, 6, 7, 8, 9, 10, 11], 12)
not_immutable[1][:] = [99, 99, 99]
not_immutable
(1, [99, 99, 99], 12)
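That the immutability concerns only the references can be made explicit by comparing identities before and after a mutation. The following sketch uses a shortened version of the example and hypothetical helper names.

```python
not_immutable = (1, [2, 3], 4)
inner_id_before = id(not_immutable[1])

not_immutable[1].append(99)  # mutating the nested list works

inner_id_after = id(not_immutable[1])
same_reference = inner_id_before == inner_id_after  # True

try:
    not_immutable[1] = [2, 3]  # replacing the reference does not work
except TypeError as error:
    message = str(error)
    print(message)

print(not_immutable)  # (1, [2, 3, 99], 4)
```

The tuple object still holds the very same three references; only the object behind the second one has changed.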
In the "List Operations" section in the second part of this chapter, the * symbol unpacks the elements of a list object into another one. This idea of iterable unpacking is built into Python at various places, even without the * symbol.
For example, we may write variables on the left-hand side of a = statement in a literal tuple style. Then, any finite iterable on the right-hand side is unpacked. So, numbers is unpacked into twelve variables below.
n1, n2, n3, n4, n5, n6, n7, n8, n9, n10, n11, n12 = numbers
n1
7
n2
11
n3
8
Having to type twelve variables on the left is already tedious. Furthermore, if the iterable on the right yields a number of elements different from the number of variables, we get a ValueError.
n1, n2, n3, n4, n5, n6, n7, n8, n9, n10, n11 = numbers
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[60], line 1
----> 1 n1, n2, n3, n4, n5, n6, n7, n8, n9, n10, n11 = numbers

ValueError: too many values to unpack (expected 11)
n1, n2, n3, n4, n5, n6, n7, n8, n9, n10, n11, n12, n13 = numbers
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[61], line 1
----> 1 n1, n2, n3, n4, n5, n6, n7, n8, n9, n10, n11, n12, n13 = numbers

ValueError: not enough values to unpack (expected 13, got 12)
So, to make iterable unpacking useful, we prepend the * symbol to one of the variables on the left: That variable then becomes a list object holding the elements not captured by the other variables. We say that the excess elements from the iterable are packed into this variable.
For example, let's get the first and last element of numbers and collect the rest in middle.
first, *middle, last = numbers
first
7
middle # always a list!
[11, 8, 5, 3, 12, 2, 6, 9, 10, 1]
last
4
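The starred variable does not have to be in the middle, and it may also end up empty. The following sketch shows these cases with made-up variable names.

```python
numbers = (7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4)

head, *tail = numbers   # the starred name may come last ...
*init, final = numbers  # ... or first

# With no excess elements, the starred name is an empty list.
left, *nothing, right = (1, 2)

print(head, tail)        # 7 [11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]
print(init[-1], final)   # 1 4
print(nothing)           # []
```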
We already used unpacking before this section without knowing it. Whenever we write a for-loop over the zip() built-in, that generates a new tuple object in each iteration that we unpack by listing several loop variables.
So, the name, position below acts like a left-hand side of an = statement and unpacks the tuple objects generated from "zipping" the names list and the positions tuple together.
names = ["Berthold", "Oliver", "Carl"]
positions = ("goalkeeper", "defender", "midfielder", "striker", "coach")
for name, position in zip(names, positions):
    print(name, "is a", position)
Berthold is a goalkeeper
Oliver is a defender
Carl is a midfielder
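Note that names holds three elements while positions holds five: zip() stops as soon as the shortest iterable is exhausted. If that is not wanted, one option is zip_longest() from the itertools module in the standard library, as the sketch below shows; the fill value "n/a" is an arbitrary choice made for this illustration.

```python
from itertools import zip_longest

names = ["Berthold", "Oliver", "Carl"]
positions = ("goalkeeper", "defender", "midfielder", "striker", "coach")

pairs = list(zip(names, positions))
print(len(pairs))  # 3: "striker" and "coach" are dropped silently

# zip_longest() pads the shorter iterable with a fill value instead.
padded = list(zip_longest(names, positions, fillvalue="n/a"))
print(len(padded))  # 5
print(padded[-1])   # ('n/a', 'coach')
```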
Without unpacking, zip() generates a series of tuple objects.
for pair in zip(names, positions):
    print(type(pair), pair, sep=" ")
<class 'tuple'> ('Berthold', 'goalkeeper')
<class 'tuple'> ('Oliver', 'defender')
<class 'tuple'> ('Carl', 'midfielder')
Unpacking also works for nested objects. Below, we wrap zip() with the enumerate() built-in to have an index variable number inside the for-loop. In each iteration, a tuple object consisting of number and another tuple object is created. The inner one then holds the name and position.
for number, (name, position) in enumerate(zip(names, positions), start=1):
    print(f"{name} (jersey #{number}) is a {position}")
Berthold (jersey #1) is a goalkeeper
Oliver (jersey #2) is a defender
Carl (jersey #3) is a midfielder
A popular use case of unpacking is swapping two variables.
Consider a and b below.
a = 0
b = 1
Without unpacking, we must use a temporary variable temp to swap a and b.
temp = a
a = b
b = temp
del temp
a
1
b
0
With unpacking, the solution is more elegant. All expressions on the right-hand side are evaluated before any assignment takes place.
a, b = 0, 1
a, b = b, a
a, b
(1, 0)
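Because the right-hand side is evaluated in full first, the same idea extends beyond two variables. As a small sketch, three variables can be rotated in one statement:

```python
a, b, c = 1, 2, 3

# The right-hand side is packed into the tuple (2, 3, 1) first
# and only then unpacked into the names on the left.
a, b, c = b, c, a

print(a, b, c)  # 2 3 1
```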
Unpacking allows us to rewrite the iterative fibonacci() function from Chapter 4 in a concise way.
def fibonacci(i):
    """Calculate the ith Fibonacci number.

    Args:
        i (int): index of the Fibonacci number to calculate

    Returns:
        ith_fibonacci (int)
    """
    a, b = 0, 1
    for _ in range(i - 1):
        a, b = b, a + b

    return b
fibonacci(12)
144
The concepts of packing and unpacking are also helpful when writing and using functions.
For example, let's look at the product() function below. Its implementation suggests that args must be a sequence type. Otherwise, it would not make sense to index into it with [0] or take a slice with [1:]. In line with the function's name, the for-loop multiplies all elements of the args sequence. So, what does the * do in the header line, and what is the exact data type of args?
The * is again not an operator in this context but a special syntax that makes Python pack all positional arguments passed to product() into a single tuple object called args.
def product(*args):
    """Multiply all arguments."""
    result = args[0]

    for arg in args[1:]:
        result *= arg

    return result
So, as far as the header line is concerned, we can pass an arbitrary number of positional arguments, including none at all, to product().
The product of just one number is the number itself.
product(42)
42
Passing in several numbers works as expected.
product(2, 5, 10)
100
However, this implementation of product() needs at least one argument passed in due to the expression args[0] used internally. Otherwise, we see a runtime error, namely an IndexError. We emphasize that this error is not caused in the header line.
product()
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[83], line 1
----> 1 product()

Cell In[80], line 3, in product(*args)
      1 def product(*args):
      2     """Multiply all arguments."""
----> 3     result = args[0]
      5     for arg in args[1:]:
      6         result *= arg

IndexError: tuple index out of range
Another downside of this implementation is that we can easily generate semantic errors: For example, if we pass in an iterable object like the one_hundred list, no exception is raised. However, the return value is also not a numeric object as we expect. The reason for this is that during the function call, args becomes a tuple object holding one element, which is one_hundred, a list object. So, we created a nested structure by accident.
one_hundred = [2, 5, 10]
product(one_hundred) # a semantic error!
[2, 5, 10]
This error does not occur if we unpack one_hundred upon passing it as the argument.
product(*one_hundred)
100
That is the equivalent of writing out the following tedious expression. Yet, that does not scale for iterables with many elements in them.
product(one_hundred[0], one_hundred[1], one_hundred[2])
100
In the "Packing & Unpacking with Functions" exercise, we look at product() in more detail.
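Coming back to the IndexError raised by product() without any arguments: One possible way around it, sketched here under the hypothetical name product_or_one(), is to start from 1, the neutral element of multiplication. Returning 1 for an empty call is a design choice of this sketch and not necessarily the approach taken in the exercise.

```python
def product_or_one(*args):
    """Multiply all arguments; return 1 if there are none."""
    result = 1  # neutral element of multiplication

    for arg in args:  # args is a tuple and may be empty
        result *= arg

    return result


print(product_or_one())          # 1
print(product_or_one(42))        # 42
print(product_or_one(2, 5, 10))  # 100
```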
While we needed to unpack one_hundred above to avoid the semantic error, unpacking an argument in a function call may also be a convenience in general. For example, to print the elements of one_hundred in one line, we have needed a for statement so far. With unpacking, we get away without a loop.
print(one_hundred)  # prints the list; we do not want that
[2, 5, 10]
for number in one_hundred:
    print(number, end=" ")
2 5 10
print(*one_hundred) # replaces the for-loop
2 5 10
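As the unpacked elements arrive in print() as separate positional arguments, its sep keyword argument controls what is put between them. A short sketch:

```python
one_hundred = [2, 5, 10]

print(*one_hundred)             # 2 5 10
print(*one_hundred, sep=", ")   # 2, 5, 10
print(*one_hundred, sep=" * ")  # 2 * 5 * 10
```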