We talked about binary numbers (with some theory) and base conversions and implemented an algorithm for base conversion (from decimal to 2<=b<=36). We examined Python's memory model and analyzed the efficiency of constructing a list using + or += operators.
Before you click "Visualize Execution" button, you may want to use the following settings (can be adjusted via the drop boxes next to the textbox):
When we count natural numbers, we use the decimal system. That is, we represent numbers in base 10, using 10 digits (0-9). Binary representation (base 2) uses two possible digits: 0-1. In general, a number in base $b$ will be represented using $b$ possible digits.
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
Python has some built-in conversion methods from-and-to decimal, binary and other bases:
bin(10)
type(bin(10))
hex(161)
int("345")
int("1101", 2)
int("a1", 16)
int("a1", 2)
Let $x_{[2]} = a_{n-1} ... a_1 a_0$ be the binary representation of a number with $n$ binary digits. The following formula returns its decimal representation: $$x_{[10]} = \sum_{0 \le k \le n-1} a_k \cdot 2^k$$
Similarly, if $x_{[b]} = a_{n-1} ... a_0$ is represented in an arbitrary base with $b$ digits then: $$x_{[10]} = \sum_{0 \le k \le n-1} a_k \cdot b^k$$
Converting from decimal to binary is done by integer division and modulo operations. Here is an implementation of the algorithm for conversion from decimal to a general base $b$.
def convert_base(n,b):
'''convert_base(int, int)->string
Returns the textual representation
of n(decimal) in base 2<=b<=10
'''
result = ""
while n != 0:
digit = n % b
n //= b
result = str(digit) + result
return result
convert_base(10,2)
convert_base(0, 2)
Is the code above correct? What happens in the case where $n = 0$?
def convert_base(n,b):
'''convert_base(int, int)->string
Returns the textual representation
of n(decimal) in base 2<=b<=10
'''
if n == 0:
return "0"
result = ""
while n != 0:
digit = n % b
n //= b
result = str(digit) + result
return result
convert_base(0,2)
convert_base(12,3)
convert_base(161,16)
The improved version above fails for $10<b$ (why ?) This issue is fixed in the final version below (improvement #2) (what happens when $b = 1$?)
alphabet = "0123456789abcdefghijklmnopqrstuvwxyz"
def convert_base(n,b):
'''convert_base(int, int)->string
Returns the textual representation
of n(decimal) in base 2<=b<=10
'''
assert 2 <= b <= 36
if n == 0:
return "0"
result = ""
alphabet = "0123456789abcdefghijklmnopqrstuvwxyz"
while n != 0:
## Complete
return result
convert_base(161, 16)
convert_base(10, 2)
convert_base(10, 80)
Understanding Python's memory model means understanding the way Python stores our variables in the memory and the way Python manipulates these variables when we perform operations on them.
Memory management depends on various properties, including:
Understanding this model will allow us to write better and more efficient code.
def change_num(num):
num = 999
x = 30
change_num(x)
x
def add_to_list(lst):
lst.append(999)
mylst = [1,2,3]
add_to_list(mylst)
mylst
def change_lst(lst):
lst = []
mylst = [1,2,3]
change_lst(mylst)
mylst
#1 basics #####################################################
a = 1000 #1
b = "hello" #2
a += len(b) #3
c = 2*b[:2] #4
if b!=c: #5
c = b #6
del c #7
#2 lists #####################################################
a = 1000 #1
d = [a,2] #2
d[1] = -1 #3
a = 1003 #4
for x in d: #5a
x = 7 #5b
#3 funcs #####################################################
a = 1000 #1
b = "hello" #2
def is_palindrom(a): #3a
b = a[::-1] #3b
return a==b #3c
is_pal = is_palindrom #4
x = is_pal(b) #5
#4 lists+funcs #####################################################
a = 1000 #1
d = [a,2] #2
def f(a,d): #3a
a = 2000 #3b
d[0] = a #3c
d = [] #3d
return d #3e
x = f(a,d) #4
Extend(), append(), sort(), and sorted():
#5 – lists operators and methods #####################################################
lst = [3,2,1]
id(lst)
lst
lst = lst+[0]
id(lst)
lst
lst.append(4)
id(lst)
lst
lst.insert(1,5)
id(lst)
lst
lst.extend([6,7])
id(lst)
lst
lst += [8]
id(lst)
lst
lst
id(lst)
#sorted returns a sorted copy of the list, and does not change the original one
lst2 = sorted(lst)
id(lst)
lst
#list.sort() sorts in place, and does not return anything
lst.sort()
id(lst)
lst
In what follows we have two different implementations of the same operation: given an input $n$ we construct a list of the range $[0,...,n-1]$. The difference between the implementation is that in build_range1 we append items to our list using the command $lst = lst + [i]$ and in build_range2 we do this via the command $lst += [i]$, which, as we recall, invokes $lst.extend([i])$.
import time
def build_range1(n):
lst = []
t0 = time.perf_counter()
for i in range(n):
lst = lst + [i]
t1 = time.perf_counter()
print("Extending a list n=", n, "times, using the + operator took",t1-t0, "seconds")
def build_range2(n):
lst = []
t0 = time.perf_counter()
for i in range(n):
lst += [i]
t1 = time.perf_counter()
print("Extending a list n=", n, "times, using the += operator (extend()) took",t1-t0, "seconds")
build_range1(20000)
build_range2(20000)
Empirically, calling $lst = lst + [i]$ is less efficient. Why is that?
Since a new list is constructed every iteration the number of integers written to memory equals $1+2+ ... + n = \frac{n \cdot (n+1)}{2} \approx \frac{n^2}{2}$. Asymptotically speaking, the number of copy operations grows quadratically with $n$.
Using invoke, the same list is extended again and again. The number of integers written to memory equals $n$. Asymptotically speaking, the number of copy operations grows linearly with $n$.
What do we expect will happen if we double the input size $n=40000$?
build_range1(40000)
build_range2(40000)
In accordance with the theory, the running time of the inefficient code increases by approximately $2^2 = 4$ as $n$ increased by a factor of $2$, while the running time of the efficient code increases by a factor of approximately $2$.