We continued discussing recursion, and introduced and demonstrated memoization.

- Memoization is mainly a technical device. Remember the main steps: define an envelope function, decide which keys describe the input, and change your recursive implementation so that it searches for the key in the dictionary before making the recursive calls, and saves the key:value pair after obtaining the value for a given input.
- The keys of the dictionary should be chosen to represent the current input to the function in a one-to-one fashion.
- When analyzing the time complexity of a recursive function with memoization, consider the recursion tree and remember that a node that has already been computed will not have a subtree.
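As a minimal sketch of these steps (using the Fibonacci numbers as an example, which is not one of this lesson's problems):

```
def fib(n):
    # Envelope function: creates the dictionary and launches the recursion
    d = {}
    return fib_mem(n, d)

def fib_mem(n, mem):
    if n <= 1:  # base cases need not be stored
        return n
    if n not in mem:  # search for the key before recursing
        # save the key:value pair after obtaining the value
        mem[n] = fib_mem(n - 1, mem) + fib_mem(n - 2, mem)
    return mem[n]

print(fib(100))
```

Here $n$ itself describes the input in a one-to-one fashion, so it serves directly as the key.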

In [1]:

```
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
```

A bus driver needs to give exact change, and she has coins of a limited set of types, with an unlimited supply of each type. Given the amount of change ($amount$) and the coin types (the list $coins$), how many ways are there to return the change?

- There are two base cases:
  - If $amount == 0$ then there is one way of returning change - not giving any coins (return 1)
  - If $amount < 0$, or $coins = []$ while $amount > 0$, there are no ways of returning change (return 0)
- Otherwise, the logic is very similar to that of the subset sum problem. We pick one coin from the list (say, the last one this time), and we again have two options:
  - Either we use the last coin and then check how many ways we have of returning the rest of the amount
  - Or we don't use the last coin at all and check how many ways we have of returning the amount with the remaining coin types
- Summing the two options above gives the total number of ways to return the change

In [4]:

```
def change(amount, coins):
    if amount == 0:
        return 1
    elif amount < 0 or coins == []:
        return 0
    return change(amount, coins[:-1]) + \
           change(amount - coins[-1], coins)

change(5, [1,2,3])
```

Out[4]:

```
5
```

Why does this count each solution exactly once? For example, why is [2,2,1] counted once and not again as [1,2,2]? The reason is that the recursion never considers the order of the coins: at every step we either drop the last coin type entirely or commit to one more use of it, so the two branches count disjoint sets of solutions, and each multiset of coins is generated exactly once.
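One concrete way to see this is to modify the function so that it returns the solutions themselves rather than counting them (change_list is a name introduced here for illustration):

```
def change_list(amount, coins):
    # Hypothetical variant of change() that returns the solutions themselves
    if amount == 0:
        return [[]]  # one solution: give no coins
    elif amount < 0 or coins == []:
        return []    # no solutions
    # solutions that never use the last coin type...
    without_last = change_list(amount, coins[:-1])
    # ...plus solutions that use it at least once more
    with_last = [sol + [coins[-1]]
                 for sol in change_list(amount - coins[-1], coins)]
    return without_last + with_last

print(change_list(5, [1, 2, 3]))
```

The printout contains five lists, and no two of them are permutations of each other.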

Consider the case where $amount = n^2$ and $coins = [1,2,\ldots, n]$. When we call $change(amount, coins)$, the first level of the recursion calls $change(n^2, [1,\ldots, n-1])$ and $change(n^2 - n, [1,\ldots, n])$. This means that in the recursive calls the list size is at least $n-1$ and the amount is at least $n^2 - n$.

One can show (using induction) that in the first $k \leq n$ levels of recursion the list size is at least $n - k$ and $amount \geq n^2-nk$, thus there are two recursive calls at each of these levels.

This gives a running time of at least $2^n$ by the same argument as that we applied in the subset sum problem.
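This growth can also be observed empirically by counting the recursive calls of the naive version (change_calls is a helper written here for this purpose, not part of the lecture code):

```
def change_calls(amount, coins):
    # Run the naive change() recursion and count how many calls it makes
    calls = 0
    def rec(amount, coins):
        nonlocal calls
        calls += 1
        if amount == 0:
            return 1
        elif amount < 0 or coins == []:
            return 0
        return rec(amount, coins[:-1]) + rec(amount - coins[-1], coins)
    rec(amount, coins)
    return calls

# amount = n^2, coins = [1, ..., n], as in the analysis above
for n in range(2, 6):
    print(n, change_calls(n * n, list(range(1, n + 1))))
```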

It is interesting to note that while the two problems share some similarities, one major difference is that in the subset sum problem we are asking whether a **solution exists** while in the change problem we are trying to **count** the number of valid solutions to a problem.

This is a recurring theme in CS that we will encounter many times in the future.
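For comparison, here is a sketch of the subset sum decision recursion in the same style (the exact code from the earlier lesson may differ):

```
def subset_sum(L, s):
    # Decision version: does some subset of L sum to s?
    if s == 0:
        return True  # the empty subset works
    elif s < 0 or L == []:
        return False
    # Either skip the last element or use it - but note that, unlike in
    # change(), both branches drop it, since each element may be used once
    return subset_sum(L[:-1], s) or subset_sum(L[:-1], s - L[-1])

print(subset_sum([1, 2, 3], 5))
```

The structural difference mirrors the two questions: subset sum returns a boolean, while change returns a count.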

Question 2(a) from the 2015 fall semester exam (Moed B).

Given a list $L$ of non-negative integers with $len(L) = d$ we consider $L$ as a point in a $d$-dimensional space.

For example: $L = [0, 0, 0]$ represents the origin in a $3$-dimensional space.

Our goal is to find how many ways we have of getting from $[0, \ldots, 0]$ to $L$ by advancing **only forward**.

For example, if $L=[1, 2]$ then we have three such paths:

- $[0,0] \to [1, 0] \to [1, 1] \to [1,2]$
- $[0,0] \to [0, 1] \to [1, 1] \to [1,2]$
- $[0,0] \to [0, 1] \to [0, 2] \to [1,2]$

Again, we first think of the base case, and then reduce big problems to smaller ones.

- If $L$ has only zeros then there is a single possible path - not taking any steps.
- Otherwise, we have the following relation, let $L = [a_1, \ldots, a_n]$, then: $$paths([a_1, \ldots, a_n]) = \sum_{i : a_i > 0}paths([a_1,\ldots, a_i - 1, \ldots, a_n])$$

This gives rise to a simple recursive algorithm:

In [1]:

```
def cnt_paths(L):
    if all_zeros(L):
        return 1
    result = 0
    for i in range(len(L)):
        if L[i] != 0:
            L[i] -= 1
            result += cnt_paths(L)
            L[i] += 1
    return result

def all_zeros(L):
    for i in L:
        if i != 0:
            return False
    return True

print(cnt_paths([3,4,5]))
```

Note that the leaves of the recursion tree correspond exactly to the "legal paths" which we count.

Let $cnt$ be the returned value of $cnt\_paths(L)$. Since we increment $cnt$ by $1$ for each legal path, this means that the running time is **at least $cnt$**.

Using the following combinatorial analysis, which is **for reference only**, we show that
$cnt$ is exponential in $d,n$, implying that the running time of $cnt\_paths(L)$ is at least exponential in $d,n$.
Can we do better than the recursive solution in terms of running time? ...yes!

Can you think of a combinatorial solution for cnt_paths? Let's take a simple case where $L = [n, n, \ldots, n]$ and $|L| = d$, i.e., we are in a $d$-dimensional space and we want to get to $[n, \ldots, n]$.

In each step we subtract $1$ from one of the $d$ coordinates (which is currently positive) and in exactly $nd$ steps we need to get to the all-zero vector.

Think about the first coordinate - there are precisely $n$ steps along our path where we change this coordinate, thus we have $\binom{nd}{n}$ options to choose where the moves for the first coordinate are located.

What about the second coordinate? Well, now we are left with $nd - n = n(d-1)$ places, out of which we again pick $n$ places to advance the second coordinate. By induction, we get: $$cnt = \prod_{i=1}^d \tbinom{n(d-i+1)}{n} = \prod_{i=1}^d \tbinom{ni}{n} $$

How long does it take to compute this number? We need to multiply $d$ terms, and computing each of them requires factorials of numbers no larger than $nd$ and a few divisions. This is clearly done in time polynomial in $d,n$.
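A sketch of this computation (cnt_paths_closed is a name introduced here; math.comb computes binomial coefficients directly):

```
from math import comb, factorial

def cnt_paths_closed(n, d):
    # Closed-form count for L = [n, n, ..., n] of length d
    result = 1
    for i in range(1, d + 1):
        result *= comb(n * i, n)  # choose the n moves of coordinate i
    return result

print(cnt_paths_closed(2, 3))  # 90
```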

And how big is $cnt$ exactly? Recall that $cnt$ is a lower bound on the running time of cnt_paths.

We now claim that $cnt = exp(n,d)$. To see this, we write it explicitly:

$$cnt = \tbinom{n}{n}\cdot \tbinom{2n}{n} \cdots \tbinom{dn}{n} = \frac{n!}{(n-n)!\, n!} \cdot \frac{(2n)!}{(2n-n)!\, n!} \cdot \frac{(3n)!}{(3n-n)!\, n!}\cdots \frac{(nd)!}{(nd-n)!\, n!} $$

This product has a telescopic property, so we can cancel out elements and get:

$$cnt = \frac{(nd)!}{(n!)^d} = \frac{1\cdot 2 \cdot 3 \cdots nd}{(1 \cdot 2 \cdots n) \cdots (1 \cdot 2 \cdots n)} $$

Break this product into two terms:

$$cnt = \frac{1 \cdots 2n}{n! \cdot n!} \cdot \frac{(2n + 1) \cdots nd }{(n!)^{d - 2}}$$

The first multiplicand is clearly larger than $1$. As for the second, each term in the numerator is **at least** $2n$ and each term in the denominator is **at most** $n$, thus clearly:

$$cnt \gg \frac{2n \cdot 2n \cdots 2n}{n \cdot n \cdots n} = 2^{dn - 2n} = exp(n,d)$$
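A quick numeric check of this lower bound for small values of $n$ and $d$ (written here as a sanity check, not part of the original analysis):

```
from math import factorial

# cnt = (nd)! / (n!)^d should dominate 2^(dn - 2n)
for n in range(2, 5):
    for d in range(2, 5):
        cnt = factorial(n * d) // factorial(n) ** d
        assert cnt >= 2 ** (d * n - 2 * n)
        print(n, d, cnt, 2 ** (d * n - 2 * n))
```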

Conclusion? Using the combinatorial computation we reduce a running time which is exponential in $n,d$ into a running time polynomial in $n,d$.


The number of sets of size $k$ selected from a set of $n$ elements (with no repetitions) is the binomial coefficient $\binom{n}{k}$. Recursive formula (Pascal): $\binom{n}{k} = \binom{n-1}{k} + \binom{n-1}{k-1}$, where the stopping criterion is $\binom{n}{0} = \binom{n}{n} = 1$.

The time complexity of binom is exponential in $n$ (the worst-case behaviour is when $k=\frac{n}{2}$).

In [2]:

```
def binom(n,k):
    if n < 0 or k < 0 or n < k:
        return 0
    elif k == 0 or n == k:
        return 1
    return binom(n-1,k-1) + binom(n-1,k)

binom(4,2)
```

Out[2]:

```
6
```

In [5]:

```
def binom_trace(n,k,indent=1):
    # indent = how much to indent the printouts;
    # its default value means no wrapper function is needed
    if k < 0 or n < 0 or n < k:  # safety checks
        return 0
    elif k == 0 or k == n:  # halting conditions
        print(">>"*indent + "({},{})".format(n,k))
        print("<<"*indent + "({},{})".format(n,k))
        return 1
    print(">>"*indent + "({},{})".format(n,k))
    indent += 1
    val = binom_trace(n-1,k,indent) + binom_trace(n-1,k-1,indent)
    indent -= 1
    print("<<"*indent + "({},{})".format(n,k))
    return val

binom_trace(4,2)
```

Out[5]:

Now with memoization:

In [6]:

```
def binom_fast(n,k):
    d = {}
    return binom_mem(n,k,d)

def binom_mem(n,k,mem):
    if n < 0 or k < 0 or n < k:
        return 0
    elif k == 0 or n == k:
        return 1
    if (n,k) not in mem:
        mem[(n,k)] = binom_mem(n-1,k, mem) + \
                     binom_mem(n-1,k-1, mem)
    return mem[(n,k)]

binom_fast(4,2)
binom_fast(50,25)
```

Out[6]:

```
6
```

Out[6]:

```
126410606437752
```

Printing the recursive calls, with memoization:

In [6]:

```
def binom_fast_trace(n,k):
    mem = dict()
    result = binom_mem_trace(n,k,mem)
    return result

def binom_mem_trace(n,k,mem,indent=1):
    # indent = how much to indent the printouts
    if k < 0 or n < 0 or n < k:  # safety checks
        return 0
    elif k == 0 or k == n:  # halting conditions
        print(">>"*indent + "({},{})".format(n,k))
        print("<<"*indent + "({},{})".format(n,k))
        return 1
    print(">>"*indent + "({},{})".format(n,k))
    indent += 1
    if (n,k) not in mem:
        mem[(n,k)] = binom_mem_trace(n-1,k,mem,indent) + binom_mem_trace(n-1,k-1,mem,indent)
    indent -= 1
    print("<<"*indent + "({},{})".format(n,k))
    return mem[(n,k)]

binom_fast_trace(4,2)
```

Out[6]:

To analyze the time complexity of the function, we will construct an $ (n+1) \times (k+1)$ table, where the cell in position $(i,j)$ will denote a call to compute $\binom{i}{j}$.

In this method, the running time can be computed by $$\text{number of visited cells} \times \text{number of visits per cell} \times \text{time per cell (without recursive calls)}$$

Consider a cell in position $(i,j)$. By our recursive formula, this cell will be called exactly in the cases where we need to compute either $(i + 1, j)$ or $(i + 1, j + 1)$.

Now, assume $(i+1, j)$ was the first cell to call $(i,j)$, then:

- No call to $(i+1, j+1)$ will be initiated before the call to $(i +1,j)$ is completed (this is obvious by the structure of the formula)
- Once the call to $(i +1,j)$ is completed, $(i, j)$ will be in the dictionary
- At this point, the call to $(i + 1, j +1 )$ will not need to compute $(i,j)$ from scratch as it is already in the dictionary
- Once $(i +1, j)$ and $(i+1,j+1)$ are both computed, the code will never call $(i,j)$ again, since they are both in the dictionary

It follows that each cell will be accessed at most twice, and thus the running time is clearly $O(nk)$. In the diagram below we see that we can even give a more precise running time based on the fact that many cells in the table are base cases.
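The "at most twice" claim can be checked empirically by instrumenting the memoized version with a call counter (binom_fast_counting is a helper introduced here):

```
def binom_fast_counting(n, k):
    # Memoized binom that also reports how many recursive calls were made
    mem = {}
    calls = 0
    def rec(n, k):
        nonlocal calls
        calls += 1
        if n < 0 or k < 0 or n < k:
            return 0
        elif k == 0 or n == k:
            return 1
        if (n, k) not in mem:
            mem[(n, k)] = rec(n - 1, k) + rec(n - 1, k - 1)
        return mem[(n, k)]
    result = rec(n, k)
    return result, calls

value, calls = binom_fast_counting(50, 25)
# calls stays within roughly twice the (n+1) x (k+1) table size
print(value, calls)
```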

In [2]:

```
def change(amount, coins):
    if amount == 0:
        return 1
    elif amount < 0 or coins == []:
        return 0
    return change(amount, coins[:-1]) + \
           change(amount - coins[-1], coins)

change(5, [1,2,3])
```

Out[2]:

```
5
```

Now with memoization:

In [7]:

```
def change_fast(amount, coins):
    d = {}
    return change_mem(amount, coins, d)

def change_mem(amount, coins, d):
    if amount == 0:
        return 1
    elif amount < 0 or coins == []:
        return 0
    # coins only ever shrinks from its end, so (amount, len(coins))
    # describes the input in a one-to-one fashion; the commented-out
    # alternative uses (amount, tuple(coins)) as the key instead
    #if (amount, tuple(coins)) not in d:
    if (amount, len(coins)) not in d:
        #d[(amount, tuple(coins))] = \
        d[(amount, len(coins))] = \
            change_mem(amount, coins[:-1], d) + \
            change_mem(amount - coins[-1], coins, d)
    #return d[(amount, tuple(coins))]
    return d[(amount, len(coins))]

change_fast(500, [1,3,2])
```

Out[7]:
