cs1001.py , Tel Aviv University, Spring 2019

Recitation 5

We continued discussing complexity. Then we discussed higher-order, lambda expressions, $m$-channel merge and binary search.

Takeaways:

  1. In order to analyze the time complexity of a code, try to bound the number of "basic operations" performed by your code. If your code contains loops try to understand their structure (series or parallel, and dependent or independent). This may help bounding the overall complexity.
  2. Lambda expressions are a method for writing short functions. Note that they are rather limited as only expressions (which have a value) can appear after ":".
  3. Higher-order functions are quite useful and may lead to a more compact code.

Code for printing several outputs in one cell (not part of the recitation):

In [ ]:
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

Reminder: Big O notation

Given two functions $f(n)$ and $g(n)$,

$f(n) = O(g(n))$ If and only if there exist $c > 0 $ and $n_{0}\in \mathbb{R}$ such that $\forall n>n_0$
$|f(n)| \leq c\cdot|g(n)|$

Sequential execution of loops

Let $n$ denote the input size. Let $f_1(n) = O(g_1(n))\;$ and $f_2(n)=O(g_2(n))$.

for i in range(f1(n))):
    O(1)
for j in range(f2(n)):
    O(1)

Last week we proved that $f_1(n) + f_2(n) = O(g_1(n) + g_2(n))$ and that $f_1(n) + f_2(n) = O(max(g_1(n), g_2(n)))$

We've also mentioned that $f_1 + f_2 + ... + f_k = O(f_{max})$. That is, in a finite constant sum of functions, the dominate function defines the growth rate. A private case (which we have also proven last week) is that of a polynomial.

Nested execution of loops (independent indices)

for i in range(f1(n)):
    for j in range(f2(n)):
        O(1)

Similar to the previous claim, it holds that $f_1(n) \cdot f_2(n) = O(g_1(n) \cdot g_2(n))$ (prove this yourself!).

Nested execution of loops (dependent indices)

for i in range(n):
    for j in range(f(i)):
        O(1)

Use $\sum$ to bound the time complexity in this case

Two useful examples:

  • In the case where $f(i) = i$ the sum behaves like an arithmetic series: $\sum_{i=1}^{n}{i} = O(n^2)$
  • In the case where $f(i) = 2^i$ the sum behaves like a geometric series: $\sum_{i=1}^{n}{2^i} = O(2^n)$

Exercise: Analyze loops

for i in range(1, n + 1):
    j = 1
    while j <= n:
        j += 1  # O(n**2)

        j += 7  # O(n**2), inner loop does n/7 iterations 
                #   for each outer loop

        j *= 2  # O(n*log(n))

        j *= 7  # O(n*log(n)), change log bases is like 
                #   multiplying by a constant

        j **= 2 # O(n*log(log(n))), Here we assume j starts from 2.
                #   by induction after k steps j = 2^(2^k) 
                #   we need to take a log on both sides *twice*: n = 2^(2^k) => k = log log n

        j += i  # O(n*log(n)), h(n) = the sum of 1/i from i=1 to n is O(log(n))
                #   This is the harmonic series

Lambda expressions and higher-order functions

Expressions:

Anonymous vs. named values

In [ ]:
print(2**10)

x = 2**10
print(x)

Functions:

Lambda expressions can be used for creating anonymous functions (and named ones as well)

In [ ]:
(lambda x: x+2)(3)

plus2 = lambda x : x + 2
plus2(3)

Example: A function that returns a function: make power

In [1]:
def make_pow(n):
    def fixed_pow(x):
        return pow(x, n)
    return fixed_pow

square = make_pow(2)
cube = make_pow(3)

square(10)
cube(10)
make_pow(2)(10)
Out[1]:
100
In [ ]:
def make_pow(n):
    return lambda x : pow(x, n)

square = make_pow(2)
cube = make_pow(3)

square(10)
cube(10)

Example: A function that takes a function as its argument (function as an input): sorted

In [2]:
lst = ["michal", "amirrub", "daniel", "ben"]

sorted(lst)
Out[2]:
['amirrub', 'ben', 'daniel', 'michal']

"sorted" can recieve a function as an argument and use it to sort the input list. The function is given as the "key" argument to "sorted". Note that the "key" is used for ordering the elements without changing them.

examples: sort by length, sort by reverse lexicographical order

In [4]:
sorted(lst, key=lambda x: len(x))
sorted(lst, key=len)

# def rev(s):
#     return s[::-1]

# sorted(lst, key=rev)
# sorted(lst, key=lambda s: s[::-1])
Out[4]:
['ben', 'michal', 'daniel', 'amirrub']

What happens when two elements are equivalent under the key function? Will Python use the "builtin" comparison method? What if it's undefined?

In [7]:
# Here "x" should come before "y" lexicographically
lst1 = ["abc", "123", "x", "y"]
lst2 = ["abc", "123", "y", "x"]

sorted(lst1, key=lambda x: len(x))
sorted(lst2, key=lambda x: len(x))
Out[7]:
['y', 'x', 'abc', '123']

another example: sort by the int value of the string elements

In [ ]:
lst2 = ["232", "11", "3"]
sorted(lst2)
sorted(lst2, key=int)

Example: another function that gets a function as its input

In [ ]:
def sum_naturals(n):
    total = 0
    for k in range(1, n + 1):
        total += k
    return total

sum_naturals(10)
In [ ]:
def sum_squres(n):
    total = 0
    for k in range(1, n + 1):
        total += k**2
    return total

sum_squres(10)

Let's make it a general function that gets another function as a parameter:

In [2]:
def summation(n, func):
    total = 0
    for k in range(1, n + 1):
        total += func(k)
    return total

The equivalent to sum_naturals and sum_squares:

In [3]:
term_naturals = lambda x : x
summation(10, term_naturals)

term_squares = lambda x : x**2
summation(10, term_squares)
Out[3]:
385

Approximating $\pi$:

The following (infinite) sum slowly converges to $\pi$: $\frac{8}{1\cdot 3} + \frac{8}{5\cdot 7} + \frac{8}{9\cdot11} + \ldots$ We use "summation" to compute the sum of the first $n$ elements in this series

In [ ]:
term_pi = lambda k : 8 / ((4*k-1) * (4*k-3))

summation(10, term_pi)
summation(100, term_pi)
summation(10000000, term_pi)

$m$-Channel merge:

Given a list of $m\geq 2$ sorted sublists (ascending) as an input, we would like to return as an output a new single sorted (ascending) list containing all elements in the input sublists. The input list/sublists should not be changed.

For complexity anaysis, we denote by $n$ the maximal length of an input sublist.

We will consider 3 possible solutions (two in class and one in hw3):

In [19]:
def multi_merge_v1(lst_of_lsts):
    all_elems = [e for lst in lst_of_lsts for e in lst]
    merged = []
    while all_elems != []:
        minimum = min(all_elems)
        merged += [minimum]
        all_elems.remove(minimum)
    return merged
In [20]:
multi_merge_v1([[1, 2, 2, 3], [2, 3, 5], [5]])
Out[20]:
[1, 2, 2, 2, 3, 3, 5, 5]

$mn$ iterations, in iteration $i$ we perform operations that take $O(mn-i)$. Overall complexity is $\sum_{i=0}^{mn}{(mn-i)} = \sum_{i=0}^{mn}(i) = O(m^2n^2)$

In [21]:
def merge(A, B):
    """ merging two lists into a sorted list
        A and B must be sorted! """
    n = len(A)
    m = len(B)
    C = [None for i in range(n + m)]

    a = b = c = 0
    while  a < n  and  b < m: #more element in both A and B
        if A[a] < B[b]:
            C[c] = A[a]
            a += 1
        else:
            C[c] = B[b]
            b += 1
        c += 1

    C[c:] = A[a:] + B[b:] #append remaining elements (one of those is empty)

    return C


def multi_merge_v2(lst_of_lsts):
    m = len(lst_of_lsts)
    merged = []
    for lst in lst_of_lsts:
        merged = merge(lst, merged)
    return merged 
In [22]:
multi_merge_v2([[1, 2, 2, 3], [2, 3, 5], [5]])
Out[22]:
[1, 2, 2, 2, 3, 3, 5, 5]

The function does $m$ iterations, in iteration $i$ we merge a list of length at most n with a list of length at most $n\cdot (i-1)$. Merging two lists of length $k,\ell$ takes $O(k + \ell)$. Overall complexity is at most $$\sum_{i=0}^{m} n\cdot(i-1) + n = \sum_{i=0}^{m}{ni} = n\cdot\sum_{i=0}^{m}i = O(nm^2)$$

In [ ]: