Notebook

cs1001.py , Tel Aviv University, Spring 2019

Recitation 8¶

We reviewed some properties of prime numbers and used them for primality testing. We reviewed the Diffie-Hellman protocol for finding a shared secret key and also tried to crack it. Then, we discussed Linked lists.

Takeaways:¶

The probabilistic function is_prime, that uses Fermat's primality test, can be used to detect primes quickly and efficiently, but has a (very small) probability of error. Its time complexity is $O(n^3)$, where $n$ is the number of bits of the input.
The DH protocol relies on two main principles: the following equality $(g^{a}\mod p)^b \mod p = g^{ab} \mod p $ and the (believed) hardness of the discrete log problem (given $g,p$, and $x = g^{a} \mod p$ finding $a$ is hard). Make sure you understand the details of the protocol.
OOP allows us to create classes at will and to define complex objects. Remember that the most important thing is to find a good inner representation of the object so that you will have an easy time working with it.
Important properties of Linked lists:
- Not stored continuously in memory.
- Allows for deletion and insertion after a given element in $O(1)$ time.
- Accessing the $i$'th element takes $O(i)$ time.
- Search takes $O(n)$ time (no random access $\implies$ no $O(\log{n})$ search).

Code for printing several outputs in one cell (not part of the recitation):¶

In [14]:

from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

Primes and Diffie-Hellman¶

Fermat's little theorem¶

Fermat's little theorem states that if $p$ is prime and $1 < a < p$, then $a^{p-1} (\textrm{mod}\ p) \equiv 1$

Compositeness witnesses¶

A witness is a piece of evidence that can be produced in order to prove a claim. In our case, the problem we are tackling is deciding whether a given number $m$ is prime or composite. We now describe three types of witnesses for compositeness:

A clear witness for compositeness can be a (not necessarily prime) factor of $m$. That is, a number $1 < b < m$ such that $b ~|~ m$ ($b$ divides $m$). Since $b$ is a non-trivial factor of $m$, $m$ is clearly composite.
Another, less trivial witness of compositeness is a GCD witness, that is a number $1 < b < m$ such that $\mathrm{gcd}(m,b) > 1$. Think, why is $b$ a witness of compositeness? Let $\mathrm{gcd}(m,b) = r$, then by definition $r ~|~ n$. As $r \leq b < m$ we get that $r$ is a factor of $m$ and thus $m$ is composite.
Fermat's primality test defines another set of witness of compositeness - a number $1 < a < m$ is a Fermat witness if $a^{m-1} (\textrm{mod}\ m) \not\equiv 1$

Why go through all of this work? Our tactic would be to randomly draw a number $b \in {1,\ldots, m - 1}$ and hope that $b$ is some witness of compositeness. Clearly, we'd like our witness pool to be as large as possible.

Now, if $m$ is a composite number, let $\mathrm{FACT}_m, \mathrm{GCD}_m, \mathrm{FERM}_m$ be the set of prime factors, GCD witnesses and Fermat witnesses for $m$'s compositeness. It is not hard to show that $$\mathrm{FACT}_m \subseteq \mathrm{GCD}_m \subseteq \mathrm{FERM}_m$$

But the real strength of Fermat's primality test comes from the fact that if $m$ is composite, then apart from very rare cases (where $n$ is a Carmichael number) it holds that $|\mathrm{FERM}_m| \geq m/2$. That is - a random number is a Fermat witness w.p. at least $1/2$.

A side note (for reference only) - Carmichael numbers are exactly the composite numbers $m$ where $\mathrm{GCD}_m = \mathrm{FERM}_m$

Every factor of a composite number is a Fermat's witness¶

Let $m$ be a composite number and write $m = ab$ for some $a,b>1$. We claim that $a$ is a Fermat witness. To see this, assume towards contradiction that $a^{m-1} (\textrm{mod}\ m) \equiv 1$, i.e. $a^{m-1} = c\cdot m + 1= c \cdot a \cdot b + 1$ for some $c \geq 1$.

Rearrange the above to get $a(a^{m-2} - c\cdot b) = 1$. However, $a > 1$ and $(a^{m-2} - c\cdot b) \in \mathbb{Z}$, a contradiction.

Primality test using Fermat's witness¶

We can use Fermat's little theorem in order test whether a given number $m$ is prime. That is, we can test whether we find a Fermat's witness $a\in\mathrm{FERM}_m$ for compositeness. Note that if the number has $n$ bits than testing all possible $a$-s will require $O(2^n)$ iterations (a lot!).

Instead, we will try 100 random $a$-s in the range and see if one works as a Fermat's witness.

In [24]:

import random

def is_prime(m, show_witness=False):

    """ probabilistic test for m's compositeness """''

    for i in range(0, 100):
        a = random.randint(1, m - 1) # a is a random integer in [1..m-1]
        if pow(a, m - 1, m) != 1:
            if show_witness:  # caller wishes to see a witness
                print(m, "is composite", "\n", a, "is a witness, i=", i + 1)
            return False

    return True

For $a,b,c$ of at most $n$ bits each, time complexity of modpower is $O(n^3)$

In [16]:

def modpower(a, b, c):
    """ computes a**b modulo c, using iterated squaring """
    result = 1
    while b > 0: # while b is nonzero
        if b % 2 == 1: # b is odd
            result = (result * a) % c
        a = (a*a) % c
        b = b // 2
    return result

Runtime analysis:¶

The main loop runs over $b$, dividing $b \to b/2$ at each iteration, so it runs $O(n)$ times.
In each iteration we do:
- One operation of $b~\%~2$ in $O(1)$ time (check least significant bit of $b$)
- One operation of $b~//~2$ in $O(1)$ time (snip $b$'s least significant bit)
- At most two multiplication and two modulu operations
- Multiplication of two $n$ bit numbers runs in time $O(n^2)$
- Modulu can be implemented by addition, division and multiplication: $a \textrm{ mod } b = a - (a // b) b$ and division runs in time $O(n^2)$ same as multiplication
- Finally, the modulu operation keeps all numbers at most $n$ bits, thus the running time does not increase with each iteration
In total - $O(n^3)$

The probability of error:¶

First, notice that if the function says that an input number $m$ is not prime, then it is true. Stated the other way around, if $m$ is prime, the function is always right.

The function can make a mistake only in the case where a number $m$ is not prime, and is accidentally categorized by the function as prime. This can happen if all $100$ $a$'s that the function had drawn were not witnesses.

A quick computation shows that if $m$ is not a Charmichael number then at least $\frac{1}{2}$ of all possible $a$s are witnesses, so in almost all cases the probability for error is $(\frac{1}{2})^{100}$ (which is negligible).

Testing the prime number theorem: For a large n, a number of n bits is prime with a prob. of O(1/n)¶

We decide on the size of the sample (to avoid testing all possible $2^{n-1}$ numbers of $n$ bits) and test whether each number we sample is prime. Then we divide the number of primes with the size of the sample.

In [15]:

def prob_prime(n, sample):
    cnt = 0
    for i in range(sample):
        m = random.randint(2**(n-1), 2**n - 1)
        cnt += is_prime(m)
    return cnt / sample

In [69]:

prob_prime(2, 10**4)
prob_prime(3, 10**4)

Out[69]:

1.0

Out[69]:

0.5057

In [70]:

prob_prime(100, 10**4)

Out[70]:

0.0148

In [71]:

prob_prime(200, 10**4)

Out[71]:

0.0081

Diffie Hellman from lecture

The protocol as code¶

In [23]:

def DH_exchange(p):
    """ generates a shared DH key """
    g = random.randint(1, p - 1)
    a = random.randint(1, p - 1) #Alice's  secret
    b = random.randint(1, p - 1) #Bob's  secret
    x = pow(g, a, p)
    y = pow(g, b, p)
    key_A = pow(y, a, p)
    key_B = pow(x, b, p)
    #the next line is different from lecture
    return g, a, b, x, y, key_A #key_A=key_B

Find a prime number¶

In [21]:

def find_prime(n):
    """ find random n-bit long prime """
    while(True):
        candidate = random.randrange(2**(n-1), 2**n)
        if is_prime(candidate):
            return candidate

Demostration:

In [98]:

import random
p = find_prime(10)
print(p)
g, a, b, x, y, key = DH_exchange(p)
g, a, b, x, y, key

Out[98]:

(247, 88, 353, 367, 2, 206)

In [99]:

print(pow(g, a, p))
print(pow(x, b, p))

367
206

Crack the Diffie Hellman code¶

There is no known way to find $a$ efficiently, so we try the naive one: iterating over all $a$-s and cheking whether the equation $g^a \mod p = x$ holds for them.

If we found $a'$ that satisfies the condition but is not the original $a$, does it matter?

The time complexity of crack_DH is $O(2^nn^3)$

In [17]:

def crack_DH(p, g, x):
    ''' find secret "a" that satisfies g**a%p == x
        Not feasible for large p '''
    for a in range(1, p - 1):
        if a % 100000 == 0:
            print(a) #just to estimate running time
        if pow(g, a, p) == x:
            return a
    return None #should never get here

In [100]:

#g, a, b, x, y, key = DH_exchange(p)
print(a)
crack_DH(p, g, x)

Out[100]:

Cracking DH with different values of $a$ (different private keys)¶

The algorithm crack_DH can return a different private key ($a$) than the one chosen by Alice, i.e. - $crack\_DH(p,g,x) = a' \neq a$, however, in this case we have, by definition of the cracking algorithm: $g^{a'} \textrm{ mod } p = g^{a} \textrm{ mod } p$, thus:

$$y^{a'} \textrm{ mod } p = \left(g^{b}\right)^{a'} \textrm{ mod } p = \left(g^{a'}\right)^{b} \textrm{ mod } p = \left(g^{a} \textrm{ mod } p \right)^{b} \textrm{ mod } p = g^{ab} \textrm{ mod } p$$

I.e. - we can still compute the shared secret!

Trying to crack the protocol with a 100 bit prime¶

In [26]:

import random
p = find_prime(100)
print(p)
g, a, b, x, y, key = DH_exchange(p)
print(g, a, b, x, y, key)

crack_DH(p, g, x)

870752849128025769690613494727
556076925348494470910111020503 563975178376164531572554628719 591095820882694542257093278496 683629518338524578223612745129 622210352872361792591098681438 318844905364032174750094749602
100000
200000
300000
400000
500000
600000

---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
<ipython-input-26-6b3d13779c00> in <module>
      5 print(g, a, b, x, y, key)
      6 
----> 7 crack_DH(p, g, x)

<ipython-input-17-9a63fe3af2e8> in crack_DH(p, g, x)
      3         Not feasible for large p '''
      4     for a in range(1, p - 1):
----> 5         if a % 100000 == 0:
      6             print(a) #just to estimate running time
      7         if pow(g, a, p) == x:

KeyboardInterrupt:

Analyzing the nubmer of years it will take to crack the protocol if $a$ is found at the end (assuming iterating over 100000 $a$s takes a second)

In [27]:

Out[27]:

563975178376164531572554628719

In [28]:

a//100000/60/60/24/365

Out[28]:

1.7883535590314707e+17

Linked Lists¶

A linked list is a linear data structure (i.e. - nodes can be accessed one after the other). The list is composed of nodes, where each node contains a value and a "pointer" to the next node in the list.

Linked lists support operations such as insertion, deletion, search and many more.

In [8]:

class Node():
    def __init__(self, val):
        self.value = val
        self.next = None
        
    def __repr__(self):
        #return  str(self.value)
        
        #for today's recitation, we print the id of self as well
        return str(self.value) + "(" + str(id(self))+ ")"



class Linked_list():

    def __init__(self, seq=None):
        self.next = None
        self.len = 0
        if seq != None:
            for x in seq[::-1]:
                self.add_at_start(x)
            

    def __repr__(self):
        out = ""
        p = self.next
        while  p != None :
            out += str(p) + ", " # str(p) envokes __repr__ of class Node
            p = p.next
        return "[" + out[:-2] + "]"

            
    def __len__(self):
        ''' called when using Python's len() '''
        return self.len
            
            
    def add_at_start(self, val):
        ''' add node with value val at the list head '''
        p = self
        tmp = p.next
        p.next = Node(val)
        p.next.next = tmp
        self.len += 1
    
    def add_at_end(self, val):
        ''' add node with value val at the list tail '''
        p = self
        while (p.next != None):
            p = p.next
        p.next = Node(val)
        self.len += 1
            
        
    def insert(self, loc, val):
        ''' add node with value val after location 0<=loc<len of the list '''
        assert 0 <= loc < len(self)
        p = self
        for i in range(0, loc):
            p = p.next
        tmp = p.next
        p.next = Node(val)
        p.next.next = tmp
        self.len += 1
        
          
    def __getitem__(self, loc):
        ''' called when using L[i] for reading
            return node at location 0<=loc<len '''
        assert 0 <= loc < len(self)
        p = self.next
        for i in range(0, loc):
            p = p.next
        return p

    def __setitem__(self, loc, val):
        ''' called when using L[loc]=val for writing
            assigns val to node at location 0<=loc<len '''
        assert 0 <= loc < len(self)
        p = self.next
        for i in range(0, loc):
            p = p.next
        p.value = val
        return None

    
    def find(self, val):
        ''' find (first) node with value val in list '''
        p = self.next
        # loc = 0     # in case we want to return the location
        while p != None:
            if p.value == val:
                return p
            else:
                p = p.next
                #loc=loc+1   # in case we want to return the location
        return None

    def delete(self, loc):
        ''' delete element at location 0<=loc<len '''
        assert 0 <= loc < len(self)
        p = self
        for i in range(0, loc):
            p = p.next
        # p is the element BEFORE loc
        p.next = p.next.next
        self.len -= 1
 
    
    def insert_ordered(self, val):
        ''' assume self is an ordered list,
            insert Node with val in order '''
        p = self
        q = self.next
        while q != None and q.value < val:
            p = q
            q = q.next
        newNode = Node(val)
        p.next = newNode
        newNode.next = q
        self.len += 1

    def find_ordered(self, val):
        ''' assume self is an ordered list,
            find Node with value val '''
        p = self.next
        while p != None and p.value < val:
            p = p.next
        if p != None and p.value == val: 
            return p
        else:
            return None

Converting a string to a list

In [9]:

L= Linked_list("abc")
L

Out[9]:

[a(2165820022568), b(2165820022736), c(2165820022624)]

Memory View¶

Adding elements one by one. Try using Python tutor.

In [10]:

l = Linked_list()
l.add_at_start("a")
l.add_at_start("b")
l

Out[10]:

[b(2165819893856), a(2165819894528)]

A short summary of methods complexity of Linked lists vs. lists:¶

Method	List	Linked list
init	$O(1)$	$O(1)$
init with $k$ items	$O(k)$	$O(k)$
for the rest of the methods we assume that the structure contains $n$ items
add at start	$O(n)$	$O(1)$
add at end	$O(1)$	$O(n)$
$lst[k]$ (get the $k$th item)	$O(1)$	$O(k)$
delete $lst[k]$	$O(n-k)$	$O(k)$

Reversing a linked list in $O(1)$ additional memory. See the code and demo here

In [11]:

class Node():
    def __init__(self, val):
        self.value = val
        self.next = None
        
    def __repr__(self):
        #return str(self.value)
        
        #for today's recitation, we print the id of self as well
        return  str(self.value) + "(" + str(id(self))+ ")"


class Linked_list():

    def __init__(self, seq=None):
        self.next = None
        self.len = 0
        if seq != None:
            for x in seq[::-1]:
                self.add_at_start(x)
            

    def __repr__(self):
        out = ""
        p = self.next
        while  p != None :
            out += str(p) + ", " # str(p) envokes __repr__ of class Node
            p = p.next
        return "[" + out[:-2] + "]"

            
    def __len__(self):
        ''' called when using Python's len() '''
        return self.len
            
            
    def add_at_start(self, val):
        ''' add node with value val at the list head '''
        p = self
        tmp = p.next
        p.next = Node(val)
        p.next.next = tmp
        self.len += 1
    
    def add_at_end(self, val):
        ''' add node with value val at the list tail '''
        p = self
        while (p.next != None):
            p = p.next
        p.next = Node(val)
        self.len += 1
            
        
    def insert(self, loc, val):
        ''' add node with value val after location 0<=loc<len of the list '''
        assert 0 <= loc < len(self)
        p = self
        for i in range(0, loc):
            p = p.next
        tmp = p.next
        p.next = Node(val)
        p.next.next = tmp
        self.len += 1
        
          
    def __getitem__(self, loc):
        ''' called when using L[i] for reading
            return node at location 0<=loc<len '''
        assert 0 <= loc < len(self)
        p = self.next
        for i in range(0, loc):
            p = p.next
        return p

    def __setitem__(self, loc, val):
        ''' called when using L[loc]=val for writing
            assigns val to node at location 0<=loc<len '''
        assert 0 <= loc < len(self)
        p = self.next
        for i in range(0, loc):
            p = p.next
        p.value = val
        return None

    
    def find(self, val):
        ''' find (first) node with value val in list '''
        p = self.next
        # loc = 0     # in case we want to return the location
        while p != None:
            if p.value == val:
                return p
            else:
                p = p.next
                #loc=loc+1   # in case we want to return the location
        return None

    def delete(self, loc):
        ''' delete element at location 0<=loc<len '''
        assert 0 <= loc < len(self)
        p = self
        for i in range(0, loc):
            p = p.next
        # p is the element BEFORE loc
        p.next = p.next.next
        self.len -= 1
 
    
    def insert_ordered(self, val):
        ''' assume self is an ordered list,
            insert Node with val in order '''
        p = self
        q = self.next
        while q != None and q.value < val:
            p = q
            q = q.next
        newNode = Node(val)
        p.next = newNode
        newNode.next = q
        self.len += 1

    def find_ordered(self, val):
        ''' assume self is an ordered list,
            find Node with value val '''
        p = self.next
        while p != None and p.value < val:
            p = p.next
        if p != None and p.value == val: 
            return p
        else:
            return None

    def reverse(self):
        prev = None
        curr = self.next
        while curr != None:
            tmp = curr.next
            curr.next = prev
            prev = curr
            curr = tmp
        self.next = prev
        
ll = Linked_list("abc")
ll
ll.reverse()
ll

Out[11]:

[a(2165820113248), b(2165820113192), c(2165820113136)]

Out[11]:

[c(2165820113136), b(2165820113192), a(2165820113248)]

In [ ]: