Weekly exercise 5: Improving Sieves of Eratosthenes algorithm

Weekly exercise 5: Improving Sieves of Eratosthenes algorithm¶

Recall the following classic algorithm for finding all prime numbers up to a given upper boundary.

Algorithm:

initialize the list with all integers (1 to the given upper boundary)
go through a list of divisors
cross all numbers on the first list which are divisible by the considered divisor and not equal to it
stop when all divisors (up to the given boundary) are considered, and so all divisible numbers are crossed out, leaving only primes

Task 1¶

Start by implementing the following version of the algorithm, using the starter code below.

In [ ]:

def eratosthenes(upper, verbosity = @@@):
    """Return the list of primes up the given upper bound using the sieve of Eratosthenes algorithm"""
    primes = list(range(1,@@@)) # all numbers between 1 and upper
    divisor = @@@ # initial divisor
    while True: # loop until break is invoked
        divisor += 1 # next divisor
        if divisor > upper: # checked all divisors up to upper
            @@@
        i=0 # check for divisibility starting from the first element of remaining list
        while i < len(primes): # until the end of the list (which changes length inside the loop)
            if primes[i] @@@ divisor and primes[i] @@@ divisor == 0:
                # remove divisible, except the divisor itself
                primes.remove(primes[i]) # remove, next is with the same index
            else:
                i += 1 # skip to go to next element
        if @@@: print("divisor %d:"%divisor,primes)
    if @@@: print("Primes up to %d are:"%upper,primes)
    return @@@

To make sure the function is running as it is supposed to, run the tests below and confirm that the output is as expected.

In [ ]:

eratosthenes(25,verbosity=True) #should print the steps of the algorithm

In [ ]:

x = eratosthenes(5)
print(x) #should print [2,3,5]

In [ ]:

x = eratosthenes(2500)
print('Number of primes up to 2500 is ',len(x)) #should print "Number of primes up to 2500 is 368"

Task 2¶

Use the code snippet from lecture video 9 to collect data on the runtime of the eratosthenes() as a function of the upper bound. Compute 25 runtime data points using the following sequence for the upper bound: 5, 10, 15, etc, and using 1000 functional calls to compute the average.

Make a conjecture on the computational complexity of this algorithm.

In [ ]:

import matplotlib.pyplot as plt
%matplotlib inline
plt.rcParams['figure.figsize'] = [12, 8]

N = @@@
n,x = [0,] @@@ N, [0,] @@@ N  # lists to accumulate the data
for i,k in [(i,@@@) for i in range(N)]:
    n[i] = k
    t = @@@
    x[i] = @@@

plt.plot(n,x)
plt.xlabel('upper bound')
plt.ylabel('average run time, sec')
plt.show()

Task 3¶

Copy your code from above and make the following improvements to the algorithm:

Force divisors only be prime numbers themselves
When checking for divisibility, start from the square of the divisor instead of checking each one (why can we do it this way?)
Complete the algorithm when the starting point in previous step is above the upper bound

Run the same tests and make sure it the results are identical to the first implementation of the algorithm.

In [ ]:

# Write your code here
#

In [ ]:

eratosthenes(25,verbosity=True) #should print the steps of the algorithm
eratosthenes_better(25,verbosity=True)

In [ ]:

x = eratosthenes(5)
print(x) #should print [2,3,5]
x = eratosthenes_better(5)
print(x) #should print [2,3,5]

In [ ]:

x = eratosthenes(2500)
print('Number of primes up to 2500 is ',len(x)) #should print "Number of primes up to 2500 is 368"
x = eratosthenes_better(2500)
print('Number of primes up to 2500 is ',len(x)) #should print "Number of primes up to 2500 is 368"

Task 4¶

Repeat the run time analysis for the improved version of the algorithm, and plot the graph for the two implementations side by side.

In [ ]:

import matplotlib.pyplot as plt
%matplotlib inline
plt.rcParams['figure.figsize'] = [12, 8]

N = @@@
n,x,y = @@@  # lists to accumulate the data
for i,k in [(i,@@@) for i in range(N)]:
    n[i] = @@@
    t = @@@
    x[i] = @@@  # run time of original algorithm
    t = @@@
    y[i] = @@@  # run time of improved algorithm

plt.plot(n,x,label='Original algorithm')
plt.plot(n,y,label='Improved algorithm')
plt.xlabel('upper bound')
plt.ylabel('average run time, sec')
plt.legend()
plt.show()

Note, however, that 1 is not a prime! https://www.youtube.com/watch?v=IQofiPqhJ_s