Part 4: Dictionaries, factorization, and multiplicative functions

The list type is perfect for keeping track of ordered data. Here we introduce the dict (dictionary) type, which can be used for key-value pairs. This data structure is well suited for storing the prime decomposition of integers, in which each prime (key) is given an exponent (value). As we introduce the dict type, we also discuss some broader issues of objects and methods in Python programming.

We apply these programming concepts to prime decomposition and multiplicative functions (e.g., the divisor sum function). This material accompanies Chapter 2 of An Illustrated Theory of Numbers.

Dictionaries and factorization

Lists, dictionaries, objects and methods

Lists, like [2,3,5,7] are data structures built for sequentially ordered data. The items of a list (in this case, the numbers 2,3,5,7) are indexed by natural numbers (in this case, the indices are 0,1,2,3). Python allows you to access the items of a list through their index.

In [ ]:
L = [2,3,5,7]
type(L)
In [ ]:
print L[0]  # What is the output?
In [ ]:
print L[3]  # What is the output?
In [ ]:
print L[5]  # This should give an IndexError.

Python dictionaries are structures built for data that have a key-value structure. The keys are like indices. But instead of numerical indices (0,1,2,etc.), the keys can be any numbers or strings (technically, any immutable type)! Each key in the dictionary references a value, in the same way that each index of a list references an item. The syntax for defining a dictionary is {key1:value1, key2:value2, key3:value3, ...}. A first example is below. You can also read the official tutorial for more on dictionaries.

In [ ]:
nemo = {'species':'clownfish', 'color':'orange', 'age':6}
In [ ]:
nemo['color']  # The key 'color' references the value 'orange'.
In [ ]:
nemo['age']  # Predict the result.
In [ ]:
nemo[1]  # This yields a KeyError, since 1 is not a key in the dictionary.

Dictionaries can have values of any type, and their keys can be numbers or strings. In this case, the keys are all strings, while the values include strings and integers. In this way, dictionaries are useful for storing properties of different kinds -- they can be used to store records, as they are called in other programming languages.

An interlude on Python objects

We have discussed how Python stores data of various types: int, bool, str, list, dict, among others. But now seems like a good time to discuss the fundamental "units" which are stored: these are called Python objects. If you have executed the cells above, Python is currently storing a lot of objects in your computer's memory. These objects include nemo and L. Also L[0] is an object and nemo['age'] is an object. Each of these objects are occupying a little space in memory.

We reference these objects by the names we created, like nemo and L. But for internal purposes, Python assigns every object a unique ID number. You can see an object's ID number with the id function.

In [ ]:
id(L)
In [ ]:
id(nemo)
In [ ]:
id(L[0])

It is sometimes useful to check the ID numbers of objects, to look "under the hood" a bit. For example, consider the following.

In [ ]:
x = 3
y = 3
print x == y  # This should be true!
In [ ]:
id(x)
In [ ]:
id(y)

What happened? You probably noticed that both variables x and y have the same id number. That means that Python is being efficient, and not filling up two different slots of memory with the same number (3). Instead, it puts the number in one memory slot, and uses x and y as alternative names for this slot.

But what happens if we change a value of one variable?

In [ ]:
x = 5
In [ ]:
id(x)
In [ ]:
id(y)

Python won't be confused by this. When we assigned x = 5, Python opened up a new memory slot for the number 5, and assigned x to refer to the number in this new slot. Note that y still "points" at the old slot. Python tries to be smart about memory, remembering where numbers are stored, and putting numbers into slots "under the hood" as it sees fit.

In [ ]:
id(3)  # Does Python remember where it put 3?
In [ ]:
id(5)  # Does Python remember where it put 5?
In [ ]:
id(4)  #  4 is probably not in memory before.  But now it is!
In [ ]:
y = 5
In [ ]:
id(y)  #  Did Python change the number in a slot?  Or did it point `y` at another slot?
In [ ]:
id(L[2]) #  Python doesn't like to waste space.

This sort of memory management can be helpful to avoid repetetion. For example, consider a list with repetition.

In [ ]:
R = [19,19,19]
In [ ]:
id(R)  # The list itself is an object.
In [ ]:
id(R[0])  # The 0th item in the list is an object.
In [ ]:
id(R[1])  # The 1st item in the list is an object.
In [ ]:
id(R[2])  # The 2nd item in the list is an object.

By having each list entry point to the same location in memory, Python avoids having to fill three blocks of memory with the same number 19.

Python objects can have methods attached to them. Methods are functions which can utilize and change the data within an object. The basic syntax for using methods is <object>.<method>(). Here are two examples to get started: The keys and values of a dictionary can be recovered using the keys() and values() methods.

In [ ]:
nemo.keys()  # What are the keys of nemo?
In [ ]:
nemo.values()  # What are the values of nemo?

The output of the keys() and values() methods are lists. As such, they are convenient for iteration and membership-testing.

In [ ]:
'color' in nemo.keys()
In [ ]:
'taste' in nemo.keys()
In [ ]:
'orange' in nemo.keys()  # Is 'orange' a key in the dictionary?
In [ ]:
for k in nemo.keys():  # Iterates through the keys.
    print 'Nemo\'s',k,'is',nemo[k]  # \' is used to get a single-quote in a string.

In fact, Python provides a simpler syntax for iterating over keys or testing membership in keys. The syntax for k in <dictionary>: iterates the variable k through the keys of the <dictionary>. Similarly the syntax k in <dictionary> is shorthand for k in <dictionary>.keys().

In [ ]:
for k in nemo:  # This will iterate through the *keys* of the dictionary nemo.
    print 'Nemo\'s',k,'is',nemo[k]

Sometimes we'll want to change a dictionary. Perhaps we learn that nemo has gotten lost.

In [ ]:
nemo['status'] = 'lost'
In [ ]:
id(nemo)
In [ ]:
id('status')
In [ ]:
id(nemo.keys()[1])  # Can you understand the result of this?
In [ ]:
print nemo

The command nemo['status'] = 'lost' creates a new key in the dictionary called 'status' and assigns the value 'lost' to the key. If we find nemo, then we can change the value.

In [ ]:
nemo['status'] = 'found'
print nemo

Since 'status' is already among the keys of nemo, the command nemo['status'] = 'found' does not create a new key this time. It just changes the associated value from 'lost' to 'found'.

In [ ]:
nemo.keys()  # What are the keys of nemo now?
In [ ]:
nemo.values()  # What are the values of nemo now?

We mentioned earlier that keys() and values() are methods attached to the object nemo, and methods are functions which are attached to Python objects.

Python objects often (and often by default!) have methods attached to them. Every dictionary and every list in Python comes with attached methods. Methods can be used to extract properties of objects or change them. Here are examples of some list methods.

In [ ]:
L = [2,3,5,7]
print L # Let's remember what the list L is.
In [ ]:
print L[0] # What is this?
In [ ]:
id(L[0]) # What is the ID number of the 0th item in the list?
In [ ]:
L.reverse() # The reverse() method changes L!
print L
In [ ]:
L[0]  # We have definitely changed L.
In [ ]:
L[3]  # The last item in the list L.
In [ ]:
id(L[3]) # The ID number of the last item in the list L.

Observe that Python changed the order of the items in the list. But it didn't move them around in memory! The object 2 maintains the same ID number, and stays in the same place in memory. But the list item L[0] points at 2 before reversing while L[3] points at 2 after reversing. This kind of thing is confusing at first, but the general framework is <variable> points at <memory location>. You choose the name of the variable and work with the variable directly. Python labels each memory location with an ID number, and puts stuff in memory and retrieves values from memory according to your wishes.

In [ ]:
L.append(11) # Let's add another term to the list with the append(*) method.
print L
In [ ]:
L.sort()  # Let's get this list back in order.
print L

Some more useful list methods can be found at the official Python tutorial.

Prime decomposition dictionaries

If $N$ is a positive integer, then $N$ can be uniquely decomposed into a product of primes. Here "uniquely" means that $N$ has a unique expression of the form $$N = 2^{e_2} 3^{e_3} 5^{e_5} \cdots$$ in which theexponents $e_2, e_3, e_5$, etc., are natural numbers (and only finitely many are nonzero).

A Python dictionary is well-suited to store the resulting prime decomposition. For example, we might store the prime decomposition $2^3 3^2 7$ with the dictionary {2:3, 3:2, 7:1}. The prime numbers which occur in the decomposition become the keys of the dictionary, and the natural number exponents becomes the values of the dictionary.

The functions below decompose a positive integer N into primes, storing the result in a dictionary. The strategy is to repeatedly strip off (divide by) the smallest prime factor of a number, adjusting the dictionary along the way, until the number is reduced to 1. The first function below finds the smallest prime factor of a number.

In [ ]:
from math import sqrt  # We'll want to use the square root.

def smallest_factor(n):
    '''
    Gives the smallest prime factor of n.
    '''
    if n < 2:
        return None # No prime factors!
    
    test_factor = 2 # The smallest possible prime factor.
    max_factor = sqrt(n) # we don't have to search past sqrt(n).
    
    while test_factor <= max_factor:
        if n%test_factor == 0:
            return test_factor
        test_factor = test_factor + 1  # This could be sped up.
    
    return n  # If we didn't find a factor up to sqrt(n), n itself is prime!
        
In [ ]:
smallest_factor(105)
In [ ]:
smallest_factor(1999**2)  # 1999 might be called the Prince of primes.
In [ ]:
smallest_factor(11**3 * 13**9) # The result should be 11.
In [ ]:
def decompose(N):
    '''
    Gives the unique prime decomposition of a positive integer N,
    as a dictionary with primes as keys and exponents as values.
    '''
    current_number = N  # We'll divide out factors from current_number until we get 1.
    decomp = {} # An empty dictionary to start.
    while current_number > 1:
        p = smallest_factor(current_number) # The smallest prime factor of the current number.
        if p in decomp.keys():  # Is p already in the list of keys?
            decomp[p] = decomp[p] + 1 # Increase the exponent (value with key p) by 1.
        else:  # "else" here means "if p is not in decomp.keys()".
            decomp[p] = 1  # Creates a new entry in the dictionary, with key p and value 1.
        current_number = current_number / p # Factor out p.
    return decomp
In [ ]:
decompose(100) # What is the prime decomposition of 100?
In [ ]:
decompose(56401910421778813463) # This should be quick.
In [ ]:
#  Use this space to experiment a bit with the decompose function.

Now that we have a function to compute the prime decomposition of a positive integer, we write a function to recover a positive integer from such a prime decomposition. The function is deceptively simple, since Python makes it easy to iterate through the keys of a dictionary. Make sure that you understand every line.

In [ ]:
def recompose(D):
    '''
    If D is a dictionary with prime keys and natural values,
    this function outputs the product of terms of the form
    key^value.  In this way, it recovers a single number from a
    prime decomposition.
    '''
    N = 1
    for p in D.keys():  # iterate p through all the keys of D.
        N = N * (p ** D[p])  # Note that D[p] refers to the value (exponent) for the key p.
    return N
In [ ]:
D = decompose(1000)
print D
In [ ]:
recompose(D)  # This should recover 1000.
In [ ]:
recompose({2:1, 3:1, 5:1, 7:1})  # What will this give?
In [ ]:
# Use this space to experiment with decompose and recompose.

Exercises

  1. Create the list [1,100,2,99,3,98,4,97,...,50,51] with as few list commands as you can.

  2. If you try the commands x = 7, y = 11, then x,y = y,x, what do you expect happens with id(x) and id(y) along the way?

  3. How might you adapt the decompose function to work with all integers (positive and negative)? Note that zero does not have a prime decomposition, but negative numbers have an associated sign.

  4. Write a function multiply(A,B), in which the parameters A and B are prime decomposition dictionaries and the output is the prime decomposition of their product.

  5. Write a function divides(A,B), in which the parameters A and B are prime decomposition dictionaries and the output is a boolean: True if A divides B and false otherwise.

  6. The radical of a positive integer N is the positive integer whose prime factors are the same as N, but in which every prime occurs with exponent 1. For example, $rad(500) = 2 \cdot 5 = 10$. Write a function radical(N) which computes the radical of N. You can use the decompose(N) function along the way.

In [ ]:
# Use this space for the exercises.

Multiplicative functions

A multiplicative function is a function $f(n)$ which takes positive integer input $n$, and which satisfies $f(1) = 1$ and $f(ab) = f(a) f(b)$ whenever $a$ and $b$ are coprime. A good example is the divisor-sum function, implemented below.

In [ ]:
def divisor_sum(n):
    S = 0 # Start the sum at zero.
    for d in range(1,n+1):  # potential divisors between 1 and n.
        if n%d == 0:
            S = S + d
    return S
In [ ]:
divisor_sum(100)  # The sum 1 + 2 + 4 + 5 + 10 + 20 + 25 + 50 + 100
In [ ]:
%timeit divisor_sum(730) # Let's see how quickly this runs.

A perfect number is a positive integer which equals the sum of its proper factors (its positive factors, not including itself). Thus a number $n$ is perfect if its divisor sum equals $2n$. This can be implemented in a very short function.

In [ ]:
def is_perfect(n):
    return divisor_sum(n) == 2*n
In [ ]:
is_perfect(10)
In [ ]:
is_perfect(28)

Let's find the perfect numbers up to 10000. It might take a few seconds.

In [ ]:
for j in range(1,10000):
    if is_perfect(j):
        print "%d is perfect!"%(j)

Multiplicative functions like the divisor sum function can be computed via prime decomposition. Indeed, if $f$ is a multiplicative function, and $$n = 2^{e_2} 3^{e_3} 5^{e_5} \cdots,$$ then the value $f(n)$ satisfies $$f(n) = f(2^{e_2}) \cdot f(3^{e_3}) \cdot f(5^{e_5}) \cdots.$$

So if we can compute the values of $f$ on prime powers, we can compute the values of $f$ for all positive integers.

The following function computes the divisor sum function, for a prime power $p^e$.

In [ ]:
def divisor_sum_pp(p,e):  # pp stands for prime power
    '''
    Computes the divisor sum of the prime power p**e,
    when p is prime and e is a positive integer.
    This is just 1 + p^2 + p^3 + ... + p^e,
    simplified using a geometric series formula.
    '''
    return (p**(e+1) - 1) / (p - 1)
In [ ]:
divisor_sum_pp(2,3)  # Should equal 1 + 2 + 4 + 8
In [ ]:
divisor_sum_pp(3,1)  # Should equal 1 + 3

Now let's re-implement the divisor sum function, using prime decomposition and the divisor_sum_pp function for prime powers.

In [ ]:
def divisor_sum(n):
    '''
    Computes the sum of the positive divisors of a 
    positive integer n.
    '''
    D = decompose(n)  # We require the decompose function from before!
    result = 1
    for p in D.keys():
        result = result * divisor_sum_pp(p,D[p])
    return result
    
In [ ]:
divisor_sum(15)
In [ ]:
% timeit(divisor_sum(730)) # this probably runs faster than the previous version.

There are a lot of interesting multiplicative functions. We could implement each one by a two-step process as above: implementing the function for prime powers, then defining a version for positive integers by using the decompose function. But there's a shortcut for the second step, which brings in a very cool aspect of Python.

In Python, functions are Python objects.

In [ ]:
type(divisor_sum_pp) # Every object has a type.
In [ ]:
id(divisor_sum_pp) # Yes, every object gets an ID number.

Since functions are Python objects, it is possible to define a function which takes a function as input and outputs a function too! You can pass a function as an input parameter to another function, just as if it were any other variable. You can output a function with the return keyword, just as if it were another variable. And you can define a new function within the scope of a function!

Here's a basic example as a warmup.

In [ ]:
def addone(x):  # Let's make a simple function.
    return x + 1  # It's not a very interesting function, is it.
In [ ]:
addone(10) # Predict the result.
In [ ]:
def do_twice(f):
    '''
    If a function f is input, then the output is the function
    "f composed with f."
    '''
    def ff(x):  # Defines a new function ff!
        return f(f(x)) # This is what ff does.
    
    return ff
In [ ]:
addtwo = do_twice(addone)  # addtwo is a function!
In [ ]:
addtwo(10)  # What is the result?

Now we exploit this function-as-object approach to create a Python function called mult_function. Given a function f_pp(p,e), the function mult_function outputs the multiplicative function which coincides with f_pp on prime powers. In other words, if f = mult_function(f_pp), then f(p**e) will equal f_pp(p,e).

In [ ]:
def mult_function(f_pp):
    '''
    When a function f_pp(p,e) of two arguments is input,
    this outputs a multiplicative function obtained from f_pp
    via prime decomposition.
    '''
    def f(n):
        D = decompose(n)
        result = 1
        for p in D:
            result = result * f_pp(p, D[p])
        return result
    
    return f

Let's see how this works for the divisor-counting function. This is the function $\sigma_0(n)$ whose value is the number of positive divisors of $n$. For prime powers, it is easy to count divisors, $$\sigma_0(p^e) = e + 1.$$

In [ ]:
def sigma0_pp(p,e):
    return e+1

Since the divisor-counting function is multiplicative, we can implement it by applying mult_function to sigma0_pp.

In [ ]:
sigma0 = mult_function(sigma0_pp)
In [ ]:
sigma0(100)  # How many divisors does 100 have?

Exercises

  1. A positive integer $n$ is called deficient/perfect/abundant according to whether the sum of its proper divisors is less than/equal to/greater than $n$ itself. Among the numbers up to 10000, how many are deficient, perfect, and abundant?

  2. If $f(n)$ is a function with natural number input and real output, define $F(n)$ to be the function given by the formula $F(n) = \sum_{i=0}^n f(i)$. Create a function sumfun(f) which takes as input a function f and outputs the function F as described above.

  3. Consider the function $f(n)$ which counts the number of positive divisors of $n$ which are not divisible by 4. Verify that this is a multiplicative function, and implement it using mult_function.

  4. Write a function foursquare(n) which counts the number of ways that a positive integer n can be expressed as a*a + b*b + c*c + d*d for integers a, b, c, d. Hint: loop the variables through integers between $-\sqrt{n}$ and $\sqrt{n}$. Compare the values of foursquare(n) to the multiplicative function in the previous problem.

  5. A positive integer is "square-free" if it has no square factors besides 1. The Mobius function $\mu(n)$ is defined by $\mu(n) = 0$ if $n$ is not square-free, and otherwise $\mu(n) = 1$ or $\mu(n) = -1$ according to whether $n$ has an even or odd number of prime factors. Verify that the Mobius function is multiplicative and implement it. Try to reproduce the graph of the Mertens function $M(n)$ as described at Wikipedia's article on the Mertens conjecture. (See the previous Python notebook for an introduction to matplotlib for creating graphs.)

In [ ]:
#  Use this space to work on the exercises.