Conversion between numerical base systems

by

Andreas Putz

I created ths notebook after a tricky job interview question:

"How would you convert a integer in the base 10 system to the hexadecimal system ?"

So, my formal programming training has been a while, and while this is a textbook question, I have not pondered it for a decade or more. My answer was some tentative looping and some hint that the modulo operator might be helpful. So after wondering about this for a while, I decided to put some thought into it. And here it is.

Note: This is not meant as a rigorous scientific acrticle, but a short treatise on the topic. Divison is a tricky IEEE operation for non-symbolic calculations. There are a lot of pitfalls for a naive implementation.

In [1]:
import IPython
from IPython.display import HTML, display

import math
from decimal import Decimal

import tabulate # conda install tabulate

Some Theory (Loosely based on "Springer, Mathematische Formeln")

The position system

Each real number $x$ can be written in the following form:

$$ x = x_m B^m + x_{m-1} B^{m-1} + x_0 B^0 + x_{-1} B^{-1} + \dots = (x_m x_{m-1}\dots x_1 x_0.x_{-1}\dots)_B $$

where the integer $B>1$ is called the base and the digit $x_i \in \{0,\dots,B-1\}$. If $B>10$, the the decimal numbering is typically extended by adding alphabetic characters.

Naming convention diverge from here on in the literature I have found, so for the sake of this work we will define the following:

  • The integer part of $x$ is defined as $(x_m x_{m-1}\dots x_1 x_0)_B$
  • The fractional part of $x$ is defined as $(x_{-1}\dots)_B$

Division operators in Python 3

Classic or True Division: x/y

Python 3 changed the definition of this operation. It now always returns a float and tries to approximate the true divison result according to IEEE specifications.

In [2]:
tmp = 4/2;print('Integer divison:\t',tmp, ', result type: ', type(tmp))
tmp = 4/3;print('Integer divison:\t',tmp, ', result type: ', type(tmp))
tmp = 5/3;print('Integer divison:\t',tmp, ', result type: ', type(tmp))
tmp = -4/3;print('Integer divison:\t',tmp, ', result type: ', type(tmp))
tmp = -5/3;print('Integer divison:\t',tmp, ', result type: ', type(tmp))
tmp = 4.1/3.7;print('Integer divison:\t',tmp, ', result type: ', type(tmp))
Integer divison:	 2.0 , result type:  <class 'float'>
Integer divison:	 1.3333333333333333 , result type:  <class 'float'>
Integer divison:	 1.6666666666666667 , result type:  <class 'float'>
Integer divison:	 -1.3333333333333333 , result type:  <class 'float'>
Integer divison:	 -1.6666666666666667 , result type:  <class 'float'>
Integer divison:	 1.108108108108108 , result type:  <class 'float'>

Floor Division: x//y

Floor typically denotes a round down operation, i.e. it rounds to the biggest integer, which is smaller or equal than the number itself. Hence we get a different result for positive and neagite numbers as shown inthe example below. This will prove to be annoing for any conversion algorythm relying on this operation.

In [3]:
tmp = 4//2;print('Floor divison:\t',tmp, ', result type: ', type(tmp))
tmp = 4//3;print('Floor divison:\t',tmp, ', result type: ', type(tmp))
tmp = 5//3;print('Floor divison:\t',tmp, ', result type: ', type(tmp))
tmp = -4//3;print('Floor divison:\t',tmp, ', result type: ', type(tmp))
tmp = -5//3;print('Floor divison:\t',tmp, ', result type: ', type(tmp))
tmp = 4.1//3.7;print('Floor divison:\t',tmp, ', result type: ', type(tmp))
Floor divison:	 2 , result type:  <class 'int'>
Floor divison:	 1 , result type:  <class 'int'>
Floor divison:	 1 , result type:  <class 'int'>
Floor divison:	 -2 , result type:  <class 'int'>
Floor divison:	 -2 , result type:  <class 'int'>
Floor divison:	 1.0 , result type:  <class 'float'>

The modulo operation: x%y

Now this is the tricky one. Inaccurately speaking, it is the remainder of the true division. Wikipedia has a better description, have fun ...

In [4]:
url = 'https://en.wikipedia.org/wiki/Modulo_operation'
iframe = '<iframe src=' + url + ' width=100% height=350></iframe>'
IPython.display.HTML(iframe)
Out[4]:
In [5]:
tmp = 4%2;print('Modulo divison:\t',tmp, ', result type: ', type(tmp))
tmp = 4%3;print('Modulo divison:\t',tmp, ', result type: ', type(tmp))
tmp = 5%3;print('Modulo divison:\t',tmp, ', result type: ', type(tmp))
tmp = -4%3;print('Modulo divison:\t',tmp, ', result type: ', type(tmp))
tmp = -5%3;print('Modulo divison:\t',tmp, ', result type: ', type(tmp))
tmp = 4.1%3.7;print('Modulo divison:\t',tmp, ', result type: ', type(tmp))
Modulo divison:	 0 , result type:  <class 'int'>
Modulo divison:	 1 , result type:  <class 'int'>
Modulo divison:	 2 , result type:  <class 'int'>
Modulo divison:	 2 , result type:  <class 'int'>
Modulo divison:	 1 , result type:  <class 'int'>
Modulo divison:	 0.39999999999999947 , result type:  <class 'float'>

divmod

In [6]:
for a,b in zip((4,4,5,-4,-5,4.1),(2,3,3,3,3,3.7)):
    print('a='+str(a),'\t,b='+str(b),',\tdivmod(a,b)='+str(divmod(a,b)))
a=4 	,b=2 ,	divmod(a,b)=(2, 0)
a=4 	,b=3 ,	divmod(a,b)=(1, 1)
a=5 	,b=3 ,	divmod(a,b)=(1, 2)
a=-4 	,b=3 ,	divmod(a,b)=(-2, 2)
a=-5 	,b=3 ,	divmod(a,b)=(-2, 1)
a=4.1 	,b=3.7 ,	divmod(a,b)=(1.0, 0.39999999999999947)

So it looks as if divmod sotres the float divison and modul division in a tuple.

Summary

And here we have it alltogether. I also got bored of the six lines, wo here is a little loop which does this a lot more elegantly

In [7]:
mytable=[]
mytable.append(('a','b','a/b','a//b','a%b','divmod(a,b)'))
for a,b in zip((4,4,5,-4,-5,4.1),(2,3,3,3,3,3.7)):
    mytable.append((a,b,a/b,a//b,a%b,divmod(a,b)))

display(HTML(tabulate.tabulate(mytable, tablefmt='html')))
a b a/b a//ba%b divmod(a,b)
4 2 2.0 2 0 (2, 0)
4 3 1.3333333333333333 1 1 (1, 1)
5 3 1.6666666666666667 1 2 (1, 2)
-4 3 -1.3333333333333333-2 2 (-2, 2)
-5 3 -1.6666666666666667-2 1 (-2, 1)
4.13.71.108108108108108 1.0 0.39999999999999947(1.0, 0.39999999999999947)

Operations on floats

This is actually more tricky than it sounds, as rounding errors and number finite number representaions start to play a role.

In [8]:
mytable=[]
mytable.append(('a','int(a)','round(a)','floor(a)','modf(a)','a%1','Decimal Magic','String Conversion'))
for a in (4,4.1,4.5,4.6,-4.1,-4.5,-4.6):
    if "." in str(a):
        number_dec = str(a-int(a)).split('.')[1]
    else:
        number_dec = 0
    mytable.append((a,int(a),round(a),math.floor(a),math.modf(a),a%1,Decimal.from_float(a)%1,number_dec))

display(HTML(tabulate.tabulate(mytable, tablefmt='html')))
a int(a)round(a)floor(a)modf(a) a%1 Decimal Magic String Conversion
4 4 4 4 (0.0, 4.0) 0 0 0
4.1 4 4 4 (0.09999999999999964, 4.0) 0.099999999999999640.09999999999999964472863211995 09999999999999964
4.5 4 4 4 (0.5, 4.0) 0.5 0.5 5
4.6 4 5 4 (0.5999999999999996, 4.0) 0.5999999999999996 0.5999999999999996447286321199 5999999999999996
-4.1-4 -4 -5 (-0.09999999999999964, -4.0)0.9000000000000004 -0.0999999999999996447286321199509999999999999964
-4.5-4 -4 -5 (-0.5, -4.0) 0.5 -0.5 5
-4.6-4 -5 -5 (-0.5999999999999996, -4.0) 0.40000000000000036-0.5999999999999996447286321199 5999999999999996

Translation Algorithms

As the division and rounding algorithm behave slghtly differently for positive and negative numbers, we will treat only positive numbers. If we encounter a negative number, we will multiply by $-1$, apply the conversion algorithm, and then multiply by $-1$ again.

($B\to 10$)

Straigth forward, based on the equation above.

$(10\to B)$

Algorithm (Base Conversion)

Let $X=(y_m y_{m-1}\dots y_0.z_{-1}z_{-2}\dots)_{10}$ be a positive real number in the decimal system. We will treat the integer part $Y=(y_m y_{m-1}\dots y_0)$ and fractional part $Z=(z_{-1}z_{-2}\dots)$ individually.

Problem: Find \begin{align*} \tilde X=(\tilde y_n \tilde y_{n-1}\dots \tilde y_0.\tilde z_{-1} \tilde z_{-2}\dots)_{B} \text{ such that } X_{10}=X_B \end{align*}

**Algorithm (Integer Part Y)**

**Initialize**: $\tilde Y_0 = Y$ , $i=0$
**repeat**
\begin{align*} \tilde y_i &= \tilde Y_i \mod B \\ \tilde Y_{i+1} &=\tilde Y_{i}/B - \tilde y_i/B \\ i &=i+1 \end{align*} **until** $\tilde Y_i=0$
**Algorithm (Fractional Part Z)**

**Initialize**: $\tilde Z_0 = Z$ , $i=0$
**repeat**
\begin{align*} \tilde z_i &= \text{integer}(\tilde Z_i * B) \\ \tilde Z_{i+1} &=\tilde Z_{i}*B - \tilde y_i \\ i &=i+1 \end{align*} **until** $\tilde Z_i$ is whole or number of relevant digits is reached.

Implementation

Function to convert integer part

In [9]:
def base10_to_newbase_integer(number=5326, base=16):
    '''
    Converts a integer into a list of digits corresponding to a new base.
    
    Parameters
    ----------
    number : int (5326)
        Base 10 input integer
    base   : int (16)
        New base
        
    Yields
    ------
    list
        List of digits in the new base.
    
    Notes
    -----
    I did not add the usual convention to replace the digits above 9 with traditional alphabet notation. 
    This is work for another day.
    
    '''
    tmp = []
    factor = 1
    if number < 0:
        factor = -1
        number = -number
    
    while True:
        if number < base:
            tmp.append(number)
            break
        else:
            number, remainder = divmod(number,base)
            tmp.append(int(remainder))
    tmp.reverse()
    tmp[0] *= factor
    return tmp

A few testruns:

In [10]:
mytable = []
mytable.append(('a','my converter (B=8)','oct(8)','my converter (B=16)','oct(16)'))
for a in [12345,-12345,7,8,15,16]:
    mytable.append((a
                    ,base10_to_newbase_integer(number=a,base=8),oct(a)
                    ,base10_to_newbase_integer(number=a,base=16),hex(a)))
display(HTML(tabulate.tabulate(mytable, tablefmt='html')))
a my converter (B=8)oct(8) my converter (B=16)oct(16)
12345 [3, 0, 0, 7, 1] 0o30071 [3, 0, 3, 9] 0x3039
-12345[-3, 0, 0, 7, 1] -0o30071[-3, 0, 3, 9] -0x3039
7 [7] 0o7 [7] 0x7
8 [1, 0] 0o10 [8] 0x8
15 [1, 7] 0o17 [15] 0xf
16 [2, 0] 0o20 [1, 0] 0x10

Function to convert fractional part

In [11]:
def base10_to_newbase_fraction(number=0.6789, base=16,ndigits=5):
    '''
    Converts the fractional part of a irrational number from base 10 to base B
    
    Parameters
    ----------
    number : float (0.6789)
        Base 10 float 0<= number < 1
    base   : int (16)
        New base
    ndigits : int (5)
        Number of significant digits
        
    Yields
    ------
    list
        List of digits in the new base.
    
    Notes
    -----
    I did not add the usual convention to replace the digits above 9 with traditional alphabet notation. 
    This is work for another day.
    
    I alos truncate at the maximum number of digits, not round.
    
    '''
    tmp = []
    digits = 0

    while True:
        number, integer = math.modf(number*base)
        digits += 1
        if number == 0 or digits > ndigits:
            break
        else:
            tmp.append(int(integer))
    return tmp
In [12]:
mytable = []
mytable.append(('a','my converter (B=8)','my converter (B=16)'))
for a in [0.6789,-0.6789,0.,0.1,0.2]:
    mytable.append((a
                    ,base10_to_newbase_fraction(number=a,base=8)
                    ,base10_to_newbase_fraction(number=a,base=16)))
display(HTML(tabulate.tabulate(mytable, tablefmt='html')))
a my converter (B=8) my converter (B=16)
0.6789 [5, 3, 3, 4, 6] [10, 13, 12, 12, 6]
-0.6789[-5, -3, -3, -4, -6][-10, -13, -12, -12, -6]
0.0 [] []
0.1 [0, 6, 3, 1, 4] [1, 9, 9, 9, 9]
0.2 [1, 4, 6, 3, 1] [3, 3, 3, 3, 3]

Now full float conversion:

In [13]:
def base10_to_newbase(number=12345.6789, base=16,ndigits=5):
    '''
    Converts the a  float base 10 to base B
    
    Parameters
    ----------
    number : float (12345.6789)
        Base 10 float 
    base   : int (16)
        New base
    ndigits : int (5)
        Number of significant fractional digits
        
    Yields
    ------
    integer
        List of integer digits in the new base.
    fractional
        List of fractional digits in the new base.
        
    
    Notes
    -----
    I did not add the usual convention to replace the digits above 9 with traditional alphabet notation. 
    This is work for another day.
    
    I alos truncate at the maximum number of digits, not round.
    
    '''
    fractional, integer = math.modf(number)
    
    return (base10_to_newbase_integer(number=integer,base=base),
            base10_to_newbase_fraction(abs(fractional),base=base,ndigits=ndigits))
In [14]:
mytable = []
mytable.append(('a','my converter (B=8)','my converter (B=16)'))
for a in [12345,0.6789,12345.6789,-12345.6789,]:
    mytable.append((a
                    ,base10_to_newbase(number=a,base=8)
                    ,base10_to_newbase(number=a,base=16)))
display(HTML(tabulate.tabulate(mytable, tablefmt='html')))
a my converter (B=8) my converter (B=16)
12345 ([3.0, 0, 0, 7, 1], []) ([3.0, 0, 3, 9], [])
0.6789 ([0.0], [5, 3, 3, 4, 6]) ([0.0], [10, 13, 12, 12, 6])
12345.6789 ([3.0, 0, 0, 7, 1], [5, 3, 3, 4, 6]) ([3.0, 0, 3, 9], [10, 13, 12, 12, 6])
-12345.6789([-3.0, 0, 0, 7, 1], [5, 3, 3, 4, 6])([-3.0, 0, 3, 9], [10, 13, 12, 12, 6])

The results of the oct conversion are identical to the textbook solution.