Conversion between numerical base systems

by

Andreas Putz

I created ths notebook after a tricky job interview question:

"How would you convert a integer in the base 10 system to the hexadecimal system ?"

So, my formal programming training has been a while, and while this is a textbook question, I have not pondered it for a decade or more. My answer was some tentative looping and some hint that the modulo operator might be helpful. So after wondering about this for a while, I decided to put some thought into it. And here it is.

Note: This is not meant as a rigorous scientific acrticle, but a short treatise on the topic. Divison is a tricky IEEE operation for non-symbolic calculations. There are a lot of pitfalls for a naive implementation.

In [1]:

import IPython
from IPython.display import HTML, display

import math
from decimal import Decimal

import tabulate # conda install tabulate

Some Theory (Loosely based on "Springer, Mathematische Formeln")¶

The position system¶

Each real number $x$ can be written in the following form:

$$ x = x_m B^m + x_{m-1} B^{m-1} + x_0 B^0 + x_{-1} B^{-1} + \dots = (x_m x_{m-1}\dots x_1 x_0.x_{-1}\dots)_B $$

where the integer $B>1$ is called the base and the digit $x_i \in \{0,\dots,B-1\}$. If $B>10$, the the decimal numbering is typically extended by adding alphabetic characters.

Naming convention diverge from here on in the literature I have found, so for the sake of this work we will define the following:

The integer part of $x$ is defined as $(x_m x_{m-1}\dots x_1 x_0)_B$
The fractional part of $x$ is defined as $(x_{-1}\dots)_B$

Division operators in Python 3¶

Classic or True Division: x/y¶

Python 3 changed the definition of this operation. It now always returns a float and tries to approximate the true divison result according to IEEE specifications.

In [2]:

tmp = 4/2;print('Integer divison:\t',tmp, ', result type: ', type(tmp))
tmp = 4/3;print('Integer divison:\t',tmp, ', result type: ', type(tmp))
tmp = 5/3;print('Integer divison:\t',tmp, ', result type: ', type(tmp))
tmp = -4/3;print('Integer divison:\t',tmp, ', result type: ', type(tmp))
tmp = -5/3;print('Integer divison:\t',tmp, ', result type: ', type(tmp))
tmp = 4.1/3.7;print('Integer divison:\t',tmp, ', result type: ', type(tmp))

Integer divison:	 2.0 , result type:  <class 'float'>
Integer divison:	 1.3333333333333333 , result type:  <class 'float'>
Integer divison:	 1.6666666666666667 , result type:  <class 'float'>
Integer divison:	 -1.3333333333333333 , result type:  <class 'float'>
Integer divison:	 -1.6666666666666667 , result type:  <class 'float'>
Integer divison:	 1.108108108108108 , result type:  <class 'float'>

Floor Division: x//y¶

Floor typically denotes a round down operation, i.e. it rounds to the biggest integer, which is smaller or equal than the number itself. Hence we get a different result for positive and neagite numbers as shown inthe example below. This will prove to be annoing for any conversion algorythm relying on this operation.

In [3]:

tmp = 4//2;print('Floor divison:\t',tmp, ', result type: ', type(tmp))
tmp = 4//3;print('Floor divison:\t',tmp, ', result type: ', type(tmp))
tmp = 5//3;print('Floor divison:\t',tmp, ', result type: ', type(tmp))
tmp = -4//3;print('Floor divison:\t',tmp, ', result type: ', type(tmp))
tmp = -5//3;print('Floor divison:\t',tmp, ', result type: ', type(tmp))
tmp = 4.1//3.7;print('Floor divison:\t',tmp, ', result type: ', type(tmp))

Floor divison:	 2 , result type:  <class 'int'>
Floor divison:	 1 , result type:  <class 'int'>
Floor divison:	 1 , result type:  <class 'int'>
Floor divison:	 -2 , result type:  <class 'int'>
Floor divison:	 -2 , result type:  <class 'int'>
Floor divison:	 1.0 , result type:  <class 'float'>

The modulo operation: x%y¶

Now this is the tricky one. Inaccurately speaking, it is the remainder of the true division. Wikipedia has a better description, have fun ...

In [4]:

url = 'https://en.wikipedia.org/wiki/Modulo_operation'
iframe = '<iframe src=' + url + ' width=100% height=350></iframe>'
IPython.display.HTML(iframe)

Out[4]:

In [5]:

tmp = 4%2;print('Modulo divison:\t',tmp, ', result type: ', type(tmp))
tmp = 4%3;print('Modulo divison:\t',tmp, ', result type: ', type(tmp))
tmp = 5%3;print('Modulo divison:\t',tmp, ', result type: ', type(tmp))
tmp = -4%3;print('Modulo divison:\t',tmp, ', result type: ', type(tmp))
tmp = -5%3;print('Modulo divison:\t',tmp, ', result type: ', type(tmp))
tmp = 4.1%3.7;print('Modulo divison:\t',tmp, ', result type: ', type(tmp))

Modulo divison:	 0 , result type:  <class 'int'>
Modulo divison:	 1 , result type:  <class 'int'>
Modulo divison:	 2 , result type:  <class 'int'>
Modulo divison:	 2 , result type:  <class 'int'>
Modulo divison:	 1 , result type:  <class 'int'>
Modulo divison:	 0.39999999999999947 , result type:  <class 'float'>

divmod¶

In [6]:

for a,b in zip((4,4,5,-4,-5,4.1),(2,3,3,3,3,3.7)):
    print('a='+str(a),'\t,b='+str(b),',\tdivmod(a,b)='+str(divmod(a,b)))

a=4 	,b=2 ,	divmod(a,b)=(2, 0)
a=4 	,b=3 ,	divmod(a,b)=(1, 1)
a=5 	,b=3 ,	divmod(a,b)=(1, 2)
a=-4 	,b=3 ,	divmod(a,b)=(-2, 2)
a=-5 	,b=3 ,	divmod(a,b)=(-2, 1)
a=4.1 	,b=3.7 ,	divmod(a,b)=(1.0, 0.39999999999999947)

So it looks as if divmod sotres the float divison and modul division in a tuple.

Summary¶

And here we have it alltogether. I also got bored of the six lines, wo here is a little loop which does this a lot more elegantly

In [7]:

mytable=[]
mytable.append(('a','b','a/b','a//b','a%b','divmod(a,b)'))
for a,b in zip((4,4,5,-4,-5,4.1),(2,3,3,3,3,3.7)):
    mytable.append((a,b,a/b,a//b,a%b,divmod(a,b)))

display(HTML(tabulate.tabulate(mytable, tablefmt='html')))

a	b	a/b	a//b	a%b	divmod(a,b)
4	2	2.0	2	0	(2, 0)
4	3	1.3333333333333333	1	1	(1, 1)
5	3	1.6666666666666667	1	2	(1, 2)
-4	3	-1.3333333333333333	-2	2	(-2, 2)
-5	3	-1.6666666666666667	-2	1	(-2, 1)
4.1	3.7	1.108108108108108	1.0	0.39999999999999947	(1.0, 0.39999999999999947)

Operations on floats¶

This is actually more tricky than it sounds, as rounding errors and number finite number representaions start to play a role.

In [8]:

mytable=[]
mytable.append(('a','int(a)','round(a)','floor(a)','modf(a)','a%1','Decimal Magic','String Conversion'))
for a in (4,4.1,4.5,4.6,-4.1,-4.5,-4.6):
    if "." in str(a):
        number_dec = str(a-int(a)).split('.')[1]
    else:
        number_dec = 0
    mytable.append((a,int(a),round(a),math.floor(a),math.modf(a),a%1,Decimal.from_float(a)%1,number_dec))

display(HTML(tabulate.tabulate(mytable, tablefmt='html')))

a	int(a)	round(a)	floor(a)	modf(a)	a%1	Decimal Magic	String Conversion
4	4	4	4	(0.0, 4.0)	0	0	0
4.1	4	4	4	(0.09999999999999964, 4.0)	0.09999999999999964	0.09999999999999964472863211995	09999999999999964
4.5	4	4	4	(0.5, 4.0)	0.5	0.5	5
4.6	4	5	4	(0.5999999999999996, 4.0)	0.5999999999999996	0.5999999999999996447286321199	5999999999999996
-4.1	-4	-4	-5	(-0.09999999999999964, -4.0)	0.9000000000000004	-0.09999999999999964472863211995	09999999999999964
-4.5	-4	-4	-5	(-0.5, -4.0)	0.5	-0.5	5
-4.6	-4	-5	-5	(-0.5999999999999996, -4.0)	0.40000000000000036	-0.5999999999999996447286321199	5999999999999996

Translation Algorithms¶

As the division and rounding algorithm behave slghtly differently for positive and negative numbers, we will treat only positive numbers. If we encounter a negative number, we will multiply by $-1$, apply the conversion algorithm, and then multiply by $-1$ again.

($B\to 10$)¶

Straigth forward, based on the equation above.

$(10\to B)$¶

Algorithm (Base Conversion)

Let $X=(y_m y_{m-1}\dots y_0.z_{-1}z_{-2}\dots)_{10}$ be a positive real number in the decimal system. We will treat the integer part $Y=(y_m y_{m-1}\dots y_0)$ and fractional part $Z=(z_{-1}z_{-2}\dots)$ individually.

Problem: Find \begin{align*} \tilde X=(\tilde y_n \tilde y_{n-1}\dots \tilde y_0.\tilde z_{-1} \tilde z_{-2}\dots)_{B} \text{ such that } X_{10}=X_B \end{align*}

**Algorithm (Integer Part Y)**

**Initialize**: $\tilde Y_0 = Y$ , $i=0$
**repeat**

\begin{align*}
\tilde y_i &= \tilde Y_i \mod B \\
\tilde Y_{i+1} &=\tilde Y_{i}/B - \tilde y_i/B \\
i &=i+1
\end{align*}

until $\tilde Y_i=0$

**Algorithm (Fractional Part Z)**

**Initialize**: $\tilde Z_0 = Z$ , $i=0$
**repeat**

\begin{align*}
\tilde z_i &= \text{integer}(\tilde Z_i * B) \\
\tilde Z_{i+1} &=\tilde Z_{i}*B - \tilde y_i \\
i &=i+1
\end{align*}

until $\tilde Z_i$ is whole or number of relevant digits is reached.

Implementation¶

Function to convert integer part¶

In [9]:

def base10_to_newbase_integer(number=5326, base=16):
    '''
    Converts a integer into a list of digits corresponding to a new base.
    
    Parameters
    ----------
    number : int (5326)
        Base 10 input integer
    base   : int (16)
        New base
        
    Yields
    ------
    list
        List of digits in the new base.
    
    Notes
    -----
    I did not add the usual convention to replace the digits above 9 with traditional alphabet notation. 
    This is work for another day.
    
    '''
    tmp = []
    factor = 1
    if number < 0:
        factor = -1
        number = -number
    
    while True:
        if number < base:
            tmp.append(number)
            break
        else:
            number, remainder = divmod(number,base)
            tmp.append(int(remainder))
    tmp.reverse()
    tmp[0] *= factor
    return tmp

A few testruns:

In [10]:

mytable = []
mytable.append(('a','my converter (B=8)','oct(8)','my converter (B=16)','oct(16)'))
for a in [12345,-12345,7,8,15,16]:
    mytable.append((a
                    ,base10_to_newbase_integer(number=a,base=8),oct(a)
                    ,base10_to_newbase_integer(number=a,base=16),hex(a)))
display(HTML(tabulate.tabulate(mytable, tablefmt='html')))

a	my converter (B=8)	oct(8)	my converter (B=16)	oct(16)
12345	[3, 0, 0, 7, 1]	0o30071	[3, 0, 3, 9]	0x3039
-12345	[-3, 0, 0, 7, 1]	-0o30071	[-3, 0, 3, 9]	-0x3039
7	[7]	0o7	[7]	0x7
8	[1, 0]	0o10	[8]	0x8
15	[1, 7]	0o17	[15]	0xf
16	[2, 0]	0o20	[1, 0]	0x10

Function to convert fractional part¶

In [11]:

def base10_to_newbase_fraction(number=0.6789, base=16,ndigits=5):
    '''
    Converts the fractional part of a irrational number from base 10 to base B
    
    Parameters
    ----------
    number : float (0.6789)
        Base 10 float 0<= number < 1
    base   : int (16)
        New base
    ndigits : int (5)
        Number of significant digits
        
    Yields
    ------
    list
        List of digits in the new base.
    
    Notes
    -----
    I did not add the usual convention to replace the digits above 9 with traditional alphabet notation. 
    This is work for another day.
    
    I alos truncate at the maximum number of digits, not round.
    
    '''
    tmp = []
    digits = 0

    while True:
        number, integer = math.modf(number*base)
        digits += 1
        if number == 0 or digits > ndigits:
            break
        else:
            tmp.append(int(integer))
    return tmp

In [12]:

mytable = []
mytable.append(('a','my converter (B=8)','my converter (B=16)'))
for a in [0.6789,-0.6789,0.,0.1,0.2]:
    mytable.append((a
                    ,base10_to_newbase_fraction(number=a,base=8)
                    ,base10_to_newbase_fraction(number=a,base=16)))
display(HTML(tabulate.tabulate(mytable, tablefmt='html')))

a	my converter (B=8)	my converter (B=16)
0.6789	[5, 3, 3, 4, 6]	[10, 13, 12, 12, 6]
-0.6789	[-5, -3, -3, -4, -6]	[-10, -13, -12, -12, -6]
0.0	[]	[]
0.1	[0, 6, 3, 1, 4]	[1, 9, 9, 9, 9]
0.2	[1, 4, 6, 3, 1]	[3, 3, 3, 3, 3]

Now full float conversion:¶

In [13]:

def base10_to_newbase(number=12345.6789, base=16,ndigits=5):
    '''
    Converts the a  float base 10 to base B
    
    Parameters
    ----------
    number : float (12345.6789)
        Base 10 float 
    base   : int (16)
        New base
    ndigits : int (5)
        Number of significant fractional digits
        
    Yields
    ------
    integer
        List of integer digits in the new base.
    fractional
        List of fractional digits in the new base.
        
    
    Notes
    -----
    I did not add the usual convention to replace the digits above 9 with traditional alphabet notation. 
    This is work for another day.
    
    I alos truncate at the maximum number of digits, not round.
    
    '''
    fractional, integer = math.modf(number)
    
    return (base10_to_newbase_integer(number=integer,base=base),
            base10_to_newbase_fraction(abs(fractional),base=base,ndigits=ndigits))

In [14]:

mytable = []
mytable.append(('a','my converter (B=8)','my converter (B=16)'))
for a in [12345,0.6789,12345.6789,-12345.6789,]:
    mytable.append((a
                    ,base10_to_newbase(number=a,base=8)
                    ,base10_to_newbase(number=a,base=16)))
display(HTML(tabulate.tabulate(mytable, tablefmt='html')))

a	my converter (B=8)	my converter (B=16)
12345	([3.0, 0, 0, 7, 1], [])	([3.0, 0, 3, 9], [])
0.6789	([0.0], [5, 3, 3, 4, 6])	([0.0], [10, 13, 12, 12, 6])
12345.6789	([3.0, 0, 0, 7, 1], [5, 3, 3, 4, 6])	([3.0, 0, 3, 9], [10, 13, 12, 12, 6])
-12345.6789	([-3.0, 0, 0, 7, 1], [5, 3, 3, 4, 6])	([-3.0, 0, 3, 9], [10, 13, 12, 12, 6])

The results of the oct conversion are identical to the textbook solution.