Note: Click on "Kernel" > "Restart Kernel and Clear All Outputs" in JupyterLab before reading this notebook to reset its output. If you cannot run this file on your machine, you may want to open it in the cloud.
In this second part of the chapter, we look at the float type in detail. It is probably the most commonly used numeric type in all of data science, even across programming languages.
The float Type

As we have seen before, some assumptions need to be made as to how the 0s and 1s in a computer's memory are translated into numbers. This process becomes a lot more involved when we go beyond integers and model real numbers (i.e., the set ℝ) with possibly infinitely many digits to the right of the decimal point, like 1.23.
The Institute of Electrical and Electronics Engineers (IEEE, pronounced "eye-triple-E") is one of the important professional associations when it comes to standardizing all kinds of aspects regarding the implementation of soft- and hardware.
The IEEE 754 standard defines the so-called floating-point arithmetic that is commonly used today by all major programming languages. The standard not only defines how the 0s and 1s are organized in memory but also, for example, how values are to be rounded, what happens in exceptional cases like divisions by zero, or what is a zero value in the first place.
In Python, the simplest way to create a float object is to use a literal notation with a dot . in it.
b = 42.0
id(b)
139923238853936
type(b)
float
b
42.0
As with int literals, we may use underscores _ to make longer float literals easier to read.
0.123_456_789
0.123456789
42.
42.0
float(42)
42.0
float("42")
42.0
Leading and trailing whitespace is ignored ...
float(" 42.87 ")
42.87
... but not whitespace in between.
float("42. 87")
--------------------------------------------------------------------------- ValueError Traceback (most recent call last) Cell In[10], line 1 ----> 1 float("42. 87") ValueError: could not convert string to float: '42. 87'
float objects are implicitly created as the result of dividing one int object by another with the division operator /.
1 / 3
0.3333333333333333
In general, if we combine float and int objects in arithmetic operations, we always end up with a float object: Python uses the "broader" representation.
40.0 + 2
42.0
21 * 2.0
42.0
float objects may also be created with the scientific literal notation: we use the symbol e to indicate powers of 10, so 1.23 * 10^0 translates into 1.23e0.
1.23e0
1.23
Syntactically, e needs a float or int literal on its left and an int literal on its right, both without a space in between. Otherwise, we get a SyntaxError.
1.23 e0
Cell In[15], line 1 1.23 e0 ^ SyntaxError: invalid syntax
1.23e 0
Cell In[16], line 1 1.23e 0 ^ SyntaxError: invalid decimal literal
1.23e0.0
Cell In[17], line 1 1.23e0.0 ^ SyntaxError: invalid syntax
If we leave out the number to the left, Python raises a NameError as it unsuccessfully tries to look up a variable named e0.
e0
--------------------------------------------------------------------------- NameError Traceback (most recent call last) Cell In[18], line 1 ----> 1 e0 NameError: name 'e0' is not defined
So, to write 10^0 in Python, we need to think of it as 1 * 10^0 and write 1e0.
1e0
1.0
To express thousands of something (i.e., 10^3), we write 1e3.
1e3 # = thousands
1000.0
Similarly, to express, for example, milliseconds (i.e., 10^-3 s), we write 1e-3.
1e-3 # = milli
0.001
There are also three special values representing "not a number," called nan, and positive or negative infinity, called inf or -inf, that are created by passing the corresponding abbreviation as a str object to the float() built-in. These values may be used, for example, as the result of a mathematically undefined operation like division by zero or to model the value of a mathematical function as it goes to infinity.
float("nan") # also float("NaN")
nan
float("+inf") # also float("+infinity") or float("infinity")
inf
float("inf") # also float("+inf")
inf
float("-inf")
-inf
nan objects never compare equal to anything, not even to themselves. This happens in accordance with the IEEE 754 standard.
float("nan") == float("nan")
False
Another caveat is that any arithmetic involving a nan object results in nan. In other words, the addition below fails silently as no error is raised. As this also happens in accordance with the IEEE 754 standard, we need to be aware of it and check any data we work with for nan occurrences before doing any calculations.
42 + float("nan")
nan
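To guard against such silent nan propagation, we can screen our data with the isnan() function from the math module in the standard library. A minimal sketch, assuming the data comes as a list of float objects:

```python
import math

data = [40.0, float("nan"), 2.0]

# math.isnan() is the reliable test: because nan == nan is always
# False, an equality check cannot detect nan values.
clean = [x for x in data if not math.isnan(x)]

sum(clean)  # 42.0, whereas sum(data) evaluates to nan
```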
In contrast, as two values go to infinity, there is no such concept as a difference, and they compare equal.
float("inf") == float("inf")
True
Adding 42 to inf makes no difference.
float("inf") + 42
inf
float("inf") + 42 == float("inf")
True
We observe the same for multiplication ...
42 * float("inf")
inf
42 * float("inf") == float("inf")
True
... and even exponentiation!
float("inf") ** 42
inf
float("inf") ** 42 == float("inf")
True
Although absolute differences become meaningless as we approach infinity, signs are still respected.
-42 * float("-inf")
inf
-42 * float("-inf") == float("inf")
True
As a caveat, adding infinities of different signs is an undefined operation in math and results in a nan object. So, if we (accidentally or unknowingly) do this on a real dataset, we do not see any error messages, and our program may continue to run with meaningless results! This is another example of a piece of code failing silently.
float("inf") + float("-inf")
nan
float("inf") - float("inf")
nan
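As with nan, the math module lets us detect infinite values explicitly instead of letting them propagate; a small sketch using only isinf() and isnan() from the standard library:

```python
import math

# The mathematically undefined sum silently became nan.
result = float("inf") + float("-inf")

math.isnan(result)             # True
math.isinf(float("inf") + 42)  # True
```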
float objects are inherently imprecise, and there is nothing we can do about it! In particular, arithmetic operations with two float objects may result in "weird" rounding "errors" that are strictly deterministic and occur in accordance with the IEEE 754 standard.
For example, let's add 1 to 1e15 and 1e16, respectively. In the latter case, the 1 somehow gets "lost."
1e15 + 1
1000000000000001.0
1e16 + 1
1e+16
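The reason is the spacing between adjacent float values, which grows with their magnitude. The ulp() function in the math module (available since Python 3.9) returns the gap between a float and the next representable one:

```python
import math

# Around 1e15, adjacent floats are only 0.125 apart,
# so adding 1 still lands on a representable value.
math.ulp(1e15)  # 0.125

# Around 1e16, the gap is already 2.0,
# so adding 1 gets rounded back to 1e16.
math.ulp(1e16)  # 2.0
```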
Interactions between sufficiently large and small float objects are not the only source of imprecision.
from math import sqrt
sqrt(2) ** 2
2.0000000000000004
0.1 + 0.2
0.30000000000000004
This may become a problem if we rely on equality checks in our programs.
sqrt(2) ** 2 == 2
False
0.1 + 0.2 == 0.3
False
A popular workaround is to compare the absolute difference between the two numbers to be checked for equality against a pre-defined threshold sufficiently close to 0, for example, 1e-15.
threshold = 1e-15
abs((sqrt(2) ** 2) - 2) < threshold
True
abs((0.1 + 0.2) - 0.3) < threshold
True
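The math module ships this workaround as the isclose() function, which by default uses a relative tolerance of 1e-09 (an absolute tolerance may be added via the abs_tol parameter) instead of a fixed threshold:

```python
from math import isclose, sqrt

isclose(sqrt(2) ** 2, 2)  # True
isclose(0.1 + 0.2, 0.3)   # True

# A relative tolerance also scales to large magnitudes,
# where a fixed threshold like 1e-15 would be too strict.
isclose(1e16 + 2, 1e16)   # True
```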
The built-in format() function allows us to show the significant digits of a float number as they exist in memory to arbitrary precision. To exemplify it, let's view a couple of float objects with 50 digits. This analysis reveals that almost no float number is precise! After 14 or 15 digits, "weird" things happen. As we see further below, the "random" digits ending the float numbers do not "physically" exist in memory! Rather, they are "calculated" by the format() function, which is forced to show 50 digits.
The format() function is different from the format() method on str objects introduced in the next chapter (cf., Chapter 6). Yet, both work with the so-called format specification mini-language: ".50f" is the instruction to show 50 digits of a float number.
format(0.1, ".50f")
'0.10000000000000000555111512312578270211815834045410'
format(0.2, ".50f")
'0.20000000000000001110223024625156540423631668090820'
format(0.3, ".50f")
'0.29999999999999998889776975374843459576368331909180'
format(1 / 3, ".50f")
'0.33333333333333331482961625624739099293947219848633'
The format() function does not round a float object in the mathematical sense! It just allows us to show an arbitrary number of the digits as stored in memory, and it does not change them either.

In contrast, the built-in round() function creates a new numeric object that is a rounded version of the one passed in as the argument. It follows the familiar rules of rounding, except that exact ties are resolved towards the nearest even digit ("round half to even"), as the IEEE 754 standard prescribes.
For example, let's round 1 / 3 to five decimals. The obtained value for roughly_a_third is also imprecise but different from the "exact" representation of 1 / 3 above.
roughly_a_third = round(1 / 3, 5)
roughly_a_third
0.33333
format(roughly_a_third, ".50f")
'0.33333000000000001517008740847813896834850311279297'
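The tie-breaking rule is worth seeing in action: round() rounds halves to the nearest even digit ("banker's rounding") rather than always rounding .5 up as taught in school, and binary imprecision may pre-empt the tie altogether:

```python
# Exact ties go to the nearest even integer.
round(0.5)   # 0
round(1.5)   # 2
round(2.5)   # 2

# 2.675 is stored as slightly less than 2.675,
# so there is no tie and it rounds down.
round(2.675, 2)  # 2.67
```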
Surprisingly, 0.125 and 0.25 appear to be precise, and equality comparison works without the threshold workaround: both are powers of 2 in disguise.
format(0.125, ".50f")
'0.12500000000000000000000000000000000000000000000000'
format(0.25, ".50f")
'0.25000000000000000000000000000000000000000000000000'
0.125 + 0.125 == 0.25
True
To understand these subtleties, we need to look at the binary representation of floats and review the basics of the IEEE 754 standard. On modern machines, floats are modeled in so-called double precision with 64 bits that are grouped as in the figure below: the first bit determines the sign (0 for plus, 1 for minus), the next 11 bits represent an exponent term, and the last 52 bits represent the actual significant digits, the so-called fraction part. The three groups are put together like so:

(-1)^sign * 1.fraction * 2^(exponent - 1023)
A 1. is implicitly prepended as the first digit, and both the fraction and the exponent are stored in base-2 representation (i.e., they are both interpreted like the integers above). As the exponent is consequently non-negative, between 0 and 2047 in decimal to be precise, the -1023 term, called the exponent bias, centers the entire 2^(exponent - 1023) factor around 1 and allows the period within the 1.fraction part to be shifted into either direction by the same amount. Floating-point numbers received their name as the period, formally called the radix point, "floats" along the significant digits. As an aside, an exponent of all 0s or all 1s is used to model the special values nan or inf.
As the standard defines the exponent part to come as a power of 2, we now see why 0.125 is a precise float: it can be represented as a power of 2, i.e., 0.125 = (-1)^0 * 1.0 * 2^(1020 - 1023) = 2^-3 = 1/8. In other words, the floating-point representation of the decimal 0.125 is 0, 1111111100 (i.e., 1020 in decimal), and 0 for the three binary groups, respectively.
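We can make the three bit groups visible with the struct module from the standard library: pack the float into its 8 raw bytes (big-endian, format ">d") and re-interpret them as a 64-bit unsigned integer. A small sketch; the helper name float_to_bits is our own:

```python
import struct

def float_to_bits(x):
    """Return the 64 bits of a float as a string of 0s and 1s."""
    (as_int,) = struct.unpack(">Q", struct.pack(">d", x))
    return f"{as_int:064b}"

bits = float_to_bits(0.125)

bits[0]             # '0'  -> sign: plus
int(bits[1:12], 2)  # 1020 -> exponent: 1020 - 1023 = -3
int(bits[12:], 2)   # 0    -> fraction: the implicit 1. carries everything
```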
The crucial fact for the data science practitioner to understand is that mapping the infinite set of the real numbers ℝ to a finite set of bits leads to the imprecisions shown above!
So, floats are usually good approximations of real numbers only in their first 14 or 15 digits. If more precision is required, we need to resort to other data types, such as Decimal or Fraction, as shown in the next two sections.
This blog post gives another neat and visual way to think of floats. It also explains why floats become worse approximations of the reals as their absolute values increase.
The Python documentation provides another good discussion of floats and the goodness of their approximations.
If we are interested in the exact bits behind a float object, we use the .hex() method, which returns a str object beginning with "0x1." followed by the fraction in hexadecimal notation and, separated by a "p", the exponent as an integer after the 1023 bias has been subtracted.
one_eighth = 1 / 8
one_eighth.hex()
'0x1.0000000000000p-3'
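The class method float.fromhex() inverts .hex(); because the hexadecimal notation is exact, the round trip is lossless:

```python
# The hexadecimal representation round-trips without any loss,
# even for inherently imprecise values like 0.1.
float.fromhex("0x1.0000000000000p-3")  # 0.125
float.fromhex((0.1).hex()) == 0.1      # True
```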
Also, the .as_integer_ratio() method returns the two smallest integers whose ratio is exactly the value stored for the float object.
one_eighth.as_integer_ratio()
(1, 8)
roughly_a_third.hex()
'0x1.555475a31a4bep-2'
roughly_a_third.as_integer_ratio()
(3002369727582815, 9007199254740992)
0.0 is also a precise float number; it is encoded with the special bit pattern of all 0s in all three groups.
zero = 0.0
zero.hex()
'0x0.0p+0'
zero.as_integer_ratio()
(0, 1)
As seen in Chapter 1, the .is_integer() method tells us if a float can be cast as an int object without any loss in precision.
roughly_a_third.is_integer()
False
one = roughly_a_third / roughly_a_third
one.is_integer()
True
As the exact implementation of floats may vary and depend on a particular Python installation, we can look up the .float_info attribute in the sys module in the standard library to check the details. Usually, this is not necessary.
import sys
sys.float_info
sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308, min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53, epsilon=2.220446049250313e-16, radix=2, rounds=1)