pint
¶This notebook goes with a blog post on the same subject.
Have you ever wished you could carry units around with your quantities — and have the computer figure out the best units and multipliers to use?
pint
is a nince, compact library for doing just this, handling all your dimensional analysis needs. It can also detect units from strings. We can define our own units, it knows about multipliers (kilo, mega, etc), and it even works with numpy
and pandas
.
Install pint
with pip
or conda
, e.g.
pip install pint
NB If you are running this on Google Colaboratory, you must uncomment these lines (delete the initial #
) and run this first:
#!pip install pint
#!pip install git+https://github.com/hgrecco/pint-pandas#egg=Pint-Pandas-0.1.dev0
To use it in its typical mode, we import the library then instantiate a UnitRegistry
object. The registry contains lots of physical units.
import pint
units = pint.UnitRegistry()
pint.__version__
'0.9'
thickness = 68 * units.m
thickness
In a Jupyter Notebook you see a 'pretty' version of the quantity. In the interpreter, you'll see something slightly different (the so-called repr
of the class):
>>> thickness
<Quantity(68, 'meter')>
We can get at the magnitude, the units, and the dimensionality of this quantity:
thickness.magnitude, thickness.units, thickness.dimensionality
(68, <Unit('meter')>, <UnitsContainer({'[length]': 1.0})>)
You can also use the following abbreviations for magnitude and units:
thickness.m, thickness.u
For printing, we can use Python's string formatting:
f'{thickness**2}'
'4624 meter ** 2'
But pint
extends the string formatting options to include special options for Quantity
objects. The most useful option is P
for 'pretty', but there's also L
for $\LaTeX$ and H
for HTML. Adding a ~
(tilde) before the option tells pint
to use unit abbreviations instead of the full names:
print(f'{thickness**2:P}')
print(f'{thickness**2:~P}')
print(f'{thickness**2:~L}')
print(f'{thickness**2:~H}')
4624 meter² 4624 m² 4624\ \mathrm{m}^{2} 4624 m<sup>2</sup>
If we multiply by a scalar, pint
produces the result you'd expect:
thickness * 2
Note that you must use units when you need them:
thickness + 10
# This is meant to produce an error...
--------------------------------------------------------------------------- DimensionalityError Traceback (most recent call last) <ipython-input-10-ecd15f6b9f4c> in <module> ----> 1 thickness + 10 2 3 # This is meant to produce an error... ~/anaconda3/envs/xlines/lib/python3.7/site-packages/pint/quantity.py in __add__(self, other) 752 return self.to_timedelta() + other 753 else: --> 754 return self._add_sub(other, operator.add) 755 756 __radd__ = __add__ ~/anaconda3/envs/xlines/lib/python3.7/site-packages/pint/quantity.py in wrapped(self, *args, **kwargs) 73 elif isinstance(other, list) and isinstance(other[0], type(self)): 74 return NotImplemented ---> 75 result = f(self, *args, **kwargs) 76 return result 77 return wrapped ~/anaconda3/envs/xlines/lib/python3.7/site-packages/pint/quantity.py in _add_sub(self, other, op) 663 _to_magnitude(other, self.force_ndarray)) 664 else: --> 665 raise DimensionalityError(self._units, 'dimensionless') 666 return self.__class__(magnitude, units) 667 DimensionalityError: Cannot convert from 'meter' to 'dimensionless'
Let's try defining an area of $60\ \mathrm{km}^2$, then multiplying it by our thickness. To make it more like a hydrocarbon volume, I'll also multiply by net:gross n2g
, porosity phi
, and saturation sat
, all of which are dimensionless:
area = 60 * units.km**2
n2g = 0.5 * units.dimensionless # Optional dimensionless 'units'...
phi = 0.2 # ... but you can just do this.
sat = 0.7
volume = area * thickness * n2g * phi * sat
volume
We can convert to something more compact:
volume.to_compact()
Or be completely explicit about the units and multipliers we want:
volume.to('m**3') # Or use m^3
The to_compact()
method can also take units, if you want to be more explicit; it applies multipliers automatically:
volume.to_compact('L')
Oil barrels are already defined (careful, they are abbreviated as oil_bbl
not bbl
— that's a 31.5 gallon barrel, about the same as a beer barrel).
volume.to_compact('oil_barrel')
If we use string formatting (see above), we can get pretty specific:
f"The volume is {volume.to_compact('oil_barrel'):~0.2fL}"
'The volume is 1.80\\ \\mathrm{Goil_bbl}'
pint
defines hundreads of units (here's the list), and it knows about tonnes of oil equivalent... but it doesn't know about barrels of oil equivalent (for more on conversion to BOE). So let's define a custom unit, using the USGS's conversion factor:
units.define('barrel_of_oil_equivalent = 6000 ft**3 = boe')
Let's suspend reality for a moment and imagine we now want to compute our gross rock volume in BOEs...
volume.to('boe')
volume.to_compact('boe')
pint
can also parse strings and attempt to convert them to Quantity
instances:
units('2.34 km')
This looks useful! Let's try something less nicely formatted.
units('2.34*10^3 km')
units('-12,000.ft')
units('3.2 m')
You can also use the Quantity
constructor, like this:
>>> qty = pint.Quantity
>>> qty('2.34 km')
2.34 kilometer
But the UnitRegistry
seems to do the same things and might be more convenient.
pint
with uncertainties
¶Conveniently, pint
works well with uncertainties
. Maybe I'll do an X lines on that package in the future. Install it with conda
or pip
, e.g.
pip install uncertainties
from uncertainties import ufloat
area = ufloat(64, 5) * units.km**2 # 64 +/- 5 km**2
(thickness * area).to('Goil_bbl')
pint
with numpy
¶pint
works fine with NumPy arrays:
import numpy as np
vp = np.array([2300, 2400, 2550, 3200]) * units.m/units.s
rho = np.array([2400, 2550, 2500, 2650]) * units.kg/units.m**3
z = vp * rho
z
For some reason, this sometimes doesn't render properly. But we can always do this:
print(z)
[5520000. 6120000. 6375000. 8480000.] kilogram / meter ** 2 / second
As expected, the magnitude of this quantity is just a NumPy array:
z.m
array([5520000., 6120000., 6375000., 8480000.])
pint
with pandas
¶Note that this functionality is fairly new and is still settling down. YMMV.
To use pint
(version 0.9 and later) with pandas
(version 0.24.2 works; 0.25.0 does not work at the time of writing), we must first install pintpandas
, which must be done from source; get the code from GitHub. Here's how I do it:
cd pint-pandas
python setup.py sdist
pip install dist/Pint-Pandas-0.1.dev0.tar.gz
You could also do:
pip install git+https://github.com/hgrecco/pint-pandas#egg=Pint-Pandas-0.1.dev0
Once you have done that, the following should evaluate to True
:
pint._HAS_PINTPANDAS
True
To use this integration, we pass special pint
data types to the pd.Series()
object:
import pandas as pd
df = pd.DataFrame({
"Vp": pd.Series(vp.m, dtype="pint[m/s]"),
"Vs": pd.Series([1200, 1200, 1250, 1300], dtype="pint[m/s]"),
"rho": pd.Series(rho.m, dtype="pint[kg/m**3]"),
})
df
Vp | Vs | rho | |
---|---|---|---|
0 | 2300.0 | 1200 | 2400.0 |
1 | 2400.0 | 1200 | 2550.0 |
2 | 2550.0 | 1250 | 2500.0 |
3 | 3200.0 | 1300 | 2650.0 |
import bruges as bg
df['E'] = bg.rockphysics.moduli.youngs(df.Vp, df.Vs, df.rho)
df.E
0 9075366233.766233 1 9792000000.0 2 10483220521.25506 3 12550276023.391813 Name: E, dtype: pint[kilogram / meter / second ** 2]
We can't convert the units of a whole Series
but we can do one:
df.loc[0, 'E'].to('GPa')
So to convert a whole series, we can use Series.apply()
:
df.E.apply(lambda x: x.to('GPa'))
0 9.075366233766236 gigapascal 1 9.792000000000003 gigapascal 2 10.483220521255063 gigapascal 3 12.550276023391817 gigapascal Name: E, dtype: object
We could subclass dataframes to tweak their _repr_html_()
method, which would allow us to make units show up in the Notebook representation of the dataframe...
class UnitDataFrame(pd.DataFrame):
def _repr_html_(self):
"""New repr for Jupyter Notebook."""
html = super()._repr_html_() # Get the old repr string.
units = [''] + [f"{dtype.units:~H}" for dtype in self.dtypes]
style = "text-align: right; color: gray;"
new = f'<tr style="{style}"><th>' + "</th><th>".join(units) + "</th></tr></thead>"
return html.replace('</thead>', new)
df = UnitDataFrame({
"Vp": pd.Series(vp.m, dtype="pint[m/s]"),
"Vs": pd.Series([1200, 1200, 1250, 1300], dtype="pint[m/s]"),
"rho": pd.Series(rho.m, dtype="pint[kg/m**3]"),
})
df
Vp | Vs | rho | |
---|---|---|---|
m/s | m/s | kg/m3 | |
0 | 2300.0 | 1200 | 2400.0 |
1 | 2400.0 | 1200 | 2550.0 |
2 | 2550.0 | 1250 | 2500.0 |
3 | 3200.0 | 1300 | 2650.0 |
Cute.
© Agile Scientific 2019, licensed CC-BY