This is one of the 100 recipes of the IPython Cookbook, the definitive guide to high-performance scientific computing and data science in Python.
import numpy as np
import tables as tb
Let's create a new empty HDF5 file.
f = tb.open_file('myfile.h5', 'w')
We create a new top-level group named "experiment1".
f.create_group('/', 'experiment1')
Let's also add some metadata to this group.
f.set_node_attr('/experiment1', 'date', '2014-09-01')
In this group, we create a 1000*1000 array named "array1".
x = np.random.rand(1000, 1000)
f.create_array('/experiment1', 'array1', x)
Finally, we need to close the file to commit the changes to disk.
f.close()
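Closing can also be left to a `with` block, which releases the file when the block exits. The following is a minimal sketch rather than part of the recipe: the scratch file path, the `demo` group, and the `values` array are hypothetical names, and a temporary directory is used so that `myfile.h5` is left untouched.

```python
import os
import tempfile

import numpy as np
import tables as tb

# Hypothetical scratch location; the recipe's own myfile.h5 is not touched.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'scratch.h5')

    # The file is closed when the with-block exits, even if an error occurs.
    with tb.open_file(path, 'w') as g:
        g.create_group('/', 'demo')
        g.create_array('/demo', 'values', np.arange(10))

    # isopen is falsy once the file handle has been closed.
    closed_after_block = not g.isopen

    # Reopen read-only to check that the data reached the disk.
    with tb.open_file(path, 'r') as g:
        values = g.root.demo.values[:]
```

With this pattern there is no separate `close()` call to forget.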
Now, let's reopen the file in read-only mode.
f = tb.open_file('myfile.h5', 'r')
We can retrieve an attribute by giving the group path and the attribute name.
f.get_node_attr('/experiment1', 'date')
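We can access any item in the file through Python attribute access, starting from f.root (not to be confused with the HDF5 node attributes used above for metadata). IPython's tab completion is incredibly useful in this respect when exploring a file interactively.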
y = f.root.experiment1.array1
type(y)
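Outside an interactive session, where tab completion is not available, the hierarchy can be listed programmatically instead. This sketch assumes a small scratch file with the same group and array names as in this recipe; the file path and the 3 × 3 array are illustrative only.

```python
import os
import tempfile

import numpy as np
import tables as tb

with tempfile.TemporaryDirectory() as tmpdir:
    # Hypothetical scratch file mirroring the layout used in this recipe.
    path = os.path.join(tmpdir, 'scratch.h5')
    with tb.open_file(path, 'w') as g:
        g.create_group('/', 'experiment1')
        g.create_array('/experiment1', 'array1', np.zeros((3, 3)))

    with tb.open_file(path, 'r') as g:
        # walk_nodes iterates over the groups and leaves below the given path;
        # _v_pathname is the absolute path of each node inside the file.
        paths = [node._v_pathname for node in g.walk_nodes('/')]

for p in paths:
    print(p)
```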
The array can be used like a NumPy array, but with an important distinction: it is stored on disk rather than in system memory. Performing a computation on it first loads the data into memory, so it is more efficient to read only the slices that are actually needed.
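To illustrate reading only part of an on-disk array, here is a small sketch. It assumes a scratch file and a 100 × 100 array (both hypothetical, chosen to keep the example fast), and it reuses the group and array names from this recipe.

```python
import os
import tempfile

import numpy as np
import tables as tb

x = np.random.rand(100, 100)

with tempfile.TemporaryDirectory() as tmpdir:
    # Hypothetical scratch file; the recipe's myfile.h5 is not touched.
    path = os.path.join(tmpdir, 'scratch.h5')
    with tb.open_file(path, 'w') as g:
        g.create_group('/', 'experiment1')
        g.create_array('/experiment1', 'array1', x)

    with tb.open_file(path, 'r') as g:
        y = g.root.experiment1.array1   # on-disk array node, nothing read yet
        node_shape = y.shape            # metadata only
        first_row = y[0, :]             # slicing returns an in-memory NumPy array
        block = y[:10, :10]             # a 10 x 10 corner of the array
        same = np.array_equal(first_row, x[0, :])
```

Each slice comes back as an ordinary `numpy.ndarray`, so it can be passed directly to NumPy functions.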
It is also possible to get a node from its absolute path, which is useful when this path is only known at runtime.
f.get_node('/experiment1/array1')
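When the path is assembled at runtime, the lookup might be wrapped as in the sketch below. The helper `read_leaf` and the scratch file are hypothetical and not part of PyTables or of this recipe; the sketch only relies on `get_node` and on `NoSuchNodeError`, which PyTables raises for a path that does not exist.

```python
import os
import tempfile

import numpy as np
import tables as tb


def read_leaf(h5file, group_name, leaf_name):
    """Hypothetical helper: return the leaf's data, or None if it is missing."""
    node_path = '/{}/{}'.format(group_name, leaf_name)
    try:
        node = h5file.get_node(node_path)
    except tb.NoSuchNodeError:
        return None
    return node[:]  # read the whole leaf into memory


with tempfile.TemporaryDirectory() as tmpdir:
    # Hypothetical scratch file mirroring the layout used in this recipe.
    path = os.path.join(tmpdir, 'scratch.h5')
    with tb.open_file(path, 'w') as g:
        g.create_group('/', 'experiment1')
        g.create_array('/experiment1', 'array1', np.ones((4, 4)))

    with tb.open_file(path, 'r') as g:
        found = read_leaf(g, 'experiment1', 'array1')
        missing = read_leaf(g, 'experiment1', 'no_such_array')
```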
f.close()
import os
os.remove('myfile.h5')