giotto-tda includes a set of plotting functions and class methods, powered by plotly. The library's plotting API is designed to facilitate the exploration of intermediate results in pipelines by harnessing the highly visual nature of topological signatures.
This notebook is a quick tutorial on how to use giotto-tda's plotting functionalities and unified plotting API. The plotting functions in gtda.mapper are not covered here, as they are somewhat tailored to the Mapper algorithm; see the dedicated tutorial.
If you are looking at a static version of this notebook and would like to run its contents, head over to GitHub and download the source.
The computational building blocks of giotto-tda are scikit-learn–style estimators. Typically, they are also transformers, i.e. they possess a transform and/or a fit_transform method which:
- takes as input an array-like object X which collects a certain number of "samples" of a given kind;
- returns a transformed array-like object Xt which collects a (potentially different) number of "samples" of a potentially different kind.
The basic philosophy of giotto-tda's class-level plotting API is to equip relevant transformers with plot methods taking two main arguments:
- an object such as Xt above (i.e. consistent with the outputs of transform or fit_transform);
- an integer index passed via the sample keyword and indicating which sample in Xt should be plotted.
In other words, <transformer>.plot(Xt, sample=i) will produce a plot of Xt[i] which is tailored to the nature of the samples in Xt.
plot methods in giotto-tda actually fall back to specialised functions which can be found in the plotting subpackage and which can be used directly instead. However, unless the additional degree of control is necessary, plot methods should be preferred as they often exploit class parameters and/or attributes (e.g. those computed during fit) to automatically fill some parameters in the corresponding functions.
Let's take the example of VietorisRipsPersistence, a transformer also covered in another notebook. Let's create the input collection X for this transformer as a collection of randomly generated point clouds, each containing 100 points positioned along two circles.

    import numpy as np
    np.random.seed(seed=42)

    from gtda.homology import VietorisRipsPersistence
    from sklearn.datasets import make_circles

    # make_circles returns a (points, labels) pair; keep only the points
    X = np.asarray([
        make_circles(100, factor=np.random.random())[0]
        for i in range(10)
    ])
Incidentally, samples in X can be plotted using gtda.plotting.plot_point_cloud:

    from gtda.plotting import plot_point_cloud

    i = 0
    plot_point_cloud(X[i])
Let us instantiate a VietorisRipsPersistence object and call the fit_transform method on X to obtain the transformed object Xt:

    VR = VietorisRipsPersistence()
    Xt = VR.fit_transform(X)
For any sample index i, Xt[i] is a two-dimensional array encoding the multi-scale topological information which can be extracted from the i-th point cloud X[i].
It is typically too difficult to get a quick idea of the interesting information contained in Xt[i] by looking at the array directly. This information is best displayed as a so-called "persistence diagram" in 2D. The plot method of our VietorisRipsPersistence instance achieves precisely this:

    VR.plot(Xt, sample=i)
In the case of VietorisRipsPersistence, plot is a thin wrapper around the function gtda.plotting.plot_diagram, so the same result could have been achieved by importing that function and calling it on Xt[i].
In the diagram, each point indicates a topological feature in the data which appears at a certain "birth" scale and remains present all the way up to a later "death" scale. A point's distance from the diagonal is directly proportional to the difference between the point's "death" and its "birth". Hence, this distance visually communicates how "persistent" the associated topological feature is. Topological features are partitioned by dimension using colors: above, features in dimension 0 are red while those in dimension 1 are green. In dimension 0, the diagram describes connectivity structure in the data in a very similar way to linkage clustering: we see three points along the vertical axis, which are in one-to-one correspondence with "merge" events in the sense of hierarchical clustering. In dimension 1, the diagram describes the presence of "independent" one-dimensional holes in the data: as expected, there are only two significant points, corresponding to the two "persistent" circles.
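The relationship between a point's position in the diagram and its persistence can be checked numerically. The sketch below uses a small hand-written array in place of a real Xt[i]: the values are invented for illustration, and the column layout (birth, death, homology dimension) is an assumption about the diagram format rather than something stated above.

```python
import numpy as np

# Hypothetical diagram: each row is (birth, death, homology dimension).
# Values are invented for illustration, not computed from data.
diagram = np.array([
    [0.0, 0.2, 0],
    [0.0, 0.5, 0],
    [0.3, 1.1, 1],
    [0.4, 0.45, 1],
])

births, deaths, dims = diagram[:, 0], diagram[:, 1], diagram[:, 2]

# Persistence (lifetime) of each feature.
persistence = deaths - births

# Euclidean distance from the point (birth, death) to the diagonal y = x.
distance_to_diagonal = persistence / np.sqrt(2)

# The most persistent feature in dimension 1.
mask = dims == 1
most_persistent_h1 = diagram[mask][np.argmax(persistence[mask])]
```

The factor 1/sqrt(2) is the constant of proportionality between persistence and distance from the diagonal, so ranking points by either quantity gives the same order.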
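giotto-tda transformers which have a plot method can also implement the two derived methods transform_plot and fit_transform_plot.
The transform_plot method takes two main arguments:
- an object such as X above (i.e. consistent with the inputs of transform);
- an integer index passed via the sample keyword.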
The logic of transform_plot can be roughly described as follows: first, the sample X[i] is transformed; then, the result is plotted using plot and returned. [More technically: we first create a trivial collection X_sing = [X[i]], which contains a single sample from X. Then, we compute Xt_sing = <transformer>.transform(X_sing). Assuming Xt_sing contains a single transformed sample, we call <transformer>.plot(Xt_sing, sample=0), and also return the transformed result.]
In the example of Section 1.2, we would do:

    VR = VietorisRipsPersistence()
    VR.fit(X)
    VR.transform_plot(X, sample=i);
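The transform_plot logic can also be sketched with a toy transformer. Everything in this block is hypothetical (the ToyTransformer class, its doubling transform and its print-based plot) and only mirrors the steps described in the text; it is not giotto-tda's actual implementation.

```python
class ToyTransformer:
    """Hypothetical stand-in for a transformer with a plot method."""

    def fit(self, X, y=None):
        # Nothing to learn in this toy example.
        return self

    def transform(self, X):
        # Toy transformation: double every value in every sample.
        return [[2 * v for v in sample] for sample in X]

    def plot(self, Xt, sample=0):
        # A real plot method would draw a figure; here we just print.
        print(f"plotting sample {sample}: {Xt[sample]}")

    def transform_plot(self, X, sample=0):
        X_sing = [X[sample]]              # trivial one-sample collection
        Xt_sing = self.transform(X_sing)  # transform only that sample
        self.plot(Xt_sing, sample=0)      # plot the single result
        return Xt_sing


X = [[1, 2], [3, 4], [5, 6]]
toy = ToyTransformer().fit(X)
result = toy.transform_plot(X, sample=1)
```

Only the selected sample is transformed, which is what makes this pattern cheap when the full collection is large.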
The fit_transform_plot method is equivalent to first fitting the transformer using X (and, optionally, a target variable y), and then calling transform_plot on X and a given sample index.
The workflow in the example of Section 1.2 can be simplified even further, turning the entire process into a simple one-liner:

    VR = VietorisRipsPersistence()
    VR.fit_transform_plot(X, sample=i);
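The equivalence between the one-liner and the two-step workflow can be illustrated with a toy, stateful transformer. The CenteringToy class below is hypothetical and uses a print statement in place of a real figure; it is only a sketch of the fit-then-transform_plot logic, not the library's code.

```python
class CenteringToy:
    """Hypothetical transformer that subtracts a mean learned during fit."""

    def fit(self, X, y=None):
        # Learn the mean over all values in all samples.
        values = [v for sample in X for v in sample]
        self.mean_ = sum(values) / len(values)
        return self

    def transform(self, X):
        return [[v - self.mean_ for v in sample] for sample in X]

    def plot(self, Xt, sample=0):
        # Placeholder for a real figure.
        print(f"plotting sample {sample}: {Xt[sample]}")

    def transform_plot(self, X, sample=0):
        Xt_sing = self.transform([X[sample]])
        self.plot(Xt_sing, sample=0)
        return Xt_sing

    def fit_transform_plot(self, X, y=None, sample=0):
        # Fit on the whole collection, then transform and plot one sample.
        return self.fit(X, y).transform_plot(X, sample=sample)


X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

one_liner = CenteringToy().fit_transform_plot(X, sample=2)
two_step = CenteringToy().fit(X).transform_plot(X, sample=2)
```

Both calls fit on the full collection X, so the learned state (here, the mean) is identical and the two results agree.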