This notebook shows how `giotto-tda`

can be used to create topological features for time series forecasting tasks, and how to integrate them into `scikit-learn`

â€“compatible pipelines.

In particular, we will concentrate on topological features which are created from consecutive **sliding windows** over the data. In sliding window models, a single time series array `X`

of shape `(n_timestamps, n_features)`

is turned into a time series of windows over the data, with a new shape `(n_windows, n_samples_per_window, n_features)`

. There are two main issues that arise when building forecasting models with sliding windows:

`n_windows`

is smaller than`n_timestamps`

. This is because we cannot have more windows than there are timestamps without padding`X`

, and this is not done by`giotto-tda`

.`n_timestamps - n_windows`

is even larger if we decide to pick a large stride between consecutive windows.- The target variable
`y`

needs to be properly "aligned" with each window so that the forecasting problem is meaningful and e.g. we don't "leak" information from the future. In particular,`y`

needs to be "resampled" so that it too has length`n_windows`

.

To deal with these issues, `giotto-tda`

provides a selection of transformers with `resample`

, `transform_resample`

and `fit_transform_resample`

methods. These are inherited from a `TransformerResamplerMixin`

base class. Furthermore, `giotto-tda`

provides a drop-in replacement for `scikit-learn`

's `Pipeline`

which extends it to allow chaining `TransformerResamplerMixin`

s with regular `scikit-learn`

estimators.

If you are looking at a static version of this notebook and would like to run its contents, head over to GitHub and download the source.

- Topology of time series which explains how transforming time series into point clouds via the
*Takens embedding*procedure makes topological feature extraction possible. - Topological feature extraction using VietorisRipsPersistence and PersistenceEntropy for a quick introduction to general topological feature extraction in
`giotto-tda`

.

**License: AGPLv3**

`SlidingWindow`

¶Let us start with a simple example of a "time series" `X`

with a corresponding target `y`

of the same length.

In [ ]:

```
import numpy as np
n_timestamps = 10
X, y = np.arange(n_timestamps), np.arange(n_timestamps) - n_timestamps
X, y
```

We can instantiate our sliding window transformer-resampler and run it on the pair `(X, y)`

:

In [ ]:

```
from gtda.time_series import SlidingWindow
window_size = 3
stride = 2
SW = SlidingWindow(size=window_size, stride=stride)
X_sw, yr = SW.fit_transform_resample(X, y)
X_sw, yr
```

We note a couple of things:

`fit_transform_resample`

returns a pair: the window-transformed`X`

and the resampled and aligned`y`

.`SlidingWindow`

has made a choice for us on how to resample`y`

and line it up with the windows from`X`

: a window on`X`

corresponds to the*last*value in a corresponding window over`y`

. This is common in time series forecasting where, for example,`y`

could be a shift of`X`

by one timestamp.- Some of the initial values of
`X`

may not be found in`X_sw`

. This is because`SlidingWindow`

only ensures the*last*value is represented in the last window, regardless of the stride.

`Pipeline`

¶`giotto-tda`

's topology transformers expect 3D input. But our `X_sw`

above is 2D. How do we capture interesting properties of the topology of input time series then? For univariate time series, it turns out that a good way is to use the "time delay embedding" or "Takens embedding" technique explained in the first part of Topology of time series. But as this involves an extra layer of complexity, we leave it for later and concentrate for now on an example with a simpler API which also demonstrates the use of a `giotto-tda`

`Pipeline`

.

Surprisingly, this involves multivariate time series input!

In [ ]:

```
rng = np.random.default_rng(42)
n_features = 2
X = rng.integers(0, high=20, size=(n_timestamps, n_features), dtype=int)
X
```

We are interpreting this input as a time series in two variables, of length `n_timestamps`

. The target variable is the same `y`

as before.

In [ ]:

```
SW = SlidingWindow(size=window_size, stride=stride)
X_sw, yr = SW.fit_transform_resample(X, y)
X_sw, yr
```

`X_sw`

is now a complicated-looking array, but it has a simple interpretation. Again, `X_sw[i]`

is the `i`

-th window on `X`

, and it contains `window_size`

samples from the original time series. This time, the samples are not scalars but 1D arrays.

What if we suspect that the way in which the **correlations** between the variables evolve over time can help forecast the target `y`

? This is a common situation in neuroscience, where each variable could be data from a single EEG sensor, for instance.

`giotto-tda`

exposes a `PearsonDissimilarity`

transformer which creates a 2D dissimilarity matrix from each window in `X_sw`

, and stacks them together into a single 3D object. This is the correct format (and information content!) for a typical topological transformer in `gtda.homology`

. See also Topological feature extraction from graphs for an in-depth look. Finally, we can extract simple scalar features using a selection of transformers in `gtda.diagrams`

.

In [ ]:

```
from gtda.time_series import PearsonDissimilarity
from gtda.homology import VietorisRipsPersistence
from gtda.diagrams import Amplitude
PD = PearsonDissimilarity()
X_pd = PD.fit_transform(X_sw)
VR = VietorisRipsPersistence(metric="precomputed")
X_vr = VR.fit_transform(X_pd) # "precomputed" required on dissimilarity data
Ampl = Amplitude()
X_a = Ampl.fit_transform(X_vr)
X_a
```

Notice that we are not acting on `y`

above. We are simply creating features from each window using topology! *Note*: it's two features per window because we used the default value for `homology_dimensions`

in `VietorisRipsPersistence`

, not because we had two variables in the time series initially!

We can now put this all together into a `giotto-tda`

`Pipeline`

which combines both the sliding window transformation on `X`

and resampling of `y`

with the feature extraction from the windows on `X`

.

*Note*: while we could import the `Pipeline`

class and use its constructor, we use the convenience function `make_pipeline`

instead, which is a drop-in replacement for scikit-learn's.

In [ ]:

```
from sklearn import set_config
set_config(display='diagram') # For HTML representations of pipelines
from gtda.pipeline import make_pipeline
pipe = make_pipeline(SW, PD, VR, Ampl)
pipe
```

Finally, if we have a *regression* task on `y`

we can add a final estimator such as scikit-learn's `RandomForestRegressor`

as a final step in the previous pipeline, and fit it!

In [ ]:

```
from sklearn.ensemble import RandomForestRegressor
RFR = RandomForestRegressor()
pipe = make_pipeline(SW, PD, VR, Ampl, RFR)
pipe
```

In [ ]:

```
pipe.fit(X, y)
y_pred = pipe.predict(X)
score = pipe.score(X, y)
y_pred, score
```

`TakensEmbedding`

and `SingleTakensEmbedding`

¶The first part of Topology of time series explains a commonly used technique for converting a univariate time series into a single **point cloud**. Since topological features can be extracted from any point cloud, this is a gateway to time series analysis using topology. The second part of that notebook shows how to transform a *batch* of time series into a batch of point clouds, and how to extract topological descriptors from each of them independently. While in that notebook this is applied to a time series classification task, in this notebook we are concerned with topology-powered *forecasting* from a single time series.

Reasoning by analogy with the multivariate case above, we can look at sliding windows over `X`

as small time series in their own right and track the evolution of *their* topology against the variable of interest (or against itself, if we are interested in unsupervised tasks such as anomaly detection).

There are two ways in which we can implement this idea in `giotto-tda`

:

- We can first apply a
`SlidingWindow`

, and then an instance of`TakensEmbedding`

. - We can
*first*compute a global Takens embedding of the time series via`SingleTakensEmbedding`

, which takes us from 1D/column data to 2D data, and*then*partition the 2D data of vectors into sliding windows via`SlidingWindow`

.

The first route ensures that we can run our "topological feature extraction track" in parallel with other feature-generation pipelines from sliding windows, without experiencing shape mismatches. The second route seems a little upside-down and it is not generally recommended, but it has the advantange that globally "optimal" parameters for the "time delay" and "embedding dimension" parameters can be computed automatically by `SingleTakensEmbedding`

.

Below is what each route would look like.

*Remark:* In the presence of noise, a small sliding window size is likely to reduce the reliability of the estimate of the time series' local topology.

`SlidingWindow`

+ `TakensEmbedding`

¶`TakensEmbedding`

is not a `TransformerResamplerMixin`

, but this is not a problem in the context of a `Pipeline`

when we order things in this way.

In [ ]:

```
from gtda.time_series import TakensEmbedding
X = np.arange(n_timestamps)
window_size = 5
stride = 2
SW = SlidingWindow(size=window_size, stride=stride)
X_sw, yr = SW.fit_transform_resample(X, y)
X_sw, yr
```

In [ ]:

```
time_delay = 1
dimension = 2
TE = TakensEmbedding(time_delay=time_delay, dimension=dimension)
X_te = TE.fit_transform(X_sw)
X_te
```

In [ ]:

```
VR = VietorisRipsPersistence() # No "precomputed" for point clouds
Ampl = Amplitude()
RFR = RandomForestRegressor()
pipe = make_pipeline(SW, TE, VR, Ampl, RFR)
pipe
```

In [ ]:

```
pipe.fit(X, y)
y_pred = pipe.predict(X)
score = pipe.score(X, y)
y_pred, score
```

`SingleTakensEmbeding`

+ `SlidingWindow`

¶Note that `SingleTakensEmbedding`

is also a `TransformerResamplerMixin`

, and that the logic for resampling/aligning `y`

is the same as in `SlidingWindow`

.

In [ ]:

```
from gtda.time_series import SingleTakensEmbedding
X = np.arange(n_timestamps)
STE = SingleTakensEmbedding(parameters_type="search", time_delay=2, dimension=3)
X_ste, yr = STE.fit_transform_resample(X, y)
X_ste, yr
```

In [ ]:

```
window_size = 5
stride = 2
SW = SlidingWindow(size=window_size, stride=stride)
X_sw, yr = SW.fit_transform_resample(X_ste, yr)
X_sw, yr
```

From here on, it is easy to push a very similar pipeline through as in the multivariate case:

In [ ]:

```
VR = VietorisRipsPersistence() # No "precomputed" for point clouds
Ampl = Amplitude()
RFR = RandomForestRegressor()
pipe = make_pipeline(STE, SW, VR, Ampl, RFR)
pipe
```

In [ ]:

```
pipe.fit(X, y)
y_pred = pipe.predict(X)
score = pipe.score(X, y)
y_pred, score
```

The best results are obtained when topological methods are used not in isolation but in **combination** with other methods. Here's an example where, in parallel with the topological feature extraction from local sliding windows using **Option 2** above, we also compute the mean and variance in each sliding window. A `scikit-learn`

`FeatureUnion`

is used to combine these very different sets of features into a single pipeline object.

In [ ]:

```
from functools import partial
from sklearn.preprocessing import FunctionTransformer
from sklearn.pipeline import FeatureUnion
from sklearn.base import clone
mean = FunctionTransformer(partial(np.mean, axis=1, keepdims=True))
var = FunctionTransformer(partial(np.var, axis=1, keepdims=True))
pipe_topology = make_pipeline(TE, VR, Ampl)
feature_union = FeatureUnion([("window_mean", mean),
("window_variance", var),
("window_topology", pipe_topology)])
pipe = make_pipeline(SW, feature_union, RFR)
pipe
```

In [ ]:

```
pipe.fit(X, y)
y_pred = pipe.predict(X)
score = pipe.score(X, y)
y_pred, score
```

`Labeller`

¶Let us say that we simply wish to predict the future of a time series from itself. This is very common in the study of financial markets for example. `giotto-tda`

provides convenience classes for target preparation from a time series. This notebook only shows a very simple example: many more options are described in `Labeller`

's documentation.

If we wished to create a target `y`

from `X`

such that `y[i]`

is equal to `X[i + 1]`

, while also modifying `X`

and `y`

so that they still have the same length, we could proceed as follows:

In [ ]:

```
from gtda.time_series import Labeller
X = np.arange(10)
Lab = Labeller(size=1, func=np.max)
Xl, yl = Lab.fit_transform_resample(X, X)
Xl, yl
```

Notice that we are feeding two copies of `X`

to `fit_transform_resample`

in this case!

This is what fitting an end-to-end pipeline for future prediction using topology could look like. Again, you are encouraged to include your own non-topological features in the mix!

In [ ]:

```
SW = SlidingWindow(size=5)
TE = TakensEmbedding(time_delay=1, dimension=2)
VR = VietorisRipsPersistence()
Ampl = Amplitude()
RFR = RandomForestRegressor()
# Full pipeline including the regressor
pipe = make_pipeline(Lab, SW, TE, VR, Ampl, RFR)
pipe
```

In [ ]:

```
pipe.fit(X, X)
y_pred = pipe.predict(X)
y_pred
```

- There are two additional simple
`TransformerResamplerMixin`

s in`gtda.time_series`

:`Resampler`

and`Stationarizer`

. The sort of pipeline for topological feature extraction using Takens embedding is a bit crude. More sophisticated methods exist for extracting robust topological summaries from (windows on) time series. A good source of inspiration is the following paper:

Persistent Homology of Complex Networks for Dynamic State Detection, by A. Myers, E. Munch, and F. A. Khasawneh.

The module

`gtda.graphs`

contains several transformers implementing the main algorithms proposed there.- Advanced users may be interested in
`ConsecutiveRescaling`

, which can be found in`gtda.point_clouds`

. - The notebook Lorenz attractor is an advanced use-case for
`TakensEmbedding`

and other time series forecasting techniques inspired by topology.