Getting started with MovingPandas¶

This notebook demonstrates the current development version of MovingPandas.

For tutorials using the latest release visit https://github.com/movingpandas/movingpandas-examples.

In [ ]:

import urllib
import os
import pandas as pd
import geopandas as gpd
from geopandas import GeoDataFrame, read_file
from shapely.geometry import Point, LineString, Polygon
from fiona.crs import from_epsg
from datetime import datetime, timedelta
from matplotlib import pyplot as plt

import sys
sys.path.append("..")
import movingpandas as mpd
mpd.show_versions()

import warnings
#warnings.simplefilter("ignore")
from movingpandas.trajectory import TimeZoneWarning
warnings.filterwarnings("ignore", category=TimeZoneWarning)

Creating a trajectory from scratch¶

In [ ]:

df = pd.DataFrame([
  {'geometry':Point(0,0), 't':datetime(2018,1,1,12,0,0)},
  {'geometry':Point(6,0), 't':datetime(2018,1,1,12,6,0)},
  {'geometry':Point(6,6), 't':datetime(2018,1,1,12,10,0)},
  {'geometry':Point(9,9), 't':datetime(2018,1,1,12,15,0)}
]).set_index('t')
geo_df = GeoDataFrame(df, crs=31256)
toy_traj = mpd.Trajectory(geo_df, 1)
toy_traj.df

In [ ]:

toy_traj.to_line_gdf()

In [ ]:

toy_traj.to_traj_gdf(wkt=True)

We can access key information about our trajectory by looking at the print output:

In [ ]:

print(toy_traj)

Another useful piece of information is the sampling interval (median time difference between records):

In [ ]:

toy_traj.get_sampling_interval()

We can also access the trajectories GeoDataFrame:

In [ ]:

toy_traj.df

Processing trajectories¶

We can compute the distance, speed, and acceleration of movement along the trajectory (between consecutive points). The default distance units are meters (or CRS units, if the CRS units are not known or specified), and the default time units are seconds:

In [ ]:

toy_traj.add_distance(overwrite=True).df

In [ ]:

toy_traj.add_speed(overwrite=True).df

In [ ]:

toy_traj.add_acceleration(overwrite=True).df

If you want to use different units, you can specify them. Allowed units include metric units from mm to km, imperial units from inch to mile, nautical miles, and non-standard units which are used as CRS distance units e.g. US Survey units.

In [ ]:

toy_traj.add_distance(overwrite=True, name="distance (km)", units="km")
toy_traj.add_distance(overwrite=True, name="distance (yards)", units="yd")
toy_traj.add_speed(overwrite=True, name="speed (ft/min)", units=("ft", "min"))
toy_traj.add_speed(overwrite=True, name="speed (knots)", units=("nm", "h"))
toy_traj.add_acceleration(overwrite=True, name="acceleration (mph/s)", units=("mi", "h", "s"))
toy_traj.df

Visualizing trajectories¶

To visualize the trajectory, we can turn it into a linestring.

(The notebook environment automatically plots Shapely geometry objects like the LineString returned by to_linestring().)

In [ ]:

toy_traj.to_linestring()

We can also visualize the speed values:

In [ ]:

toy_traj.plot(column="speed", linewidth=5, capstyle='round', legend=True)

In contrast to the earlier example where we visualized the whole trajectory as one linestring, the trajectory plot() function draws each line segment individually and thus each can have a different color.

In [ ]:

hvplot_defaults = {'tiles':None, 'frame_height':320, 'frame_width':320, 'cmap':'Viridis', 'colorbar':True}
toy_traj.hvplot(c="speed", **hvplot_defaults)

Analyzing trajectories¶

Extracting a moving object's position was at a certain time¶

For example, let's have a look at the get_position_at() function:

In [ ]:

toy_traj.get_position_at(datetime(2018,1,1,12,6,0), method="nearest")    

To see its coordinates, we can look at the print output:

In [ ]:

print(toy_traj.get_position_at(datetime(2018,1,1,12,6,0), method="nearest"))

The method parameter describes what the function should do if there is no entry in the trajectory GeoDataFrame for the specified timestamp.

For example, there is no entry at 2018-01-01 12:07:00

In [ ]:

toy_traj.df

In [ ]:

print(toy_traj.get_position_at(datetime(2018,1,1,12,7,0), method="nearest"))
print(toy_traj.get_position_at(datetime(2018,1,1,12,7,0), method="interpolated"))
print(toy_traj.get_position_at(datetime(2018,1,1,12,7,0), method="ffill")) # from the previous row
print(toy_traj.get_position_at(datetime(2018,1,1,12,7,0), method="bfill")) # from the following row

If the timestamp falls outside the time range between trajectory start and end time, we get an error:

In [ ]:

try: 
    toy_traj.get_position_at(datetime(2018,1,1,13,0,0))
except ValueError as e:
    print(f"ValueError: {e}")

Measuring distances to other objects¶

In [ ]:

pt = Point(1, 5)
line = LineString([(3,3), (3,9)])
pt_gdf = GeoDataFrame(pd.DataFrame([{'geometry':pt, 'id':1}]))
line_gdf = GeoDataFrame(pd.DataFrame([{'geometry':line, 'id':1}]))

ax = toy_traj.plot()
pt_gdf.plot(ax=ax, color='red')
line_gdf.plot(ax=ax, color='red')

In [ ]:

print(f'Distance: {toy_traj.distance(pt)}')
print(f'Hausdorff distance: {toy_traj.hausdorff_distance(pt):.2f}')

In [ ]:

pt.distance(Point(9, 9))

In [ ]:

print(f'Distance: {toy_traj.distance(line)}')
print(f'Hausdorff distance: {toy_traj.hausdorff_distance(line)}')

Extracting trajectory segments based on time or geometry (i.e. clipping)¶

First, let's extract the trajectory segment for a certain time period:

In [ ]:

segment = toy_traj.get_segment_between(datetime(2018,1,1,12,6,0),datetime(2018,1,1,12,12,0))
print(segment)

In [ ]:

ax = toy_traj.plot()
segment.plot(ax=ax, color='red', linewidth=5)

Now, let's extract the trajectory segment that intersects with a given polygon:

In [ ]:

xmin, xmax, ymin, ymax = 2, 8, -10, 5
polygon = Polygon([(xmin, ymin), (xmin, ymax), (xmax, ymax), (xmax, ymin), (xmin, ymin)])
polygon_gdf = GeoDataFrame(pd.DataFrame([{'geometry':polygon, 'id':1}]), crs=4326)

In [ ]:

intersections = toy_traj.clip(polygon)
intersections

In [ ]:

ax = toy_traj.plot()
polygon_gdf.plot(ax=ax, color='lightgray')
intersections.plot(ax=ax, color='red', linewidth=5)

Beyond toy trajectories: Loading trajectory data from GeoPackage¶

The MovingPandas repository contains a demo GeoPackage file that can be loaded as follows:

In [ ]:

%%time
gdf = read_file('data/demodata_geolife.gpkg')
print("Finished reading {} rows".format(len(df)))

After reading the trajectory point data from file, we want to construct the trajectories.

Creating trajectories with TrajectoryCollection¶

TrajectoryCollection is a convenience class that takes care of creating trajectories from a GeoDataFrame:

In [ ]:

traj_collection = mpd.TrajectoryCollection(gdf, 'trajectory_id', t='t')
print(traj_collection)

In [ ]:

traj_collection.plot(column='trajectory_id', legend=True, figsize=(9,5))

Converting TrajectoryCollections back to GeoDataFrames¶

In [ ]:

traj_collection.to_point_gdf()

In [ ]:

traj_collection.to_line_gdf()

In [ ]:

traj_collection.to_traj_gdf(wkt=True)

These GeoDataFrames can be exported to different file formats using GeoPandas, as documented in https://geopandas.org/docs/user_guide/io.html

In [ ]:

traj_collection.to_traj_gdf(wkt=True).to_file("temp.gpkg", layer='trajectories', driver="GPKG")

In [ ]:

read_file('temp.gpkg').plot()

Visualizing real-world trajectories¶

In [ ]:

my_traj = traj_collection.trajectories[1]
print(my_traj)

In [ ]:

plot_defaults = {'linewidth':5, 'capstyle':'round', 'figsize':(9,3), 'legend':True}
my_traj.plot(column='speed', vmax=20, **plot_defaults)

To visualize trajectories in their geographical context, we can also create interactive plots with basemaps:

In [ ]:

hvplot_defaults = {'tiles':'CartoLight', 'frame_height':400, 'frame_width':700, 'cmap':'Viridis', 'colorbar':True}
my_hvplot = my_traj.hvplot(c='speed', line_width=7.0, clim=(0,20), **hvplot_defaults)
my_hvplot

And even put other layers on top:

In [ ]:

my_hvplot * gpd.GeoDataFrame([my_traj.get_row_at(datetime(2009,6,29,8,0,0))]).hvplot(geo=True, size=200, color='red')

Trajectory manipulation and handling¶

Finding intersections with a Shapely polygon¶

The clip function can be used to extract trajectory segments that are located within an area of interest polygon.

This is how to use clip on a list of Trajectory objects:

In [ ]:

xmin, xmax, ymin, ymax = 116.3685035,116.3702945,39.904675,39.907728
polygon = Polygon([(xmin,ymin), (xmin,ymax), (xmax,ymax), (xmax,ymin), (xmin,ymin)])

intersections = []
for traj in traj_collection:
    for intersection in traj.clip(polygon):
        intersections.append(intersection)
print("Found {} intersections".format(len(intersections)))

In [ ]:

intersections[2].plot(linewidth=5.0, capstyle='round')

Alternatively, using TrajectoryCollection:

In [ ]:

clipped = traj_collection.clip(polygon)
clipped.trajectories[2].plot(linewidth=5.0, capstyle='round')

Splitting trajectories¶

Gaps are quite common in trajectories. For example, GPS tracks may contain gaps if moving objects enter tunnels where GPS reception is lost. In other use cases, moving objects may leave the observation area for longer time before returning and continuing their recorded track.

Depending on the use case, we therefore might want to split trajectories at observation gaps that exceed a certain minimum duration:

In [ ]:

my_traj = traj_collection.trajectories[1]
print(my_traj)
my_traj.plot(column='speed', vmax=20, **plot_defaults)

In [ ]:

split = mpd.ObservationGapSplitter(my_traj).split(gap=timedelta(minutes=5))
for traj in split:
    print(traj)

In [ ]:

fig, axes = plt.subplots(nrows=1, ncols=len(split), figsize=(19,4))
for i, traj in enumerate(split):
    traj.plot(ax=axes[i], column='speed', vmax=20, **plot_defaults)

In [ ]:

split = mpd.StopSplitter(my_traj).split(min_duration=timedelta(minutes=1), max_diameter=30, min_length=500)
for traj in split:
    print(traj)

In [ ]:

fig, axes = plt.subplots(nrows=1, ncols=len(split), figsize=(19,4))
for i, traj in enumerate(split):
    traj.plot(ax=axes[i], column='speed', vmax=20, **plot_defaults)

Alternatively, using the whole TrajectoryCollection:

In [ ]:

split = mpd.ObservationGapSplitter(traj_collection).split(gap=timedelta(minutes=15))
print(split)
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(16,4))
traj_collection.plot(ax=axes[0], column='trajectory_id', **plot_defaults)
axes[0].set_title('Original Trajectories')
split.plot(ax=axes[1], column='trajectory_id', **plot_defaults)
axes[1].set_title('Split Trajectories')

Generalizing trajectories¶

To reduce the size of trajectory objects, we can generalize them, for example, using the Douglas-Peucker algorithm:

In [ ]:

original_traj = traj_collection.trajectories[1]
print(original_traj)

In [ ]:

original_traj.plot(column='speed', vmax=20, **plot_defaults)

Try different tolerance settings and observe the results in line geometry and therefore also length:

Spatial generalizers¶

In [ ]:

dp_generalized = mpd.DouglasPeuckerGeneralizer(original_traj).generalize(tolerance=0.001)
dp_generalized.plot(column='speed', vmax=20, **plot_defaults)

In [ ]:

print('Original length: %s'%(original_traj.get_length()))
print('Generalized length: %s'%(dp_generalized.get_length()))

An alternative generalization method is to down-sample the trajectory to ensure a certain time delta between records:

Temporal generalizers¶

In [ ]:

time_generalized = mpd.MinTimeDeltaGeneralizer(original_traj).generalize(tolerance=timedelta(minutes=1))
time_generalized.plot(column='speed', vmax=20, **plot_defaults)

In [ ]:

time_generalized.to_point_gdf().head(10)

In [ ]:

original_traj.to_point_gdf().head(10)

Spatiotemporal generalizers¶

In [ ]:

tdtr_generalized = mpd.TopDownTimeRatioGeneralizer(original_traj).generalize(tolerance=0.001)

Let's compare this to the basic Douglas-Peucker result:

In [ ]:

fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(19,4))
tdtr_generalized.plot(ax=axes[0], column='speed', vmax=20, **plot_defaults)
dp_generalized.plot(ax=axes[1], column='speed', vmax=20, **plot_defaults)

Cleaning trajectories by removing outliers¶

We use the speed-based OutlierCleaner that cuts away spikes in the trajectory when the speed exceeds the provided speed threshold value. If no speed threshold is provided, p95*alpha is used (where p95 is the 95th percentile of speed values and alpha is 3).

In [ ]:

cleaned = mpd.OutlierCleaner(split).clean(alpha=2) # .clean(v_max=100, units=("km", "h"))
cleaned.add_speed(units=("km", "h"), overwrite=True)
cleaned

In [ ]:

hvplot_defaults = {'tiles':'CartoLight', 'frame_height':400, 'frame_width':500, 'cmap':'Viridis', 'colorbar':True}
( split.trajectories[10].hvplot(title='Trajectory Cleaning - Before & After', label='Before', color='red', line_width=4, **hvplot_defaults) * 
  cleaned.trajectories[10].hvplot(label='After', c='speed', tiles=None, line_width=4 ) )

Smoothing trajectories¶

We use the KalmanSmootherCV trajectory smoother to smooth all trajectories based on the assumption of a nearly-constant velocity (CV) model. The process_noise_std and measurement_noise_std parameters can be used to tune the smoother:

process_noise_std governs the uncertainty associated with the adherence of the new (smooth) trajectories to the CV model assumption; higher values relax the assumption, therefore leading to less-smooth trajectories, and vice-versa.
measurement_noise_std controls the assumed error in the original trajectories; higher values dictate that the original trajectories are expected to be noisier (and therefore, less reliable), thus leading to smoother trajectories, and vice-versa.

Try tuning these parameters and observe the resulting trajectories:

In [ ]:

smooth = mpd.KalmanSmootherCV(split).smooth(process_noise_std=0.1, measurement_noise_std=10)
print(smooth)

Let's visually compare the smooth and original (split) trajectories.

In [ ]:

hvplot_defaults = {'tiles':'CartoLight', 'frame_height':320, 'frame_width':320, 'cmap':'Viridis', 'colorbar':True}
kwargs = {**hvplot_defaults, 'line_width':4}
( split.trajectories[9].hvplot(title='Original Trajectories', **kwargs) + 
  smooth.trajectories[9].hvplot(title='Smooth Trajectories', **kwargs))

And finally, let's compare the calculated speeds:

In [ ]:

kwargs = {**hvplot_defaults, 'c':'speed', 'line_width':7, 'clim':(0,2)}
smooth.add_speed(overwrite=True)
(split.trajectories[9].hvplot(title='Original Trajectories', **kwargs) + 
 smooth.trajectories[9].hvplot(title='Smooth Trajectories', **kwargs))

As expected, the smooth trajectories are significantly less rugged (i.e. noisy) and exhibit smoother velocity transitions.

Getting started with MovingPandas¶

Creating a trajectory from scratch¶

Processing trajectories¶

Visualizing trajectories¶

Analyzing trajectories¶

Extracting a moving object's position was at a certain time¶

Measuring distances to other objects¶

Extracting trajectory segments based on time or geometry (i.e. clipping)¶

Beyond toy trajectories: Loading trajectory data from GeoPackage¶

Creating trajectories with TrajectoryCollection¶

Converting TrajectoryCollections back to GeoDataFrames¶

Visualizing real-world trajectories¶

Trajectory manipulation and handling¶

Finding intersections with a Shapely polygon¶

Splitting trajectories¶

Generalizing trajectories¶

Spatial generalizers¶

Temporal generalizers¶

Spatiotemporal generalizers¶

Cleaning trajectories by removing outliers¶

Smoothing trajectories¶

Continue exploring MovingPandas¶