This module contains a class and functions that wrap the folium
package to plot geo-location data.
Read the MSTICPy Folium documentation
Read the Folium Python package documentation
You must have msticpy installed to run this notebook:
%pip install --upgrade msticpy
# Imports
import msticpy
msticpy.init_notebook(verbosity=0)
from msticpy.vis.foliummap import FoliumMap
You can use Folium via a pandas accessor, via the plot_map function, or by interacting directly with the FoliumMap class.
MSTICPy uses pandas accessors to expose a lot of its visualization functions.
Plotting with Folium can be done directly from a pandas
DataFrame using the mp_plot.folium_map
extension.
This function returns an instance of the FoliumMap
class that you can further customize (see FoliumMap class later in the document).
# Read in a DataFrame from a CSV file
geo_loc_df = (
    pd
    .read_csv("data/ip_locs.csv", index_col=0)
    .dropna(subset=["Latitude", "Longitude", "IpAddress"])  # We need to remove any NaN values
)
display(geo_loc_df.head(5))
geo_loc_df.mp_plot.folium_map(ip_column="IpAddress")
| | AllExtIPs | CountryCode | CountryName | State | City | Longitude | Latitude | Asn | edges | Type | AdditionalData | IpAddress |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 65.55.44.109 | US | United States | Virginia | Boydton | -78.3750 | 36.6534 | NaN | set() | geolocation | {} | 65.55.44.109 |
| 1 | 13.71.172.128 | CA | Canada | Ontario | Toronto | -79.4195 | 43.6644 | NaN | set() | geolocation | {} | 13.71.172.128 |
| 2 | 13.71.172.130 | CA | Canada | Ontario | Toronto | -79.4195 | 43.6644 | NaN | set() | geolocation | {} | 13.71.172.130 |
| 3 | 40.124.45.19 | US | United States | Texas | San Antonio | -98.4926 | 29.4221 | NaN | set() | geolocation | {} | 40.124.45.19 |
| 4 | 104.43.212.12 | US | United States | Iowa | Des Moines | -93.6127 | 41.6015 | NaN | set() | geolocation | {} | 104.43.212.12 |
If you already have coordinates in the data you can use
these (rather than looking up the IP location again) using
the lat_column
and long_column
parameters.
geo_loc_df.mp_plot.folium_map(
    lat_column="Latitude", long_column="Longitude", zoom_start=10
)
You can use the Folium layers feature, specifying a column
value on which to group each layer with the layer_column
parameter.
geo_loc_df.mp_plot.folium_map(
    ip_column="IpAddress", layer_column="CountryName", zoom_start=2
)
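To illustrate what grouping by layer_column means, here is a minimal sketch in plain Python. It is not MSTICPy's implementation: it only shows that rows sharing a value in the chosen column end up in the same group, and each group would correspond to one map layer. The helper name group_rows_by_layer and the reduced sample rows are hypothetical.

```python
from collections import defaultdict


def group_rows_by_layer(rows, layer_column):
    """Group row dicts by the value in layer_column (hypothetical helper)."""
    layers = defaultdict(list)
    for row in rows:
        layers[row[layer_column]].append(row)
    return dict(layers)


# A few rows from the table above, reduced to two columns.
rows = [
    {"IpAddress": "65.55.44.109", "CountryName": "United States"},
    {"IpAddress": "13.71.172.128", "CountryName": "Canada"},
    {"IpAddress": "13.71.172.130", "CountryName": "Canada"},
]

layers = group_rows_by_layer(rows, "CountryName")
for name, members in layers.items():
    # One line per layer: the layer name and the IPs it would contain.
    print(name, [row["IpAddress"] for row in members])
```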
You can use DataFrame column values to populate the tooltip
and popup elements for each marker with the
tooltip_columns
and popup_columns
parameters.
# Create some data to display
data_df = pd.DataFrame({
    "Status": ["Home", "Office", "Vacation"] * (len(geo_loc_df) // 3),
    "Friendliness": ["Warm", "Cold", "Medium"] * (len(geo_loc_df) // 3),
    "Flavor": ["Chocolate", "Cinnamon", "Mango"] * (len(geo_loc_df) // 3),
    "SpiceLevel": [1, 2, 3] * (len(geo_loc_df) // 3)
})
geo_loc_data_df = pd.concat([geo_loc_df, data_df], axis=1).dropna(subset=["IpAddress"])
geo_loc_data_df.head(3)
| | AllExtIPs | CountryCode | CountryName | State | City | Longitude | Latitude | Asn | edges | Type | AdditionalData | IpAddress | Status | Friendliness | Flavor | SpiceLevel |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 65.55.44.109 | US | United States | Virginia | Boydton | -78.3750 | 36.6534 | NaN | set() | geolocation | {} | 65.55.44.109 | Home | Warm | Chocolate | 1.0 |
| 1 | 13.71.172.128 | CA | Canada | Ontario | Toronto | -79.4195 | 43.6644 | NaN | set() | geolocation | {} | 13.71.172.128 | Office | Cold | Cinnamon | 2.0 |
| 2 | 13.71.172.130 | CA | Canada | Ontario | Toronto | -79.4195 | 43.6644 | NaN | set() | geolocation | {} | 13.71.172.130 | Vacation | Medium | Mango | 3.0 |
geo_loc_data_df.mp_plot.folium_map(
    ip_column="IpAddress",
    layer_column="CountryName",
    tooltip_columns=["Status", "Flavor"],
    popup_columns=["Friendliness", "SpiceLevel", "Status", "Flavor"],
    zoom_start=2,
)
You can also control the icon used for each marker with the icon_column
parameter. You can use it on its own if you happen to have a column in your
data that contains names of FontAwesome or GlyphIcons icons.
More typically you would combine the icon_column with the icon_map
parameter. You can specify either a dictionary or a
function. For a dictionary, the value of the row in icon_column
is used as a key; the value is a dictionary of icon parameters
passed to the folium.Icon class. For a function, the icon_column
value is passed as a single parameter and the return value
should be a dictionary of valid parameters for the Icon class.
You can read the documentation for this function in the doc.
If icon_map is a dict it should contain keys that map to the
value of icon_column and values that are dicts of valid
folium Icon properties ("color", "icon_color", "icon", "angle", "prefix").
The dict should include a "default" entry that will be used if the
value in DataFrame[icon_column] doesn't match any key.
For example:
icon_map = {
    "high": {
        "color": "red",
        "icon": "warning",
    },
    "medium": {
        "color": "orange",
        "icon": "triangle-exclamation",
        "prefix": "fa",
    },
    "default": {
        "color": "blue",
        "icon": "info-sign",
    },
}
If icon_map is a function it should take a single str parameter
(the item key) and return a dict of icon properties. It should
return a default set of values if the key does not match a known
key. The icon_column value for each row will be passed to this
function and the return value used to populate the Icon arguments.
For example:
def icon_mapper(icon_key):
    if icon_key.startswith("bad"):
        return {
            "color": "red",
            "icon": "triangle-alert",
        }
    # ... (checks for other keys go here)
    else:
        # Default icon properties for any key that is not recognized
        return {
            "color": "blue",
            "icon": "info-sign",
        }
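The dictionary and function forms described above can be summarized in one small lookup routine. The sketch below is illustrative only and is not taken from MSTICPy: the helper name resolve_icon_props is hypothetical, and it simply models the documented behavior (call the function if one is given, otherwise look up the key and fall back to the "default" entry).

```python
def resolve_icon_props(icon_key, icon_map):
    """Return folium Icon keyword arguments for icon_key (hypothetical helper)."""
    if callable(icon_map):
        # Function form: the function itself decides what the default is.
        return icon_map(icon_key)
    # Dictionary form: unknown keys fall back to the "default" entry.
    return icon_map.get(icon_key, icon_map.get("default", {}))


severity_icons = {
    "high": {"color": "red", "icon": "warning"},
    "default": {"color": "blue", "icon": "info-sign"},
}


def prefix_mapper(icon_key):
    """Function form: keys starting with "bad" get a red icon."""
    if icon_key.startswith("bad"):
        return {"color": "red", "icon": "triangle-alert"}
    return {"color": "blue", "icon": "info-sign"}


print(resolve_icon_props("high", severity_icons))
print(resolve_icon_props("unknown", severity_icons))
print(resolve_icon_props("bad-ip", prefix_mapper))
```

Whichever form you use, the returned dictionary should only contain keys that the folium Icon class accepts, such as "color", "icon" and "prefix".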
Check out the possible names for icons:
icon_map = {
    "US": {
        "color": "green",
        "icon": "flash",
    },
    "GB": {
        "color": "purple",
        "icon": "flash",
    },
    "default": {
        "color": "blue",
        "icon": "info-sign",
    },
}
geo_loc_df.mp_plot.folium_map(
    ip_column="AllExtIPs",
    icon_column="CountryCode",
    icon_map=icon_map,
    zoom_start=2,
)
plot_map function
The plot_map function is identical to the mp_plot.folium_map
accessor. You can import it directly and use it in place
of the pandas accessor.
from msticpy.vis.foliummap import plot_map
plot_map(
    data=geo_loc_df,
    ip_column="AllExtIPs",
    icon_column="CountryCode",
    icon_map=icon_map,
    zoom_start=2,
)
Use the FoliumMap class when you want to build up data clusters and layers incrementally.
It now supports multiple data types for entry:
- IpAddress (map.add_ip_cluster) and Geolocation (map.add_geoloc_cluster) entities
- IP addresses (map.add_ips)
- Locations (map.add_locations)
- GeoHashes (map.add_geo_hashes)
You can also use other member functions to add layers and cluster groups.
FoliumMap(
    title: str = 'layer1',
    zoom_start: float = 2.5,
    tiles=None,
    width: str = '100%',
    height: str = '100%',
    location: list = None,
)
Wrapper class for Folium/Leaflet mapping.
Parameters
----------
title : str, optional
    Name of the layer (the default is 'layer1')
zoom_start : float, optional
    The zoom level of the map (the default is 2.5)
tiles : [type], optional
    Custom set of tiles or tile URL (the default is None)
width : str, optional
    Map display width (the default is '100%')
height : str, optional
    Map display height (the default is '100%')
location : list, optional
    Location to center map on
Attributes
----------
folium_map : folium.Map
folium_map = FoliumMap(location=(47.5982328,-122.331), zoom_start=14)
folium_map
The underlying folium map object is accessible as the folium_map
attribute.
type(folium_map.folium_map)
folium.folium.Map
fol_map.add_ip_cluster(
    ip_entities: Iterable[msticpy.datamodel.entities.IpAddress],
    **kwargs,
)
import pickle
with open(b"data/ip_entities.pkl", "rb") as fh:
    ip_entities = pickle.load(fh)
ip_entities = [ip for ip in ip_entities if ip.Location and ip.Location.Latitude]
folium_map = FoliumMap(zoom_start=9)
folium_map.add_ip_cluster(ip_entities=ip_entities, color='orange')
folium_map.center_map()
folium_map
ip_map.add_ips(ip_addresses: Iterable[str], **kwargs)
ips = geo_loc_df.query("State == 'California'").AllExtIPs.values
print("IP dataset", ips[:3], "...")
ip_map = FoliumMap(zoom_start=3)
ip_map.add_ips(ips)
ip_map.center_map()
ip_map
IP dataset ['13.83.149.5' '13.83.148.235' '13.64.188.245'] ...
ip_map.add_locations(locations: Iterable[Tuple[float, float]], **kwargs)
locations = geo_loc_df.query("CountryCode != 'US'").apply(lambda x: (x.Latitude, x.Longitude), axis=1).values
print("Location dataset", locations[:3], "...")
ip_map.add_locations(locations)
ip_map.center_map()
ip_map
Location dataset [(43.6644, -79.4195) (43.6644, -79.4195) (43.6644, -79.4195)] ...
from msticpy.datamodel.entities import GeoLocation, IpAddress
# Create IP and GeoLocation Entities from the dataframe
def create_ip_entity(row):
    ip_ent = IpAddress(Address=row["AllExtIPs"])
    geo_loc = create_geo_entity(row)
    ip_ent.Location = geo_loc
    return ip_ent
def create_geo_entity(row):
    # get subset of fields for GeoLocation
    loc_props = row[["CountryCode", "CountryName", "State", "City", "Longitude", "Latitude"]]
    return GeoLocation(**loc_props.to_dict())
geo_locs = list(geo_loc_df.apply(create_geo_entity, axis=1).values)
ip_ents = list(geo_loc_df.apply(create_ip_entity, axis=1).values)
fmap_ips = FoliumMap()
fmap_ips.add_ip_cluster(ip_entities=ip_ents[:20], color='blue')
fmap_ips.center_map()
fmap_ips
fmap_ips.add_ip_cluster(ip_entities=ip_ents[30:40], color='red', icon="flash")
fmap_ips.center_map()
fmap_ips
By default folium uses the information icon (i). Icons can be taken from the default Bootstrap set. See the glyphicons list for the default names.
Alternatively you can use icons from the Font Awesome collection by adding prefix="fa" and icon="icon_name" to the call to add_ip_cluster or add_geoloc_cluster.
fmap_ips.add_geoloc_cluster(
    geo_locations=geo_locs[40:50],
    color='darkblue',
    icon="desktop",
    prefix="fa"
)
fmap_ips.center_map()
fmap_ips
from msticpy.vis.foliummap import get_map_center, get_center_ip_entities, get_center_geo_locs
print(get_center_geo_locs(geo_locs))
print(get_center_geo_locs(geo_locs, mode="mean"))
# get_map_center Will accept iterable of any entity type that is either
# an IpAddress entity or an entity that has properties of type IpAddress
print(get_map_center(ip_ents[30:40]))
print(get_map_center(ip_ents[:20]))
print(get_center_ip_entities(ip_ents[:20]))
(38.7095, -79.4195)
(40.01865529411764, -84.9259)
(38.7095, -78.1539)
(38.7095, -86.5161)
(38.7095, -86.5161)
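The first two results differ only in the mode argument. As a rough illustration of why the two centers differ, the sketch below computes a center from (latitude, longitude) tuples with the standard library. It assumes that the default mode takes the median of latitudes and longitudes independently, which the source does not state; the function name center_of and the sample points are hypothetical, and this is not MSTICPy's code.

```python
from statistics import mean, median


def center_of(locations, mode="median"):
    """Return a (lat, lon) center for an iterable of (lat, lon) tuples."""
    lats, lons = zip(*locations)
    if mode == "median":
        # Median is less sensitive to a few far-away outliers.
        return median(lats), median(lons)
    if mode == "mean":
        return mean(lats), mean(lons)
    raise ValueError(f"Unsupported mode: {mode}")


# Boydton, Toronto and San Antonio coordinates from the table earlier on.
points = [(36.6534, -78.3750), (43.6644, -79.4195), (29.4221, -98.4926)]
print(center_of(points))
print(center_of(points, mode="mean"))
```

With a handful of outlying points, the mean is pulled toward them while the median stays near the bulk of the data, which is one reason a median-style center can be a better choice for placing the map.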
from msticpy.context.geoip import entity_distance
print("Distance between")
print(f"{ip_ents[0].Address} ({ip_ents[0].Location.City})")
print(f"{ip_ents[1].Address} ({ip_ents[1].Location.City})")
print(entity_distance(ip_ents[0], ip_ents[1]), "km", "\n")
print("Distance between")
print(f"{ip_ents[0].Address} ({ip_ents[0].Location.City})")
print(f"{ip_ents[13].Address} ({ip_ents[13].Location.City})")
print(entity_distance(ip_ents[0], ip_ents[13]), "km", "\n")
Distance between
65.55.44.109 (Boydton)
13.71.172.128 (Toronto)
784.604908273247 km

Distance between
65.55.44.109 (Boydton)
131.107.147.209 (Redmond)
3751.614749340525 km
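As an independent cross-check of distances like these, you can compute a great-circle distance directly from the coordinates. The sketch below uses the haversine formula with a mean Earth radius of 6371 km. Both choices are assumptions of this example and are not confirmed to be what entity_distance does internally; the function name haversine_km is hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumption of this sketch)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    d_lat = lat2 - lat1
    d_lon = lon2 - lon1
    a = sin(d_lat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(d_lon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Coordinates for the first two rows of the table earlier on.
boydton = (36.6534, -78.3750)
toronto = (43.6644, -79.4195)
print(round(haversine_km(*boydton, *toronto), 1), "km")
```

The result should land close to the Boydton to Toronto figure printed above; small differences are expected if a different radius or formula is used.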