Exploring your Home Assistant data

This page helps you get familiar with the data in your Home Assistant instance. It is a Jupyter Notebook: a document that combines instructions with embedded Python code that generates graphs and tables from your data. The notebook is interactive, so you can change the code of any example at any time and press the ▶️ button to rerun it with your changes.

To get started, let's execute all examples on this page: in the menu at the top left, click on "Run" -> "Run All Cells".

In [ ]:
#!pip install HASS-data-detective # Install detective
In [ ]:
!pip show HASS-data-detective
In [ ]:
import detective.core as detective
import detective.functions as functions
import pandas as pd

db = detective.db_from_hass_config()

In the following example, we're going to explore your most popular entities and break them down by period of the day (morning/afternoon/evening/night).

We do this by looking at which services were called and which entities they targeted. To make the results more relevant, we filter out any service call that was triggered by another service call. So if a user starts a script that turns on a light, we only count the interaction with the script and not the one with the light.

In [ ]:
from collections import Counter, OrderedDict
import json

from detective.time import time_category, sqlalch_datetime, localize, TIME_CATEGORIES

# Prepare a dictionary to track results
results = OrderedDict((time_cat, Counter()) for time_cat in TIME_CATEGORIES)

# We keep track of contexts that we processed so that we will only process
# the first service call in a context, and not subsequent calls.
context_processed = set()

for event in db.perform_query("SELECT * FROM events WHERE event_type = 'call_service' ORDER BY time_fired"):
    entity_ids = None

    # Skip if we have already processed an event that was part of this context
    if event.context_id in context_processed:
        continue

    try:
        event_data = json.loads(event.event_data)
    except ValueError:
        continue

    # Empty event data, skipping (shouldn't happen, but to be safe)
    if not event_data:
        continue

    service_data = event_data.get('service_data')

    # No service data found, skipping
    if not service_data:
        continue

    entity_ids = service_data.get('entity_id')

    # No entity IDs found, skip this event
    if entity_ids is None:
        continue

    if not isinstance(entity_ids, list):
        entity_ids = [entity_ids]

    context_processed.add(event.context_id)

    period = time_category(
        localize(sqlalch_datetime(event.time_fired)))

    for entity_id in entity_ids:
        results[period][entity_id] += 1

print("Most popular entities to interact with:")

RESULTS_TO_SHOW = 5

for period, period_results in results.items():
    print()
    
    entities = [
        ent_id for (ent_id, count)
        in period_results.most_common(RESULTS_TO_SHOW)
    ]
    
    result = ', '.join(entities) if entities else '-'
    print(f"{period.capitalize()}: {result}")
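
To see the context filter in isolation, here is a minimal sketch on made-up data. The events list, context IDs, and entity IDs are invented for illustration and do not come from a real database; only the idea of counting the first call per context matches the cell above.

```python
from collections import Counter

# Synthetic (context_id, entity_id) pairs, already ordered by time.
# All values are made up for illustration.
events = [
    ("ctx-1", "script.movie_time"),   # user starts a script
    ("ctx-1", "light.living_room"),   # the script turns on a light (same context)
    ("ctx-2", "light.kitchen"),       # user turns on a light directly
    ("ctx-3", "script.movie_time"),   # user starts the script again
    ("ctx-3", "media_player.tv"),     # the script turns on the TV (same context)
]

seen_contexts = set()
counts = Counter()

for context_id, entity_id in events:
    # Only the first call in each context counts as a user interaction
    if context_id in seen_contexts:
        continue
    seen_contexts.add(context_id)
    counts[entity_id] += 1

print(counts.most_common())
```

The light and the TV never appear in the counts because their calls share a context with the script call that came first.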

Next up

Let's now load the results into a pandas DataFrame to view them as a table.

In [ ]:
df = pd.DataFrame.from_dict(results).fillna(0)
df
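
As a small illustration of what this conversion does, the sketch below builds a DataFrame from made-up counts (the `sample_results` name and its values are assumptions). Each period becomes a column, each entity becomes a row, and an entity that was not used in a period gets 0 instead of a missing value.

```python
from collections import Counter

import pandas as pd

# Made-up counts per period, in the same shape as `results` above
sample_results = {
    "morning": Counter({"light.kitchen": 3, "switch.coffee_maker": 2}),
    "evening": Counter({"light.kitchen": 1, "media_player.tv": 4}),
}

# Periods become columns; missing entity/period pairs are filled with 0
sample_df = pd.DataFrame.from_dict(sample_results).fillna(0)
print(sample_df)
```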

View states

Detective makes it easy to view your state data as a pandas dataframe.

In [ ]:
%%time

df = db.fetch_all_sensor_data()

Our data is now in a pandas DataFrame. Let's show the first few rows:

In [ ]:
df.head()

The data needs some formatting before we can plot it, and detective provides several functions to help. Familiarise yourself with these functions, and create your own where needed.

In [ ]:
df = df[df['domain']=='sensor']
df = functions.generate_features(df)
df = functions.format_dataframe(df)
In [ ]:
df.head()
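
To give a feel for what this kind of formatting involves, here is a rough sketch on made-up rows. It is an assumption about typical cleanup steps, not detective's actual implementation; the `raw` and `clean` names and the sample values are invented.

```python
import pandas as pd

# Made-up raw rows: states arrive as strings, timestamps as text
raw = pd.DataFrame({
    "entity_id": ["sensor.hall_temperature"] * 3,
    "state": ["21.5", "unknown", "22.0"],
    "last_changed": [
        "2020-01-01 08:00:00",
        "2020-01-01 09:00:00",
        "2020-01-01 10:00:00",
    ],
})

clean = raw.copy()
# Non-numeric states such as "unknown" become NaN
clean["state"] = pd.to_numeric(clean["state"], errors="coerce")
# Parse timestamps so they can be used on a time axis
clean["last_changed"] = pd.to_datetime(clean["last_changed"])
# Drop the rows whose state could not be converted
clean = clean.dropna(subset=["state"])

print(clean)
```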

Notice the new feature columns that were added. It is straightforward to create your own features, for example a day_of_week column:

In [ ]:
df['day_of_week'] = df['last_changed'].apply(lambda x: x.dayofweek)
In [ ]:
df.head()
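
The same idea can be written with the pandas `.dt` accessor, which works when the column holds datetime values. A minimal sketch on made-up timestamps follows; the `sample` frame and the `is_weekend` and `hour` columns are hypothetical examples chosen for illustration.

```python
import pandas as pd

# Made-up timestamps for illustration
sample = pd.DataFrame({
    "last_changed": pd.to_datetime([
        "2020-01-06 07:30:00",  # a Monday
        "2020-01-11 21:15:00",  # a Saturday
    ]),
})

# dayofweek counts from Monday=0 to Sunday=6
sample["day_of_week"] = sample["last_changed"].dt.dayofweek
# Hypothetical extra features derived the same way
sample["is_weekend"] = sample["day_of_week"] >= 5
sample["hour"] = sample["last_changed"].dt.hour

print(sample)
```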

Plot some data

First, a line plot of the temperature sensors using Seaborn:

In [ ]:
#!pip install seaborn # Uncomment to install if required
In [ ]:
import seaborn as sns
import matplotlib.pyplot as plt

fig, ax = plt.subplots(1, figsize=(20,6))
sns.lineplot(
    x='last_changed', 
    y='state', 
    hue='entity_id', 
    data=df[df['device_class'] == 'temperature'], 
    ax=ax);