Basic Agent Study

It is recommended to have a look at the 0_basic_functionalities, 1_Observation_Agents, 2_Action_GridManipulation and 3_TrainingAnAgent notebooks before getting into this one.


This notebook shows how to study an Agent. We will run a simple agent, then analyze its behaviour from the files saved during the run.

This notebook will also show you how to use "Grid2Viz", the graphical user interface built for analyzing Grid2Op agents.

It is strongly recommended that you know how to define an Agent and use a Runner before starting this tutorial!

Evaluate the performance of a simple Agent

In [1]:
import os
import sys
import grid2op
import copy
import numpy as np
import shutil
import plotly.graph_objects as go

from tqdm.notebook import tqdm
from grid2op.Agent import PowerLineSwitch
from grid2op.Reward import L2RPNReward
from grid2op.Runner import Runner
from grid2op.Chronics import GridStateFromFileWithForecasts, Multifolder
path_agents = "study_agent_getting_started"
max_iter = 30

In the next cell we evaluate the agent "PowerLineSwitch" and save the results of this evaluation in the "study_agent_getting_started" directory.

In [2]:
scoring_function = L2RPNReward
env = grid2op.make(reward_class=L2RPNReward, test=True)
# env.chronics_handler.set_max_iter(max_iter)
shutil.rmtree(os.path.abspath(path_agents), ignore_errors=True)
if not os.path.exists(path_agents):
    os.mkdir(path_agents)

# make a runner for this agent
path_agent = os.path.join(path_agents, "PowerLineSwitch")
shutil.rmtree(os.path.abspath(path_agent), ignore_errors=True)

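runner = Runner(**env.get_params_for_runner(),
                agentClass=PowerLineSwitch)
# run 2 episodes of at most max_iter steps each and save them under path_agent
res = runner.run(path_save=path_agent, nb_episode=2,
                 max_iter=max_iter)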
print("The results for the evaluated agent are:")
for _, chron_id, cum_reward, nb_time_step, max_ts in res:
    msg_tmp = "\tFor chronics with id {}\n".format(chron_id)
    msg_tmp += "\t\t - cumulative reward: {:.6f}\n".format(cum_reward)
    msg_tmp += "\t\t - number of time steps completed: {:.0f} / {:.0f}".format(nb_time_step, max_ts)
    print(msg_tmp)
/home/benjamin/Documents/grid2op_dev/getting_started/grid2op/MakeEnv/ UserWarning:

You are using a development environment. This environment is not intended for training agents.

The results for the evaluated agent are:
	For chronics with id 000
		 - cumulative reward: 496.772949
		 - number of time steps completed: 30 / 30
	For chronics with id 001
		 - cumulative reward: 515.620117
		 - number of time steps completed: 30 / 30

Looking at the results and understanding the behaviour of the Agent

The content of the folder is the following:
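One way to inspect such a folder, as a minimal sketch that relies only on the standard library, is to walk it recursively. The helper name `list_saved_files` is hypothetical (it is not part of Grid2Op), and the sketch makes no assumption about which files the Runner writes:

```python
import os


def list_saved_files(root):
    """Return the sorted relative paths of every file found under `root`."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            # store paths relative to `root` so the listing is easy to read
            found.append(os.path.relpath(full_path, root))
    return sorted(found)


# Usage in this notebook (assumes the Runner cell above has already been run):
# for rel_path in list_saved_files(path_agent):
#     print(rel_path)
```

Returning a list instead of printing directly makes it easy to check, for instance, that one sub-folder exists per evaluated episode.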

In [3]:

Now we can load the actions and observations of one episode (here, the episode with id "001") and de-serialize them into proper objects. The EpisodeData class does this automatically and can be used as follows:

In [4]:
from grid2op.Episode import EpisodeData
episode_studied = "001"
this_episode = EpisodeData.from_disk(path_agent, episode_studied)

Inspect the actions

Now we can study the agent. For example, let's inspect its actions and count how many powerlines it has disconnected (disconnecting powerlines is probably not the best thing for an agent to do here).

In [5]:
line_disc = 0
line_reco = 0
line_changed = 0
for act in this_episode.actions:
    dict_ = act.as_dict()
    if "set_line_status" in dict_:
        line_reco += dict_["set_line_status"]["nb_connected"]
        line_disc += dict_["set_line_status"]["nb_disconnected"]
    if "change_line_status" in dict_:
        line_changed += dict_["change_line_status"]["nb_changed"]
print(f'Total lines set to connected : {line_reco}')
print(f'Total lines set to disconnected : {line_disc}')
print(f'Total lines changed: {line_changed}')
Total lines set to connected : 0
Total lines set to disconnected : 0
Total lines changed: 3
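The totals above do not say when the agent acted. The following sketch locates the time steps at which an action touched a line status. It works on plain dictionaries shaped like the ones returned by `act.as_dict()` in the cell above (same keys: "set_line_status", "change_line_status", "nb_connected", "nb_disconnected", "nb_changed"). The helper name `steps_with_line_changes` and the example data are hypothetical, not part of Grid2Op.

```python
def steps_with_line_changes(action_dicts):
    """Return (step, nb_changed, nb_set) for every step that touches a line status.

    `action_dicts` is any iterable of dictionaries shaped like `act.as_dict()`.
    """
    steps = []
    for step, dict_ in enumerate(action_dicts):
        nb_changed = dict_.get("change_line_status", {}).get("nb_changed", 0)
        set_part = dict_.get("set_line_status", {})
        nb_set = set_part.get("nb_connected", 0) + set_part.get("nb_disconnected", 0)
        if nb_changed or nb_set:
            steps.append((step, nb_changed, nb_set))
    return steps


# Made-up data for illustration only (not taken from the episode above)
example = [
    {},
    {"change_line_status": {"nb_changed": 1}},
    {},
    {"set_line_status": {"nb_connected": 0, "nb_disconnected": 2}},
]
print(steps_with_line_changes(example))  # [(1, 1, 0), (3, 0, 2)]
```

In this notebook it could be called as `steps_with_line_changes(act.as_dict() for act in this_episode.actions)`.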

We can also ask how many times this agent acted on the powerline with id $13$, either by changing or by setting its status:

In [6]:
id_line_inspected = 13
actions_on_line = 0
for act in this_episode.actions:
    dict_ = act.effect_on(line_id=id_line_inspected) # effect of this action on the powerline with the given id
    # other objects can be inspected with: load_id, gen_id or substation_id
    if dict_['change_line_status'] or dict_["set_line_status"] != 0:
        actions_on_line += 1
print(f'Total actions on powerline {id_line_inspected} : {actions_on_line}')
Total actions on powerline 13 : 1
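The same check can be repeated for every powerline to see which ones the agent acts on most often. The sketch below assumes only what the cell above already uses: each action exposes `effect_on(line_id=...)`, returning a dictionary with the keys "change_line_status" and "set_line_status". The helper name `count_actions_per_line` is hypothetical.

```python
def count_actions_per_line(actions, n_line):
    """Count, for each powerline id in range(n_line), the actions affecting it."""
    counts = [0] * n_line
    for act in actions:
        for line_id in range(n_line):
            effect = act.effect_on(line_id=line_id)
            # same test as in the cell above: the status is either changed or set
            if effect["change_line_status"] or effect["set_line_status"] != 0:
                counts[line_id] += 1
    return counts


# Usage in this notebook (assumes `env` and `this_episode` from the cells above):
# counts = count_actions_per_line(this_episode.actions, env.n_line)
# for line_id, nb in enumerate(counts):
#     if nb > 0:
#         print(f"powerline {line_id}: {nb} action(s)")
```

This calls `effect_on` once per action and per line, which is fine for short episodes like the ones studied here but may be slow on long ones.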

Inspect the modifications of the environment

For example, we might want to count the hazards and maintenance operations over the whole scenario, to get an idea of how difficult it was.

In [7]:
nb_hazards = 0
nb_maintenance = 0
for act in this_episode.env_actions:
    dict_ = act.as_dict() # representation of an action as a dictionary, see the documentation for more information
    if "nb_hazards" in dict_:
        nb_hazards += 1
    if "nb_maintenance" in dict_:
        nb_maintenance += 1
print(f'Total hazards : {nb_hazards}')
print(f'Total maintenances : {nb_maintenance}')
Total hazards : 0
Total maintenances : 0

Inspect the observations

For example, let's look at the consumption of load 1. For this cell to work, plotly is required.

In [8]:
import plotly.graph_objects as go
load_id = 1
# extract the data
val_load1 = np.zeros(len(this_episode.observations))
for i, obs in enumerate(this_episode.observations):
    dict_ = obs.state_of(load_id=load_id) # state of the load with the given id in this observation
    # other objects can be inspected with: gen_id, line_id or substation_id
    # see the documentation for more information.
    val_load1[i] = dict_['p']

# plot it
fig = go.Figure(data=[go.Scatter(x=[i for i in range(len(val_load1))],
                                 y=val_load1)])
fig.update_layout(title="Consumption of load {}".format(load_id),
                 xaxis_title="Time step",
                 yaxis_title="Load (MW)")
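Alongside the plot, a few summary numbers can make episodes easier to compare. The sketch below is a hypothetical helper (not part of Grid2Op) that uses only the standard library. It accepts any sequence of numbers, so it should also work on the `val_load1` array built above.

```python
from statistics import mean


def summarize_series(values):
    """Return the min, max, mean and the index of the maximum of a series."""
    values = [float(v) for v in values]
    if not values:
        raise ValueError("cannot summarize an empty series")
    highest = max(values)
    return {
        "min": min(values),
        "max": highest,
        "mean": mean(values),
        "argmax": values.index(highest),  # first time step where the peak occurs
    }


# Usage in this notebook (assumes `val_load1` from the cell above):
# print(summarize_series(val_load1))
```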