In this notebook we will explain what kind of problems grid2op aims at modeling.
We will first describe the "powergrid control" problem modeled by grid2op (there exist many kinds of powergrid problems; we do not model them all, of course...)
In the second part we will introduce the basics of reinforcement learning.
A powergrid is a really complex man-made object. In the case of grid2op, a powergrid is represented as a set of "objects" that are:
All these objects are connected together in places called "substations". In reality, a substation looks like this: (image credit https://commons.wikimedia.org/wiki/File:Taipei_Taiwan_Electrical-Substation-at-Taipei-Zoo-South-Station-01.jpg )
Substations are connected to one another by so-called "powerlines" that allow power to flow from one place to another: (image credit https://commons.wikimedia.org/wiki/File:Powerline-separators.jpg)
Of course, in this framework we will not model every object you can see in these pictures. For us, a powergrid (here a relatively simple one) will be represented as:
In the above figure you can see:
Now let's zoom in (or at least pretend to) on the substation with id 2. As you can see in the bottom-right part of the figure, this substation (like all substations in grid2op) has 2 "busbars" (represented as vertical black lines in the zoomed-in part). Note that in all these graphs, powerlines are drawn as blue lines.
What does this entail? It means we can "split" a substation into different independent "buses". A "bus" is a term from the power system community meaning: if two objects are on the same bus, then there exists a direct electrical path inside the substation connecting them. But wait, how can you split a substation into buses?
In fact, in our modeling each "object" at a substation has two switches, as shown in the image below:
You can choose to connect the object (in this case the powerline with id 1) to either busbar (denoted by "a" and "b" in the figure).
So basically, one of the main goals of the grid2op platform is to model what the "best" position of these switches is. Throughout the grid2op framework, when we speak about the *topology*, you can think of it as the list of the positions of every switch in the grid.
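The idea of "objects on busbars forming buses" can be sketched in a few lines of plain Python. This is a simplified illustration of ours (the names below are hypothetical, not the grid2op API): each object at a substation is assigned to busbar "a", "b", or `None` (disconnected), and objects sharing a busbar form one electrical node.

```python
# Simplified sketch (our own names, NOT the grid2op API):
# each object at a substation is assigned to busbar "a", "b",
# or None (disconnected). Objects sharing a busbar form one
# electrical node ("bus").

def group_by_bus(assignment):
    """Group object names by the busbar they are connected to."""
    buses = {}
    for obj, busbar in assignment.items():
        if busbar is None:          # object is disconnected
            continue
        buses.setdefault(busbar, []).append(obj)
    return buses

# Example: two objects stay on busbar "a", while the powerline
# with id 1 is moved alone onto busbar "b", splitting the node.
assignment = {"line_1": "b", "line_4": "a", "load_0": "a"}
print(group_by_bus(assignment))
# {'b': ['line_1'], 'a': ['line_4', 'load_0']}
```

Changing a single entry of `assignment` corresponds to flipping the switches of one object, which is exactly the kind of topological action discussed above.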
We want to emphasize that this modeling is an oversimplification of reality.
In reality, there can also be switches that connect the two busbars directly (so reconfiguring the topology of a substation can sometimes be done with a single switch, while at other times changing one switch has no effect at all).
A substation can also have more than 2 busbars (sometimes 5 or 6, for example). This makes the number of possible topologies even higher than it is in grid2op.
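To get a feel for how fast the number of configurations grows with the number of busbars, here is a small back-of-the-envelope count (our own illustration, not grid2op code): with `n` elements at a substation and `b` busbars, each element can sit on any busbar, giving `b**n` raw switch configurations (some of which are equivalent up to relabeling the busbars).

```python
# Back-of-the-envelope count of raw busbar assignments at one
# substation (our own illustration, not part of grid2op).
from itertools import product

def count_configurations(n_elements, n_busbars):
    """Enumerate every busbar assignment and count them (= b**n)."""
    return sum(1 for _ in product(range(n_busbars), repeat=n_elements))

# 4 elements at the substation: 2 busbars vs 6 busbars
print(count_configurations(4, 2))  # 16
print(count_configurations(4, 6))  # 1296
```

Going from 2 to 6 busbars blows the count up from 16 to 1296 for just 4 elements, which is why grid2op's 2-busbar assumption keeps the action space tractable.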
Finally, most of the time a single busbar has a "switch" in its middle that can disconnect one part of the elements connected to it from the other part. Basically, this means that some combinations of elements are not possible to achieve in practice.
And of course, we model explicitly in this framework (i.e. we allow the agents to act on) only some elements of a powergrid. In reality, many more heterogeneous objects exist, with more complex properties.
We made all these assumptions because we believe they form the simplest setting that still allows topological reconfiguration, besides connecting / disconnecting powerlines.
Though the Grid2Op package can be used to perform many different tasks, this set of notebooks will focus on the machine learning part and its usage in a reinforcement learning framework.
Reinforcement learning is a framework for training an "agent" to solve some task in a time-dependent domain. We tried to cast the grid operation planning task into this framework, and the Grid2Op package was inspired by it.
In reinforcement learning (RL), there are 2 distinct entities:
These 2 entities exchange 3 main types of information:
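The exchange between these two entities can be sketched as a tiny interaction loop. The classes below are toy stubs of ours (not the grid2op API), written only to show how observations, actions, and rewards flow back and forth.

```python
# Toy agent/environment loop (our own stub, NOT the grid2op API),
# illustrating the observation / action / reward exchange.

class ToyEnv:
    """Environment: emits observations and rewards, ends after 3 steps."""
    def reset(self):
        self.t = 0
        return self.t                      # initial observation

    def step(self, action):
        self.t += 1
        reward = 1.0 if action == "do_nothing" else 0.0
        done = self.t >= 3
        return self.t, reward, done        # observation, reward, done flag

class ToyAgent:
    """Agent: maps an observation to an action."""
    def act(self, observation):
        return "do_nothing"

env, agent = ToyEnv(), ToyAgent()
obs, done, total_reward = env.reset(), False, 0.0
while not done:
    action = agent.act(obs)                # agent -> environment: action
    obs, reward, done = env.step(action)   # environment -> agent: obs, reward
    total_reward += reward
print(total_reward)  # 3.0
```

In grid2op, the observation describes the state of the powergrid, the action encodes (among other things) the switch positions discussed earlier, and the reward measures how well the grid is operated.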
A schematic representation of this is shown in the figure below (Credit: Sutton & Barto):
For more information about the problem, please visit the Example_5bus notebook, which dives deeper into modeling real-time grid operation planning as an RL problem. Note that this notebook is still under development at the moment.
Good material is also provided in the white paper Reinforcement Learning for Electricity Network Operation, presented for the L2RPN 2020 NeurIPS edition.