All data in pandapower is stored in pandas dataframes, which is why all functionalities that are implemented in pandas (http://pandas.pydata.org/) can be used with pandapower. pandas is a very powerful and widely used tool for data analysis.
In the following we will present just a few of the many possibilities how builtin pandas functions can make your life easier when working in pandapower.
This tutorial refers to pandas functionalities and how to use them with pandapower. If you have not worked with pandas before, this should give you a good overview of the relevant functionality. If you are already familiar with pandas, it might still refresh your memory, but maybe not provide new information.
We use the mv_oberrhein example from the pandapower networks package for this tutorial.
#Import the pandapower and the networks module:
import pandapower as pp
import pandapower.networks as nw
import pandas as pd
#Import an example network:
net = nw.mv_oberrhein()
You can accesss all parameters of one element with the .loc command:
#Detailed information about line 4
net.line.loc[4]
name Line 4 std_type NA2XS2Y 1x185 RM/25 12/20 kV from_bus 239 to_bus 236 length_km 0.381968 r_ohm_per_km 0.161 x_ohm_per_km 0.117 c_nf_per_km 273.0 g_us_per_km 0.0 max_i_ka 0.362 df 1.0 parallel 1 type cs in_service True Name: 4, dtype: object
Or one parameter of multiple elements:
#Length information about line 5 and 6
net.line.length_km.loc[[5,6]]
5 3.3028 6 0.3504 Name: length_km, dtype: float64
The values are returned as a pandas series. To get them in an array instead, use the values function:
#Length information about line 5 and 6 as numpy array
net.line.length_km.loc[[5,6]].values
array([3.3028, 0.3504])
Multiple parameters of multiple elements are returned as a pandas dataframe:
#Length and type information about line 5 and 6
net.line.loc[[5,6], ["length_km", "r_ohm_per_km", "x_ohm_per_km", "c_nf_per_km"]]
length_km | r_ohm_per_km | x_ohm_per_km | c_nf_per_km | |
---|---|---|---|---|
5 | 3.3028 | 0.161 | 0.117 | 273.0 |
6 | 0.3504 | 0.161 | 0.117 | 273.0 |
The values function in this case yields a multi-dimensional array:
#Length and type information about line 5 and 6 as an array
net.line.loc[[5,6], ["length_km", "r_ohm_per_km", "x_ohm_per_km", "c_nf_per_km"]].values
array([[3.3028e+00, 1.6100e-01, 1.1700e-01, 2.7300e+02], [3.5040e-01, 1.6100e-01, 1.1700e-01, 2.7300e+02]])
To access exactly one parameter, .at can be used instead of .loc:
net.line.length_km.at[6]
0.3504
This .at solution can only be used for exactly one parameter, but is siginifcantly faster than using .loc. That is why you should always use .at if possible!
Setting parameters works the same way:
#Set single parameter with .at
net.line.length_km.at[5] = 3.2
#or
net.line.at[5, "length_km"] = 3.2
#Set multiple parameters with .loc
net.line.length_km.loc[[4,6]] = [1.8, 2.2]
#Output
print("Line lengths after parameter setting:")
net.line.length_km.loc[[4,5,6]]
Line lengths after parameter setting:
4 1.8 5 3.2 6 2.2 Name: length_km, dtype: float64
You can easily find maximum, minimum values of a column with pandas, e.g.
print("Maximum Line Length: %.2f km"%net.line.length_km.max())
print("Minimum Line Length: %.2f km"%net.line.length_km.min())
print("Mean Line Length: %.2f km"%net.line.length_km.mean())
Maximum Line Length: 3.20 km Minimum Line Length: 0.07 km Mean Line Length: 0.62 km
Of course you can also combine these, for example to calculate weighted means:
weighted_mean_r = (net.line.length_km * net.line.r_ohm_per_km).sum() / net.line.length_km.sum()
print("Weighted Mean Resistance: %.2f Ohm per kilometer"%weighted_mean_r)
Weighted Mean Resistance: 0.16 Ohm per kilometer
If you want to iterate over a dataframe, use the iterrows function. It gives you the index of each row and the row as a pandas series.
for lidx, linerow in net.line.loc[[4,5,6]].iterrows():
print("the line with index %s is %f kilometers long and has a resistance of %f ohm per kilometers" %(lidx, linerow.length_km, linerow.r_ohm_per_km))
the line with index 4 is 1.800000 kilometers long and has a resistance of 0.161000 ohm per kilometers the line with index 5 is 3.200000 kilometers long and has a resistance of 0.161000 ohm per kilometers the line with index 6 is 2.200000 kilometers long and has a resistance of 0.161000 ohm per kilometers
You can also group elements that share the same value in any column of the dataframe with the groubpy function of pandas;
for std_type, linetable in net.line.groupby(net.line.std_type):
print("there are %u lines with standard type %s with an overall length of %f kilometers"%(len(linetable), std_type, linetable.length_km.sum()))
there are 10 lines with standard type 243-AL1/39-ST1A 20.0 with an overall length of 10.761400 kilometers there are 160 lines with standard type NA2XS2Y 1x185 RM/25 12/20 kV with an overall length of 96.806343 kilometers there are 11 lines with standard type NA2XS2Y 1x240 RM/25 12/20 kV with an overall length of 4.343042 kilometers
The first variable is now the value of the column you grouped by (here line standard type) and the second variable is a pandas dataframe of all lines that have this value (here all lines with the standard type defined in the first variable).
You can also directly sum up certain values of a groupby, for example
net.line.groupby(net.line.std_type).sum().length_km
std_type 243-AL1/39-ST1A 20.0 10.761400 NA2XS2Y 1x185 RM/25 12/20 kV 96.806343 NA2XS2Y 1x240 RM/25 12/20 kV 4.343042 Name: length_km, dtype: float64
gives you a pandas of series with the sum of all line length grouped by standard type.
You will often need to select elements with specific characteristics. You can do that with boolean masks:
mask = net.line.length_km > 2.5
the mask variable is now a boolean time series that indicates for every line if it longer than 2500 meters or not. We can now select only the ones which are longer than 2500 meters and output their name and length:
long_lines = net.line[mask]
#Output
long_lines[["name", "length_km", "std_type"]]
name | length_km | std_type | |
---|---|---|---|
5 | Line 5 | 3.200000 | NA2XS2Y 1x185 RM/25 12/20 kV |
45 | Line 45 | 2.611111 | NA2XS2Y 1x185 RM/25 12/20 kV |
59 | Line 59 | 3.155390 | NA2XS2Y 1x185 RM/25 12/20 kV |
91 | Line 91 | 2.528200 | NA2XS2Y 1x185 RM/25 12/20 kV |
162 | Line 162 | 2.595300 | 243-AL1/39-ST1A 20.0 |
or, directly in one step:
net.line[net.line.length_km > 2.5][["name", "length_km", "std_type"]]
name | length_km | std_type | |
---|---|---|---|
5 | Line 5 | 3.200000 | NA2XS2Y 1x185 RM/25 12/20 kV |
45 | Line 45 | 2.611111 | NA2XS2Y 1x185 RM/25 12/20 kV |
59 | Line 59 | 3.155390 | NA2XS2Y 1x185 RM/25 12/20 kV |
91 | Line 91 | 2.528200 | NA2XS2Y 1x185 RM/25 12/20 kV |
162 | Line 162 | 2.595300 | 243-AL1/39-ST1A 20.0 |
You can of course also check for exact values, for example if you want all 110 kV buses:
net.bus[net.bus.vn_kv == 110]
name | vn_kv | type | zone | in_service | |
---|---|---|---|---|---|
58 | Bus 38 | 110.0 | n | None | True |
318 | Bus 177 | 110.0 | b | None | True |
For string querys, use the str.contains function, for example to find loads with "MV" in the name:
net.load[net.load.name.str.startswith("MV")]
name | bus | p_mw | q_mvar | const_z_percent | const_i_percent | sn_mva | scaling | in_service | type | |
---|---|---|---|---|---|---|---|---|---|---|
141 | MV Load 0 | 215 | 0.5 | 0.101529 | 0.0 | 0.0 | NaN | 0.6 | True | MV Load |
142 | MV Load 1 | 6 | 0.5 | 0.101529 | 0.0 | 0.0 | NaN | 0.6 | True | MV Load |
143 | MV Load 2 | 48 | 0.5 | 0.101529 | 0.0 | 0.0 | NaN | 0.6 | True | MV Load |
144 | MV Load 3 | 52 | 0.5 | 0.101529 | 0.0 | 0.0 | NaN | 0.6 | True | MV Load |
145 | MV Load 4 | 55 | 0.5 | 0.101529 | 0.0 | 0.0 | NaN | 0.6 | True | MV Load |
146 | MV Load 5 | 235 | 0.5 | 0.101529 | 0.0 | 0.0 | NaN | 0.6 | True | MV Load |
or to find all all loads that contain "Load 0" in their name:
net.load[net.load.name.str.contains("Load 0")]
name | bus | p_mw | q_mvar | const_z_percent | const_i_percent | sn_mva | scaling | in_service | type | |
---|---|---|---|---|---|---|---|---|---|---|
0 | LV Load 0 | 103 | 0.25 | 0.050765 | 0.0 | 0.0 | NaN | 0.6 | True | MV/LV Station |
141 | MV Load 0 | 215 | 0.50 | 0.101529 | 0.0 | 0.0 | NaN | 0.6 | True | MV Load |
The isin function allows you to check if certain values are contained in a list, for example:
lines = net.line[net.line.from_bus.isin([39,72])]
lines[["name", "from_bus", "to_bus"]]
name | from_bus | to_bus | |
---|---|---|---|
165 | Line 165 | 39 | 86 |
returns the names of all lines that are connected to one of the buses 39 or 72 at the from bus.
You can also combine certain queries with the logical | (or) and & (and) for more complicated queries, for example:
lines = net.line[(net.line.from_bus.isin([39,72])) | (net.line.to_bus.isin([39,72]))]
lines[["name", "from_bus", "to_bus"]]
name | from_bus | to_bus | |
---|---|---|---|
36 | Line 36 | 289 | 72 |
54 | Line 54 | 30 | 72 |
162 | Line 162 | 80 | 39 |
165 | Line 165 | 39 | 86 |
returns the names of all lines that are connected to one of the buses 39 or 72 at the from bus or to bus.
Please be aware that the boolean query always returns a copy of the dataframe. Lets say you want to set all medium voltage nodes out of service and try this:
net.bus[net.bus.vn_kv > 25].in_service = False
You will notice that your pandapower network table did not change:
net.bus.in_service.head()
0 True 1 True 2 True 3 True 4 True Name: in_service, dtype: bool
This is because the query returns a copy and you change the value in this copy in the same way like the following code:
bus_table = net.bus[net.bus.vn_kv < 25]
bus_table.in_service = False
#Output
print(bus_table.in_service.head())
# but still:
print(net.bus.in_service.head())
0 False 1 False 2 False 3 False 4 False Name: in_service, dtype: bool 0 True 1 True 2 True 3 True 4 True Name: in_service, dtype: bool
Here bus_table is a copy of a part of the net.bus table. In the example above, you changed that copy, but not the original table. That is why you need to save the index of the copied bus table and than change the value of the original table:
#Get index and store in ns_nodes
ns_nodes = net.bus[net.bus.vn_kv < 25].index
#Change all values where index is in ns_nodes
net.bus.in_service.loc[ns_nodes] = False
#Output
print(net.bus.in_service.head())
0 False 1 False 2 False 3 False 4 False Name: in_service, dtype: bool
One of the benefits of pandapower is its easy to customize and extend datastructure. For pandapower to be able to run a loadflow, the standard parameters of the datastructure have to be defined. However, you can extend the pandapower datastructure any way you like.
Say you have network that is partitioned in three zones A, B and C:
nr_buses = len(net.bus.index)
net.bus.loc[net.bus.index[:nr_buses], "zone"] = "A"
net.bus.loc[net.bus.index[nr_buses:], "zone"] = "B"
net.bus.zone.head()
0 A 1 A 2 A 3 A 4 A Name: zone, dtype: object
The goal is to apply a load scaling factor of 0.8 in Zone A and of 0.6 in Zone B:
for i, load in net.load.iterrows():
if net.bus.zone.at[load.bus] == "A":
net.load.scaling.at[i] = 0.8
elif net.bus.zone.at[load.bus] == "B":
net.load.scaling.at[i] = 0.6
net.load.head(n=5)
name | bus | p_mw | q_mvar | const_z_percent | const_i_percent | sn_mva | scaling | in_service | type | |
---|---|---|---|---|---|---|---|---|---|---|
0 | LV Load 0 | 103 | 0.25 | 0.050765 | 0.0 | 0.0 | NaN | 0.8 | True | MV/LV Station |
1 | LV Load 1 | 174 | 0.63 | 0.127927 | 0.0 | 0.0 | NaN | 0.8 | True | MV/LV Station |
2 | LV Load 2 | 194 | 0.25 | 0.050765 | 0.0 | 0.0 | NaN | 0.8 | True | MV/LV Station |
3 | LV Load 3 | 34 | 0.25 | 0.050765 | 0.0 | 0.0 | NaN | 0.8 | True | MV/LV Station |
4 | LV Load 4 | 76 | 0.25 | 0.050765 | 0.0 | 0.0 | NaN | 0.8 | True | MV/LV Station |
However, you could also extend the line table to include the zone of each line like this:
net.scaling = 1.0 #reset
net.load["zone"] = net.bus.zone.loc[net.load.bus.values].values
net.load.head(n=5)
name | bus | p_mw | q_mvar | const_z_percent | const_i_percent | sn_mva | scaling | in_service | type | zone | |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | LV Load 0 | 103 | 0.25 | 0.050765 | 0.0 | 0.0 | NaN | 0.8 | True | MV/LV Station | A |
1 | LV Load 1 | 174 | 0.63 | 0.127927 | 0.0 | 0.0 | NaN | 0.8 | True | MV/LV Station | A |
2 | LV Load 2 | 194 | 0.25 | 0.050765 | 0.0 | 0.0 | NaN | 0.8 | True | MV/LV Station | A |
3 | LV Load 3 | 34 | 0.25 | 0.050765 | 0.0 | 0.0 | NaN | 0.8 | True | MV/LV Station | A |
4 | LV Load 4 | 76 | 0.25 | 0.050765 | 0.0 | 0.0 | NaN | 0.8 | True | MV/LV Station | A |
and then apply your function only to the lines in Zone A:
net.load.loc[net.load[net.load.zone=="A"].index, "scaling"] = 0.8
net.load.loc[net.load[net.load.zone=="B"].index, "scaling"] = 0.6
net.load.head(n=5)
name | bus | p_mw | q_mvar | const_z_percent | const_i_percent | sn_mva | scaling | in_service | type | zone | |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | LV Load 0 | 103 | 0.25 | 0.050765 | 0.0 | 0.0 | NaN | 0.8 | True | MV/LV Station | A |
1 | LV Load 1 | 174 | 0.63 | 0.127927 | 0.0 | 0.0 | NaN | 0.8 | True | MV/LV Station | A |
2 | LV Load 2 | 194 | 0.25 | 0.050765 | 0.0 | 0.0 | NaN | 0.8 | True | MV/LV Station | A |
3 | LV Load 3 | 34 | 0.25 | 0.050765 | 0.0 | 0.0 | NaN | 0.8 | True | MV/LV Station | A |
4 | LV Load 4 | 76 | 0.25 | 0.050765 | 0.0 | 0.0 | NaN | 0.8 | True | MV/LV Station | A |
Of course for this simple example, the first solution would also be feasible. But the more complicated your code gets, the more important it is to customize the pandapower framework, so your code is more efficient as well as easier to read and debug.
Lets say you want to see the voltage at each bus together with the bus name. The voltage is stored in net.res_bus, whereas the name ist stored in net.bus. The two values are in different tables, but they have the same index, which is why you can easily merge them into a new shared dataframe:
net = nw.mv_oberrhein()
pp.runpp(net) #Run a load flow calculation to get result tables
#Store bus name with bus voltage result of loadflow calculation in bus_results
#axis={horizontal=0, vertical=1}: axis to concatenate along
bus_results = pd.concat([net.bus.name, net.res_bus.vm_pu], axis=1)
#Output
bus_results.head(n=8)
name | vm_pu | |
---|---|---|
0 | Bus 0 | 1.009473 |
1 | Bus 1 | 1.009354 |
2 | Bus 2 | 1.010306 |
3 | Bus 3 | 1.008611 |
4 | Bus 4 | 1.010021 |
5 | Bus 5 | 1.015559 |
6 | Bus 6 | 1.023551 |
7 | Bus 7 | 1.015590 |
Now bus results shows you the name and the voltage of each bus in one dataframe.
If you want to see the bus voltage of all loads, you can also merge the bus and the load dataframe. They do not share an index, but rather the index of the bus is given in the “bus” column of the load table. In that case you can merge the dataframes like this: (for more information about the arguments see [pandas merging documentation] [pandas_merg]) [pandas_merg]: http://pandas.pydata.org/pandas-docs/stable/merging.html
load_bus_results = pd.merge(net.res_bus, net.load, left_index=True, right_on="bus")
#Output
load_bus_results.head(n=8)
vm_pu | va_degree | p_mw_x | q_mvar_x | name | bus | p_mw_y | q_mvar_y | const_z_percent | const_i_percent | sn_mva | scaling | in_service | type | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
67 | 1.009473 | -6.778921 | 0.24 | 0.048734 | LV Load 67 | 0 | 0.40 | 0.081223 | 0.0 | 0.0 | NaN | 0.6 | True | MV/LV Station |
17 | 1.009354 | -6.783526 | 0.15 | 0.030459 | LV Load 17 | 1 | 0.25 | 0.050765 | 0.0 | 0.0 | NaN | 0.6 | True | MV/LV Station |
128 | 1.010306 | -6.753059 | 0.15 | 0.030459 | LV Load 128 | 2 | 0.25 | 0.050765 | 0.0 | 0.0 | NaN | 0.6 | True | MV/LV Station |
73 | 1.008611 | -6.807492 | 0.15 | 0.030459 | LV Load 73 | 3 | 0.25 | 0.050765 | 0.0 | 0.0 | NaN | 0.6 | True | MV/LV Station |
53 | 1.010021 | -6.761493 | 0.15 | 0.030459 | LV Load 53 | 4 | 0.25 | 0.050765 | 0.0 | 0.0 | NaN | 0.6 | True | MV/LV Station |
61 | 1.015559 | -6.321147 | 0.15 | 0.030459 | LV Load 61 | 5 | 0.25 | 0.050765 | 0.0 | 0.0 | NaN | 0.6 | True | MV/LV Station |
142 | 1.023551 | -5.478506 | 0.30 | 0.060918 | MV Load 1 | 6 | 0.50 | 0.101529 | 0.0 | 0.0 | NaN | 0.6 | True | MV Load |
101 | 1.008905 | -6.797072 | 0.24 | 0.048734 | LV Load 101 | 8 | 0.40 | 0.081223 | 0.0 | 0.0 | NaN | 0.6 | True | MV/LV Station |
Here net.res_bus is the left and net.load is the right dataframe. To merge the dataframe into one, the index of the left dataframe (left_index=True) and the bus column of the right dataframe (right_on=”bus”) are used. The resulting dataframe shows you the load table with the bus results of the respective load buses.
If you want to know the maximum voltage at a bus that has a load connected to it, you could use:
max_load_voltage = pd.merge(net.res_bus, net.load, left_index=True, right_on="bus").vm_pu.max()
#Output
max_load_voltage
1.0235513513298844
or alternatively with a boolean mask:
max_load_voltage = net.res_bus[net.bus.index.isin(net.load.bus.values)].vm_pu.max()
#Output
max_load_voltage
1.0235513513298844
In this simple case, the boolean mask solution is about 5 times faster than merging the dataframes, so only use merging for more complicated operations or if runtime is not important (e.g. debugging).