National generation capacity: Check notebook
This Notebook is part of the National Generation Capacity Datapackage of Open Power System Data.

1. Introductory notes

The notebook extends the processing notebook to make visualisations and perform consistency checks.

2. Script setup

In [ ]:
import pandas as pd
import numpy as np
import os.path
import logging

from bokeh.charts import Bar, output_file, show
from bokeh.io import output_notebook
from bokeh.models import HoverTool, NumeralTickFormatter
from bokeh.charts.attributes import color

output_notebook()

%matplotlib inline

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    datefmt='%d %b %Y %H:%M:%S'
    )

logger = logging.getLogger()

3. Import of processed data

The processed data is imported at this stage. This requires that the processing notebook is compiled prior to this step. Otherwise, no or an old dataset is imported.

In [ ]:
data_file = 'national_generation_capacity_stacked.csv'
filepath = os.path.join('output', data_file)
data = pd.read_csv(filepath, index_col=0)

data.head()

4. Visualisation of results for different energy source levels

In the following, national generation capacity is compared for the different energy source levels. Due to the size of the dataset regarding countries and compiled datasources, the following tables and figures tend to be rather confusing. Therefore, we provide the following option to limit the visualisation of the results to a selection of countries and years. If the corresponding subset is empty, all values will be considered.

In [ ]:
country_subset = ['DE', 'FR', 'BE', 'NL', 'IT']
year_subset = [2013, 2014, 2015, 2016]
In [ ]:
data_selection = pd.DataFrame()

if len(country_subset) == 0:
    data_selection = data
else:
    for country in country_subset:
        for year in year_subset:
            if len(data_selection) == 0:
                data_selection = data[(data.country == country) &
                                      (data.year == year)]
            else:
                data_selection = data_selection.append(data[(data.country == country) &
                                                            (data.year == year)])

data_selection

To improve the data visualisation in Bokeh the colors of the defined technologies is specified explicitly. The user is free to adjust or refine the color definition using the following parameter. The color names are defined here.

In [ ]:
colormap = {
    'Fossil fuels': 'Black',
    'Lignite': 'SaddleBrown',
    'Hard coal': 'Black',
    'Oil': 'Violet',
    'Natural gas': 'IndianRed',
    'Combined cycle': '#d57676',
    'Gas turbine': '#e19d9d',
    'Other and unknown natural gas': '#c33c3c',
    'Differently categorized natural gas': 'IndianRed',
    'Non-renewable waste': 'SandyBrown',
    'Mixed fossil fuels': 'LightGray',
    'Other fossil fuels': 'DarkGray',
    'Differently categorized fossil fuels': 'Gray',
    'Nuclear': 'Red',
    'Renewable energy sources': 'Green',
    'Hydro': 'Navy',
    'Run-of-river': '#0000b3',
    'Reservoir': '#0000e6',
    'Reservoir including pumped storage': '#0000e6',
    'Pumped storage': '#1a1aff',
    'Pumped storage with natural inflow': '#1a1aff',
    'Differently categorized hydro': 'Navy',
    'Wind': 'SkyBlue',
    'Onshore': 'LightSkyBlue',
    'Offshore': 'DeepSkyBlue',
    'Differently categorized wind': 'SkyBlue',
    'Solar': 'Yellow',
    'Photovoltaics': '#ffff33',
    'Concentrated solar power': '#ffff66',
    'Differently categorized solar': 'Yellow',
    'Geothermal': 'DarkRed',
    'Marine': 'Blue',
    'Bioenergy and renewable waste': 'Green',
    'Biomass and biogas': '#00b300',
    'Sewage and landfill gas': '#00e600',
    'Other bioenergy and renewable waste': 'Green',
    'Differently categorized renewable energy sources': 'Green',
    'Other or unspecified energy sources': 'Orange',
}

4.1 Energy source level 1

4.1.1 Table

In [ ]:
pivot_capacity_level1 = pd.pivot_table(data_selection[data_selection.energy_source_level_1 == True],
                                       index=('country','year','source'),
                                       columns='technology',
                                       values='capacity',
                                       aggfunc=sum,
                                       margins=False)

pivot_capacity_level1

4.1.2 Bokeh chart

Please use the zoom and hover option to inspect the data graphically.

In [ ]:
data_energy_level_1 = data_selection[data_selection.energy_source_level_1 == True].copy()
data_energy_level_1['color'] = 'White'
data_energy_level_1['color'] = data_energy_level_1['technology'].map(colormap)

bar = Bar(data_energy_level_1, values='capacity', label=['country', 'year', 'source'], stack='technology',
          title="National capacity by type of energy source", tools="pan,wheel_zoom,box_zoom,reset,hover,save",
          legend='top_right', 
          plot_width=1600, plot_height=800,
#          color=color(columns='technology', palette=['Black', 'Red', 'Green', 'Orange'], sort=False))
          color='color')
bar._yaxis.formatter = NumeralTickFormatter(format="00,000 MW")

hover = bar.select_one(HoverTool)
hover.point_policy = "follow_mouse"
hover.tooltips = [("Country", "@country"),
                  ("Year", "@year"),
                  ("Source", "@source"),
                  ("Category", "@technology"),
                  ("Capacity", "@height{00,000.00} MW"),
]

show(bar)

4.2 Energy source level 2

4.2.1 Table

In [ ]:
pivot_capacity_level2 = pd.pivot_table(data_selection[data_selection.energy_source_level_2 == True],
                                       index=('country','year','source'),
                                       columns='technology',
                                       values='capacity',
                                       aggfunc=sum,
                                       margins=False)

pivot_capacity_level2

4.2.2 Bokeh chart

Please use the zoom and hover option to inspect the data graphically.

In [ ]:
data_energy_level_2 = data_selection[data_selection.energy_source_level_2 == True].copy()
data_energy_level_2['color'] = 'White'
data_energy_level_2['color'] = data_energy_level_2['technology'].map(colormap)

bar = Bar(data_energy_level_2, values='capacity', label=['country', 'year', 'source'], stack='technology',
          title="National capacity by energy source", tools="pan,wheel_zoom,box_zoom,reset,hover,save",
          legend='top_right', 
          plot_width=1600, plot_height=800,
          color='color'
         )
bar._yaxis.formatter = NumeralTickFormatter(format="00,000 MW")

hover = bar.select_one(HoverTool)
hover.point_policy = "follow_mouse"
hover.tooltips = [("Country", "@country"),
                  ("Year", "@year"),
                  ("Source", "@source"),
                  ("Category", "@technology"),
                  ("Capacity", "@height{00,000.00} MW"),
]

show(bar)

4.3 Energy source level 3

4.3.1 Table

In [ ]:
pivot_capacity_level3 = pd.pivot_table(data_selection[data_selection.energy_source_level_3 == True],
                                       index=('country', 'year', 'source'),
                                       columns='technology',
                                       values='capacity',
                                       aggfunc=sum,
                                       margins=False)

pivot_capacity_level3

4.3.2 Bokeh chart

Please use the zoom and hover option to inspect the data graphically.

In [ ]:
data_energy_level_3 = data_selection[data_selection.energy_source_level_3 == True].copy()
data_energy_level_3['color'] = 'White'
data_energy_level_3['color'] = data_energy_level_3['technology'].map(colormap)

bar = Bar(data_energy_level_3, values='capacity', label=['country', 'year', 'source'], stack='technology',
          title="National capacity by energy source", tools="pan,wheel_zoom,box_zoom,reset,hover,save",
#          legend='top_right', 
          plot_width=1600, plot_height=800,
          color='color'
         )
bar._yaxis.formatter = NumeralTickFormatter(format="00,000 MW")

hover = bar.select_one(HoverTool)
hover.point_policy = "follow_mouse"
hover.tooltips = [("Country", "@country"),
                  ("Year", "@year"),
                  ("Source", "@source"),
                  ("Category", "@technology"),
                  ("Capacity", "@height{00,000.00} MW"),
]

show(bar)

4.4 Technology level

4.4.1 Table

In [ ]:
pivot_capacity_techlevel = pd.pivot_table(data_selection[data_selection.technology_level == True],
                                          index=('country', 'year', 'source'),
                                          columns='technology',
                                          values='capacity',
                                          aggfunc=sum,
                                          margins=False)

pivot_capacity_techlevel

4.4.2 Bokeh chart

Please use the zoom and hover option to inspect the data graphically.

In [ ]:
data_technology_level = data_selection[data_selection.technology_level == True].copy()
data_technology_level['color'] = 'White'
data_technology_level['color'] = data_technology_level['technology'].map(colormap)

bar = Bar(data_technology_level, values='capacity', label=['country', 'year', 'source'], stack='technology',
          title="National capacity by energy source and technology", tools="pan,wheel_zoom,box_zoom,reset,hover,save",
#          legend='top_right', 
          plot_width=1600, plot_height=800,
          color='color'
         )
bar._yaxis.formatter = NumeralTickFormatter(format="00,000 MW")

hover = bar.select_one(HoverTool)
hover.point_policy = "follow_mouse"
hover.tooltips = [("Country", "@country"),
                  ("Year", "@year"),
                  ("Source", "@source"),
                  ("Category", "@technology"),
                  ("Capacity", "@height{00,000.00} MW"),
]

show(bar)

5. Comparison of total capacity for energy source levels

In the following, the installed capacities at the different technology levels are compared to each other. In any case, the total sum of all technologies within a certain technology level should match with other energy source levels. Otherwise the classification of categories to the levels is flawed or the specific data entries are wrong.

Again, the comparison can be done for specific countries, or, if the selection is empty, for all countries.

In [ ]:
country_subset = []
#country_subset = ['DE', 'FR', 'IT', 'ES']
In [ ]:
data_selection = pd.DataFrame()

if len(country_subset) == 0:
    data_selection = data
else:
    for country in country_subset:
        if len(data_selection) == 0:
            data_selection = data[data.country == country]
        else:
            data_selection = data_selection.append(data[data.country == country])

#data_selection

5.1 Calculation of total capacity for energy source levels

In [ ]:
# Define the columns for grouping
groupby_selection = ['capacity_definition', 'source', 'year', 'type', 'country']

# Calculate the total capacity of all categories within a certain technology level
capacity_total_0 = pd.DataFrame(data_selection[data_selection['energy_source_level_0'] == True]
                                .groupby(groupby_selection)['capacity'].sum())
capacity_total_1 = pd.DataFrame(data_selection[data_selection['energy_source_level_1'] == True]
                                .groupby(groupby_selection)['capacity'].sum())
capacity_total_2 = pd.DataFrame(data_selection[data_selection['energy_source_level_2'] == True]
                                .groupby(groupby_selection)['capacity'].sum())
capacity_total_3 = pd.DataFrame(data_selection[data_selection['energy_source_level_3'] == True]
                                .groupby(groupby_selection)['capacity'].sum())
capacity_total_tech = pd.DataFrame(data_selection[data_selection['technology_level'] == True]
                                   .groupby(groupby_selection)['capacity'].sum())

# Merge calculated capacity for different technology levels
capacity_total_comparison = pd.DataFrame(capacity_total_0)
capacity_total_comparison = pd.merge(capacity_total_0, capacity_total_1, 
                                     left_index=True, right_index=True, how='left')
capacity_total_comparison = capacity_total_comparison.rename(columns={'capacity_x': 'energy source level 0',
                                                                      'capacity_y': 'energy source level 1'})

capacity_total_comparison = pd.merge(capacity_total_comparison, capacity_total_2, 
                                     left_index=True, right_index=True, how='left')
capacity_total_comparison = pd.merge(capacity_total_comparison, capacity_total_3, 
                                     left_index=True, right_index=True, how='left')
capacity_total_comparison = capacity_total_comparison.rename(columns={'capacity_x': 'energy source level 2',
                                                                      'capacity_y': 'energy source level 3'})

capacity_total_comparison = pd.merge(capacity_total_comparison, capacity_total_tech, 
                                     left_index=True, right_index=True, how='left')
capacity_total_comparison = capacity_total_comparison.rename(columns={'capacity': 'technology level'})

# Define sorting preferences
capacity_total_comparison = capacity_total_comparison.sortlevel(['country', 'year'])

capacity_total_comparison

5.2 Identifcation of capacity differences for energy source levels

Identification of differences between energy source levels for each country, source, and year. The difference is relative to the previous energy source level. Generally, differences between the energy source levels should be zero, but could differ in particular for ENTSO-E data.

In [ ]:
capacity_total_difference = capacity_total_comparison.diff(periods=1, axis=1)
capacity_total_difference = capacity_total_difference[(capacity_total_difference['energy source level 1'] > 0.01) | 
                                                      (capacity_total_difference['energy source level 1'] < -0.01) |
                                                     (capacity_total_difference['energy source level 2'] > 0.01) | 
                                                      (capacity_total_difference['energy source level 2'] < -0.01) |
                                                     (capacity_total_difference['energy source level 3'] > 0.01) | 
                                                      (capacity_total_difference['energy source level 3'] < -0.01)|
                                                     (capacity_total_difference['technology level'] > 0.01) | 
                                                      (capacity_total_difference['technology level'] < -0.01)]
capacity_total_difference