#!/usr/bin/env python
# coding: utf-8

# Time series: Processing Notebook
#
# This Notebook is part of the Time series Data Package of Open Power System Data.

#
# # 1. About Open Power System Data
# This notebook is part of the project [Open Power System Data](http://open-power-system-data.org). Open Power System Data develops a platform for free and open data for electricity system modeling. We collect, check, process, document, and provide data that are publicly available but currently inconvenient to use.
# More info on Open Power System Data:
# - [Information on the project on our website](http://open-power-system-data.org)
# - [Data and metadata on our data platform](http://data.open-power-system-data.org)
# - [Data processing scripts on our GitHub page](https://github.com/Open-Power-System-Data)

# # 2. About Jupyter Notebooks and GitHub
# This file is a [Jupyter Notebook](http://jupyter.org/). A Jupyter Notebook is a file that combines executable programming code with visualizations and comments in markdown format, allowing for an intuitive documentation of the code. We use Jupyter Notebooks for combined coding and documentation. We use Python 3 as the programming language. All Notebooks are stored on [GitHub](https://github.com/), a platform for software development, and are publicly available. More information on our IT concept can be found [here](http://open-power-system-data.org/it). See also our [step-by-step manual](http://open-power-system-data.org/step-by-step) on how to use the data platform.

# # 3. About this datapackage
# We provide data in different chunks, or [data packages](http://frictionlessdata.io/data-packages/). The one you are looking at right now, [Time series](http://data.open-power-system-data.org/time_series/), contains various kinds of time series data:
# - Electricity consumption (load): forecast and actual values
# - Wind and solar power: generation forecast, actual generation, installed capacity, capacity factors (profiles)
# - Day-ahead spot prices
#
# The resolution in which the data is published depends on the "market time unit" applied in the respective jurisdiction as well as on the type of data.
# For most data types, the following mapping applies:
#
# - 15 minutes: Austria, Belgium, Germany, Hungary, Luxembourg, Netherlands
# - 30 minutes: Cyprus, Ireland, United Kingdom
# - 60 minutes: All other European countries
#
# For data that are originally available in 15 or 30 minutes resolution, hourly averages are included with the 60 minutes dataset. The original resolution data are provided in a separate file.
# The time series become available at different points in time depending on the sources. The full dataset is only available from 2015 onwards.

# ## What is "Total load"?
#
# There are two sources for load data: [ENTSO-E Power Statistics](https://www.entsoe.eu/data/power-stats) (PS) and the [ENTSO-E Transparency Platform](https://transparency.entsoe.eu/load-domain/r2/totalLoadR2/show?&biddingZone.values=CTY|10Y1001A1001A83F!CTY|10Y1001A1001A83F) (TP). Both report "total load", which is defined as follows:
#
# $$\text{total load} = \text{total generation} - \text{auxiliary/self-consumption in power plants} + \text{imports} - \text{exports} - \text{consumption by storages}$$
#
# Figure: Load schema
#
# Source: [ENTSO-E Guidelines for Monthly Statistics Data Collection](https://docstore.entsoe.eu/Documents/Publications/Statistics/MS_guidelines2016.pdf#page=6) / [Transparency Platform Detailed Data Descriptions](https://www.entsoe.eu/fileadmin/user_upload/_library/resources/Transparency/MoP%20Ref02%20-%20EMFIP-Detailed%20Data%20Descriptions%20V1R4-2014-02-24.pdf#page=11).
#
# The two sources differ: values on the PS (~500 TWh annually in Germany) are usually slightly higher than on the TP (~490 TWh).
# The reason probably lies with different reporting deadlines: values on the TP have to be reported
# ["no later than one hour after the end of the operating period"](https://transparency.entsoe.eu/content/static_content/Static%20content/knowledge%20base/data-views/load-domain/Data-view%20Total%20Load%20-%20Day%20Ahead%20-%20Actual.html).
# For the PS, the data is published with a delay of up to 3 months, which might allow for more accurate metering. For a comparison of the two sources see [Hirth et al. (2018)](https://doi.org/10.1016/j.apenergy.2018.04.048).
#
# For some countries, the PS report a ["representativity factor"](https://docstore.entsoe.eu/Documents/Publications/Statistics/Specific_national_considerations.pdf#page=6) (91% in Germany until 2014, 97% since then), indicating that the reported values would have to be scaled up by this factor, resulting in ~520 TWh annually in Germany.
#
# [Schuhmacher & Hirth (2015)](http://services.feem.it/userfiles/attach/20151191122284NDL2015-088.pdf#page=11) compare the German hourly total load values to monthly and yearly aggregate consumption statistics for Germany, showing considerable differences, part of which may be explained by the fact that none of the ENTSO-E data cover industrial autogeneration that is not transported over the transmission grid.

# # 4. Data sources
# The main data sources are the various European Transmission System Operators (TSOs), the [ENTSO-E Power Statistics](https://www.entsoe.eu/data/power-stats) and the [ENTSO-E Transparency Platform](https://transparency.entsoe.eu). A complete list of data sources is provided on the [datapackage information website](http://data.open-power-system-data.org/time_series/). They are also listed in the JSON file that contains all metadata.

# # 5. Naming conventions
#
# The table headers specify each data column according to three categories: region, variable and attribute. **region** specifies the geographical scope according to the [ISO 3166 codes](https://en.wikipedia.org/wiki/ISO_3166). **variable** distinguishes consumption, generation and prices. **attribute** gives further properties of the data that are specific to the respective **variable**. See the table below for the set of possible combinations.
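# To illustrate the three header categories described above, here is a minimal sketch that models them as a pandas column MultiIndex. The entries `load`, `solar`, `actual` and `generation`, the variable names and all numeric values are hypothetical and chosen only for illustration; the actual set of combinations is the one listed in `input/notation.csv`, and the data package itself may encode its headers differently.

```python
import pandas as pd

# Hypothetical header entries (region, variable, attribute); not taken from notation.csv.
columns = pd.MultiIndex.from_tuples(
    [
        ("DE", "load", "actual"),
        ("DE", "solar", "generation"),
        ("FR", "load", "actual"),
    ],
    names=["region", "variable", "attribute"],
)

# Two hourly timestamps and made-up values, only there to give the frame a shape.
index = pd.date_range("2015-01-01 00:00", periods=2, freq="60min", tz="UTC")
data = pd.DataFrame(
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], index=index, columns=columns
)

# Select every column that belongs to one region ...
german = data.xs("DE", axis=1, level="region")
# ... or one variable across all regions.
load = data.xs("load", axis=1, level="variable")
```

# Keeping the categories as separate header levels lets a selection by region or by variable be expressed directly, without parsing column names.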
# In[1]:


import pandas as pd

# Display the notation table; the first four columns of the CSV form the row index.
pd.read_csv('input/notation.csv', index_col=list(range(4)))


# # 6. License
# This notebook, as well as all other documents in this repository, is published under the [MIT License](LICENSE.md).