datiaperti logo

Air Quality Forecasts

The CAMS service provides the Air Quality Forecasts for Europe. The data product is based on an ensemble of the outputs of nine Earth System models that provides the forecasts for many polluttants such as NOx, Ozone, particulate matter (PM2.5 and PM10) and SO2 with a spatial resolution of 10 km. In this test we want to retrieve the forecasts for NO2 over the Italian peninsula at ground level every hour for the next 96 hours (lead times) and visualize the forecasts as a sequence of frames in an animation. More information about the origin of NO2 and its relevance on the air quality is available at the EUMETSAT website

In [2]:
import xarray as xr
import pandas as pd
import numpy as np
import cdsapi
from platform import python_version
print("python version: %s"%python_version())
print("pandas version: %s"%pd.__version__)
print("xarray version: %s"%xr.__version__)
python version: 3.8.2
pandas version: 1.0.2
xarray version: 0.14.0

The Copernicus Atmosphere Monitoring Service

The Copernicus Atmosphere Monitoring Service (CAMS) provides analysis and forecasts related to air quality, atmospheric composition, greenhouse gases, solar irradiance. The datasets released by the CAMS service is the result of assimilation processes in which observations from satellites and ground stations are used to update a numerical model. The CAMS is operated by the European Centre for Medium-Range Weather Forecasts (ECMWF) on behalf of the European Commission.

The European air quality forecasts

The CAMS provides its datasets as open data, available to all for free, through a web page and a web service. For the air quality forecasts a user can select, among other options

  • the variables, that is the physical parameters we are interested in
  • the model we want to use, nine models are available plus the ensemble
  • the levels, or heights for which we want the forecasts
  • the area of interest, delimited by north and south latitudes and west and east longitudes
  • the date from which the forecast should start
  • the lead time hours, that is the hours of the forecasts
  • the format of the data (CSV, NetCDF)

In this notebook we use the web service through its API. In order to use the API, you have to be registered into the CADS and

  1. login
  2. copy your ADS API key in the .condarc file in your home folder
  3. install the cdsapi Python package

For more information follow the how-to instructions.

We set some parameter before submitting our request, in particular the area of interest, the variable (NO2), the lead time hours.

In [11]:
path = 'data/air_quality_forecasts/20201206'
file_name = 'air_quality_forecasts_europe_20201206.nc'
date = '2020-12-06'
area_north = 47.12
area_south = 36.40
area_west = 6.57
area_east = 18.52
In [6]:
#URL = "https://ads.atmosphere.copernicus.eu/api/v2"
#KEY = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
#c = cdsapi.Client(url=URL, key=KEY)

c = cdsapi.Client()

c.retrieve(
    'cams-europe-air-quality-forecasts',
    {
        'model': 'ensemble',
        'date': date + '/' + date,
        'format': 'netcdf',
        'variable': 'nitrogen_dioxide',
        'level': '0',
        'type': 'forecast',
        'time': '00:00',
        'leadtime_hour': [
            '0', '1', '10',
            '11', '12', '13',
            '14', '15', '16',
            '17', '18', '19',
            '2', '20', '21',
            '22', '23', '24',
            '25', '26', '27',
            '28', '29', '3',
            '30', '31', '32',
            '33', '34', '35',
            '36', '37', '38',
            '39', '4', '40',
            '41', '42', '43',
            '44', '45', '46',
            '47', '48', '49',
            '5', '50', '51',
            '52', '53', '54',
            '55', '56', '57',
            '58', '59', '6',
            '60', '61', '62',
            '63', '64', '65',
            '66', '67', '68',
            '69', '7', '70',
            '71', '72', '73',
            '74', '75', '76',
            '77', '78', '79',
            '8', '80', '81',
            '82', '83', '84',
            '85', '86', '87',
            '88', '89', '9',
            '90', '91', '92',
            '93', '94', '95',
            '96',
        ],
        'area': [
            area_north, area_west, area_south,
            area_east,
        ],
    },
    path + '/' + file_name)
2020-12-07 18:34:24,352 INFO Welcome to the CDS
2020-12-07 18:34:24,355 INFO Sending request to https://ads.atmosphere.copernicus.eu/api/v2/resources/cams-europe-air-quality-forecasts
2020-12-07 18:34:24,455 INFO Request is queued
2020-12-07 18:34:25,526 INFO Request is running
2020-12-07 18:34:45,696 INFO Request is completed
2020-12-07 18:34:45,699 INFO Downloading http://136.156.133.82/cache-compute-0003/cache/data5/adaptor.cams_regional_fc.retrieve-1607358882.5547416-18349-14-9c9ef5d2-8d48-4010-bc41-a46920a88854.nc to data/air_quality_forecasts/20201203/air_quality_forecasts_europe_20201203.nc (4.7M)
2020-12-07 18:34:48,604 INFO Download rate 1.6M/s   
Out[6]:
Result(content_length=4942928,content_type=application/x-netcdf,location=http://136.156.133.82/cache-compute-0003/cache/data5/adaptor.cams_regional_fc.retrieve-1607358882.5547416-18349-14-9c9ef5d2-8d48-4010-bc41-a46920a88854.nc)

If we have requested the forecasts starting from only day, let's say the day in which we send the request, the webservice will send a netcdf file with the forecasts for all the lead times we have specified. If we want the forecasts starting from more days the webservice will send a zip file that contains one netCDF file for each start day. In our test we will request the forecasts starting from one single day, the same day in which we send the request.

In [8]:
#import os
#import zipfile
#with zipfile.ZipFile('data/air_quality_forecasts_europe_20201203.zip', 'r') as zip_ref:
#    zip_ref.extractall(path)

#nc_files = os.listdir(path)
#nc_files

Let's open the file to get the NO2 forecasts

In [9]:
def getforecast(path, file_name):
    # The forecasts for NO2 are in the 'no2_conc' variable. 
    # The function puts the data in memory, closes the file
    # and returns the NO2 forecast
    ds = xr.open_dataset(path + '/' + file_name)
    no2_forecast = ds['no2_conc']
    ds.close()
    return no2_forecast
In [54]:
no2_forecasts = getforecast(path, file_name)
no2_forecasts
Out[54]:
<xarray.DataArray 'no2_conc' (time: 97, level: 1, latitude: 107, longitude: 119)>
[1235101 values with dtype=float32]
Coordinates:
  * longitude  (longitude) float32 6.65 6.75 6.85 6.95 ... 18.25 18.35 18.45
  * latitude   (latitude) float32 47.05 46.95 46.85 46.75 ... 36.65 36.55 36.45
  * level      (level) float32 0.0
  * time       (time) timedelta64[ns] 00:00:00 01:00:00 ... 4 days 00:00:00
Attributes:
    species:        Nitrogen Dioxide
    units:          µg/m3
    value:          hourly values
    standard_name:  mass_concentration_of_nitrogen_dioxide_in_air

As we can see from the description, the time interval of each forecast is provided as a delta in nanoseconds (ns), from the first lead time to the next. Since we have requested a forecast every hour the delta is 3600 * 10^9, or 3600000000000, nanoseconds. We will create a date time index to be used later on in the plots

In [55]:
delta_time = 3600000000000 # delta between one lead time and the next one
num_forecasts = len(no2_forecasts)
start_day = pd.to_datetime(date)
date_index = start_day + pd.to_timedelta(np.arange(num_forecasts), 'H')
date_index.size
Out[55]:
97

We plot the forecast for one lead time hour to check that everything works fine and then we'll create an animation using all the forecasts.

In [56]:
import matplotlib
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import cartopy
print("matplotlib version: %s"%matplotlib.__version__)
print("cartopy version: %s"%cartopy.__version__)
matplotlib version: 3.3.1
cartopy version: 0.18.0

We create the figure that we will use to plot the data

In [57]:
def create_figure(): 
    fig = plt.figure(figsize=(20,10))
    ax = plt.axes(projection=ccrs.PlateCarree())
    ax.add_feature(cartopy.feature.OCEAN)
    ax.add_feature(cartopy.feature.LAND, edgecolor='black')
    ax.coastlines()
    return fig, ax

Let's plot the first forecast.

In [51]:
_, ax = create_figure()
lead_time = 96
first_forecast = no2_forecasts.sel(level = 0.0, time = delta_time * lead_time)
first_forecast.plot.pcolormesh(ax=ax, 
                                x='longitude', 
                                y='latitude', 
                                add_colorbar=True, cmap='viridis')
plt.title('Air Quality Forecast - {0:s}'.format(date_index[lead_time - 1].strftime('%Y-%m-%d %H:%M:%S')))
Out[51]:
Text(0.5, 1.0, 'Air Quality Forecast - 2020-12-09 23:00:00')

Now we create the animation using all the forecasts at ground level. The animation will contain a frame for each forecast (or lead time)

In [52]:
import matplotlib.animation as animation
from IPython.display import HTML, display

forecasts = no2_forecasts.sel(level = 0.0)

def draw(frame, add_colorbar):    
    forecast_frame = forecasts.sel(time = delta_time * frame)
    plot = forecast_frame.plot.pcolormesh(ax=ax, x='longitude', y='latitude', add_colorbar=add_colorbar, cmap='viridis')
    title = 'Air Quality Forecast - {0:s}'.format(date_index[frame - 1].strftime('%Y-%m-%d %H:%M:%S'))
    ax.set_title(title)
    return plot

def init():
    return draw(1, add_colorbar=True)

def animate(frame):
    plt.close()
    return draw(frame + 1, add_colorbar=False)

fig, ax = create_figure()


ani = animation.FuncAnimation(fig, animate, frames=96, interval=500, blit=False, init_func=init, repeat=False)

HTML(ani.to_jshtml())
2020-12-07 19:28:26,883 INFO Animation.save using <class 'matplotlib.animation.HTMLWriter'>
Out[52]: