The carbon dioxide record from Mauna Loa Observatory, known as the “Keeling Curve,” is the world’s longest unbroken record of atmospheric carbon dioxide concentrations. Scientists make atmospheric measurements in remote locations to sample air that is representative of a large volume of Earth’s atmosphere and relatively free from local influences.
The data in this notebook was collected at the Mauna Loa Observatory (MLO) and comes from two sources: NOAA and UC San Diego. The NOAA dataset goes back only to 1974, while the UCSD dataset goes back to 1958, when continuous CO2 measurements at the observatory began. This notebook combines the two datasets and looks at the trends over the years.
%%HTML
<iframe width="640" height="360" src="https://www.youtube.com/embed/1ZQG59_z83I" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
%%HTML
<iframe width="640" height="360" src="https://www.youtube.com/embed/x1SgmFa0r04" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
from datetime import datetime
import altair as alt
import pandas as pd
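# Load the UCSD monthly record and shorten the CO2 column name
ucsd_co2_data = pd.read_csv('data_acda/ucsd_co2_data.csv').rename(columns={'Carbon Dioxide (ppm)': 'CO2 (ppm)'})
# Build a first-of-month timestamp from the Year and Month columns
ucsd_co2_data['Date'] = pd.to_datetime(ucsd_co2_data['Year'].astype(str) + ' ' + ucsd_co2_data['Month'].astype(str), format='%Y %m')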
ucsd_co2_data
| | Year | Month | Decimal Date | CO2 (ppm) | Seasonally Adjusted CO2 (ppm) | Carbon Dioxide Fit (ppm) | Seasonally Adjusted CO2 Fit (ppm) | Date |
---|---|---|---|---|---|---|---|---|
0 | 1958 | 1 | 1958.0411 | NaN | NaN | NaN | NaN | 1958-01-01 |
1 | 1958 | 2 | 1958.1260 | NaN | NaN | NaN | NaN | 1958-02-01 |
2 | 1958 | 3 | 1958.2027 | 315.69 | 314.42 | 316.18 | 314.89 | 1958-03-01 |
3 | 1958 | 4 | 1958.2877 | 317.45 | 315.15 | 317.30 | 314.98 | 1958-04-01 |
4 | 1958 | 5 | 1958.3699 | 317.50 | 314.73 | 317.83 | 315.06 | 1958-05-01 |
... | ... | ... | ... | ... | ... | ... | ... | ... |
715 | 2017 | 8 | 2017.6219 | NaN | NaN | NaN | NaN | 2017-08-01 |
716 | 2017 | 9 | 2017.7068 | NaN | NaN | NaN | NaN | 2017-09-01 |
717 | 2017 | 10 | 2017.7890 | NaN | NaN | NaN | NaN | 2017-10-01 |
718 | 2017 | 11 | 2017.8740 | NaN | NaN | NaN | NaN | 2017-11-01 |
719 | 2017 | 12 | 2017.9562 | NaN | NaN | NaN | NaN | 2017-12-01 |
720 rows × 8 columns
alt.Chart(ucsd_co2_data).mark_line().encode(
    x=alt.X('Date', type='temporal'),
    y=alt.Y('CO2 (ppm)', type='quantitative', scale=alt.Scale(zero=False)),
    color=alt.value('blue')
).properties(title="MLO Carbon Dioxide in PPM over Time (UCSD Data)", width=700).interactive()
The datasets from NOAA are text files that need to be processed into DataFrames. Here is an excerpt from the dataset:
%%bash
tail -n +48 data_acda/co2_weekly_mlo.txt | head -n 5
# Start of week CO2 molfrac (-999.99 = no data) increase
# (yr, mon, day, decimal) (ppm) #days 1 yr ago 10 yr ago since 1800
1974 5 19 1974.3795 333.34 6 -999.99 -999.99 50.36
1974 5 26 1974.3986 332.95 6 -999.99 -999.99 50.06
1974 6 2 1974.4178 332.32 5 -999.99 -999.99 49.57
# Column-wise storage for the parsed weekly NOAA records
mlo_co2_data = {
    'date': [], 'year': [], 'month': [], 'day': [],
    'decimal date': [], 'CO2 (ppm)': [], '#days': []
}
with open('data_acda/co2_weekly_mlo.txt', 'r') as file:
    # Skip the header block. Note that this slice begins at the 1974-05-26 row;
    # the 1974-05-19 row in the excerpt above sits on file line 50 (index 49).
    raw_data = file.readlines()[50:]
for row in raw_data:
    data = row.split()
    # -999.99 marks a week with no CO2 measurement, so skip it
    if data[4] == '-999.99':
        continue
    # Store numbers as numbers so the chart's quantitative axis gets real values
    mlo_co2_data['year'].append(int(data[0]))
    mlo_co2_data['month'].append(int(data[1]))
    mlo_co2_data['day'].append(int(data[2]))
    mlo_co2_data['decimal date'].append(float(data[3]))
    mlo_co2_data['CO2 (ppm)'].append(float(data[4]))
    mlo_co2_data['#days'].append(int(data[5]))
    date = datetime(year=int(data[0]), month=int(data[1]), day=int(data[2]))
    mlo_co2_data['date'].append(date)
# Rows without a measurement were already skipped in the loop above
mlo_co2_data = pd.DataFrame(mlo_co2_data)
mlo_co2_data
| | date | year | month | day | decimal date | CO2 (ppm) | #days |
---|---|---|---|---|---|---|---|
0 | 1974-05-26 | 1974 | 5 | 26 | 1974.3986 | 332.95 | 6 |
1 | 1974-06-02 | 1974 | 6 | 2 | 1974.4178 | 332.32 | 5 |
2 | 1974-06-09 | 1974 | 6 | 9 | 1974.4370 | 332.18 | 7 |
3 | 1974-06-16 | 1974 | 6 | 16 | 1974.4562 | 332.37 | 7 |
4 | 1974-06-23 | 1974 | 6 | 23 | 1974.4753 | 331.59 | 6 |
... | ... | ... | ... | ... | ... | ... | ... |
2405 | 2020-11-15 | 2020 | 11 | 15 | 2020.8730 | 412.53 | 6 |
2406 | 2020-11-22 | 2020 | 11 | 22 | 2020.8921 | 413.84 | 6 |
2407 | 2020-11-29 | 2020 | 11 | 29 | 2020.9112 | 413.76 | 7 |
2408 | 2020-12-06 | 2020 | 12 | 6 | 2020.9303 | 413.39 | 7 |
2409 | 2020-12-13 | 2020 | 12 | 13 | 2020.9495 | 413.92 | 7 |
2410 rows × 7 columns
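The line-by-line parsing above could probably also be expressed with `pd.read_csv`. The following is a minimal sketch, not the notebook's own method: it runs on the five-line excerpt shown earlier rather than the full file, and the names `excerpt`, `columns`, and `noaa_sketch`, as well as the labels chosen for the last three columns, are assumptions made for illustration.

```python
import io

import pandas as pd

# The five lines shown in the excerpt above, used as a stand-in for the file
excerpt = """\
# Start of week CO2 molfrac (-999.99 = no data) increase
# (yr, mon, day, decimal) (ppm) #days 1 yr ago 10 yr ago since 1800
1974 5 19 1974.3795 333.34 6 -999.99 -999.99 50.36
1974 5 26 1974.3986 332.95 6 -999.99 -999.99 50.06
1974 6 2 1974.4178 332.32 5 -999.99 -999.99 49.57
"""

# Hypothetical column labels based on the header comment in the excerpt
columns = ['year', 'month', 'day', 'decimal date', 'CO2 (ppm)', '#days',
           '1 yr ago', '10 yr ago', 'increase since 1800']

noaa_sketch = pd.read_csv(
    io.StringIO(excerpt),
    sep=r'\s+',             # fields are separated by runs of whitespace
    comment='#',            # drop the commented header lines
    header=None,
    names=columns,
    na_values=['-999.99'],  # the file's "no data" sentinel becomes NaN
)

# Assemble a timestamp from the year, month, and day columns
noaa_sketch['date'] = pd.to_datetime(noaa_sketch[['year', 'month', 'day']])

# Keep only weeks that actually have a CO2 measurement
noaa_sketch = noaa_sketch.dropna(subset=['CO2 (ppm)'])
```

With the real file, `io.StringIO(excerpt)` would be replaced by the file path. Skipping comment lines instead of a fixed number of lines avoids having to count header rows by hand.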
alt.Chart(mlo_co2_data).mark_line().encode(
    x=alt.X('date', type='temporal'),
    y=alt.Y('CO2 (ppm)', type='quantitative', scale=alt.Scale(zero=False)),
    color=alt.value('green')
).properties(title='MLO Carbon Dioxide in PPM over Time (NOAA Data)', width=700).interactive()
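The introduction says the notebook combines the UCSD and NOAA records. One possible way to do that, shown here as a minimal sketch rather than the notebook's actual method, is to average the weekly NOAA values to monthly means and use them wherever the UCSD series has no value. The few rows below are copied from the tables above; the names `ucsd`, `noaa`, `noaa_monthly`, `ucsd_monthly`, and `combined` are hypothetical.

```python
import pandas as pd

# A few monthly UCSD rows copied from the UCSD table above
ucsd = pd.DataFrame({
    'Date': pd.to_datetime(['1958-03-01', '1958-04-01', '1958-05-01']),
    'CO2 (ppm)': [315.69, 317.45, 317.50],
})

# A few weekly NOAA rows copied from the NOAA table above
noaa = pd.DataFrame({
    'date': pd.to_datetime(['1974-05-26', '1974-06-02', '1974-06-09',
                            '1974-06-16', '1974-06-23']),
    'CO2 (ppm)': [332.95, 332.32, 332.18, 332.37, 331.59],
})

# Average the weekly NOAA values per month, labelled by the first of the month
noaa_monthly = noaa.set_index('date')['CO2 (ppm)'].resample('MS').mean()
ucsd_monthly = ucsd.set_index('Date')['CO2 (ppm)']

# Prefer UCSD values; fall back to NOAA where UCSD is missing or NaN
combined = ucsd_monthly.combine_first(noaa_monthly).rename('CO2 (ppm)')
combined.index.name = 'Date'
```

Which source takes priority in overlapping months is a design choice; swapping the two series in `combine_first` would prefer NOAA instead.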
# Column-wise storage for the parsed global NOAA trend records
global_co2_data = {'date': [], 'year': [], 'month': [], 'day': [], 'cycle': [], 'trend': []}
with open('data_acda/co2_trend_gl.txt', 'r') as file:
    # Skip the header block at the top of the file
    raw_data = file.readlines()[60:]
for row in raw_data:
    data = row.split()
    # Convert the fields to numbers so the chart gets numeric values
    year, month, day = int(data[0]), int(data[1]), int(data[2])
    cycle, trend = float(data[3]), float(data[4])
    global_co2_data['year'].append(year)
    global_co2_data['month'].append(month)
    global_co2_data['day'].append(day)
    global_co2_data['cycle'].append(cycle)
    global_co2_data['trend'].append(trend)
    date = datetime(year=year, month=month, day=day)
    # The date is kept as a string, as in the table output
    global_co2_data['date'].append(str(date))
global_co2_data = pd.DataFrame(global_co2_data)
global_co2_data
| | date | year | month | day | cycle | trend |
---|---|---|---|---|---|---|
0 | 2010-01-01 00:00:00 | 2010 | 1 | 1 | 388.28 | 387.23 |
1 | 2010-01-02 00:00:00 | 2010 | 1 | 2 | 388.30 | 387.24 |
2 | 2010-01-03 00:00:00 | 2010 | 1 | 3 | 388.32 | 387.25 |
3 | 2010-01-04 00:00:00 | 2010 | 1 | 4 | 388.34 | 387.25 |
4 | 2010-01-05 00:00:00 | 2010 | 1 | 5 | 388.36 | 387.26 |
... | ... | ... | ... | ... | ... | ... |
3685 | 2020-02-03 00:00:00 | 2020 | 2 | 3 | 413.52 | 411.77 |
3686 | 2020-02-04 00:00:00 | 2020 | 2 | 4 | 413.54 | 411.78 |
3687 | 2020-02-05 00:00:00 | 2020 | 2 | 5 | 413.55 | 411.78 |
3688 | 2020-02-06 00:00:00 | 2020 | 2 | 6 | 413.57 | 411.79 |
3689 | 2020-02-07 00:00:00 | 2020 | 2 | 7 | 413.58 | 411.80 |
3690 rows × 6 columns
alt.Chart(global_co2_data).mark_line().encode(
    x=alt.X('date', type='temporal'),
    y=alt.Y('cycle', type='quantitative', scale=alt.Scale(zero=False)),
    color=alt.value('red')
).properties(title="Global Carbon Dioxide Trends in PPM over Time (NOAA Data)", width=700).interactive()
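The chart above plots only the `cycle` column. As a rough sketch, and assuming that `cycle` is the smoothed value including the seasonal swing while `trend` is the same series with the seasonal swing removed (the notebook does not state this), the seasonal component could be estimated as the difference between the two. The rows are the first three from the table above; `sample` and `seasonal` are hypothetical names.

```python
import pandas as pd

# First three rows copied from the global table above
sample = pd.DataFrame({
    'date': pd.to_datetime(['2010-01-01', '2010-01-02', '2010-01-03']),
    'cycle': [388.28, 388.30, 388.32],
    'trend': [387.23, 387.24, 387.25],
})

# Assumption: cycle = trend + seasonal component, so subtract to isolate it
sample['seasonal'] = (sample['cycle'] - sample['trend']).round(2)
```

Applied to the full `global_co2_data` frame, the same subtraction would give a column that could be charted alongside `trend`.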