In this notebook, we will plot the average magnetic field around the solar poles to determine when, exactly, the Sun's polar field flips. We will use data from the Helioseismic and Magnetic Imager (HMI) instrument on NASA's Solar Dynamics Observatory (SDO) satellite, which produces about 1.5 terabytes of data a day, more than any other satellite in NASA's Heliophysics Division. And we'll plot these data using Altair, a Python library for interactive data visualization built on top of Vega-Lite, which in turn builds on Vega and D3.
import numpy as np
import pandas as pd
import drms
import altair as alt
First, we'll use drms, a Python library that fetches SDO data from the JSOC database via a JSON API. The drms series hmi.meanpf_720s contains two keywords, CAPN2 and CAPS2, which describe the mean radial component of the magnetic field in the northern and southern polar regions (above 60 degrees latitude), respectively.
To start using drms, first establish a connection to JSOC:
c = drms.Client()
Then query for the most recent timestamp, or T_REC:
keys = c.query('hmi.meanpf_720s[$]', key='T_REC')
t_last = keys.T_REC[0]
Now gather all the keyword values for CAPN2 and CAPS2 for every 12-hour interval between SDO launch and the most recent T_REC:
keys = c.query('hmi.meanpf_720s[2010.05.01_00_TAI-'+t_last+'@12h][? QUALITY=0 ?]', key='T_REC,CAPN2,CAPS2')
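The record-set string packs several pieces of JSOC query syntax together. As a sketch, the same string can be assembled from named parts to make each piece explicit (the helper name meanpf_query is ours; t_last comes from the earlier query):

```python
def meanpf_query(t_last):
    # Sketch: assemble the JSOC record-set string from its parts.
    series  = 'hmi.meanpf_720s'
    start   = '2010.05.01_00_TAI'   # start of the time range
    cadence = '@12h'                # one record every 12 hours
    quality = '[? QUALITY=0 ?]'     # keep only records with QUALITY == 0
    return series + '[' + start + '-' + t_last + cadence + ']' + quality
```

Calling meanpf_query(t_last) reproduces the query string used above.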
Now we'll convert keys.CAPN2 and keys.CAPS2 from their native format, a pandas Series, to numpy arrays:
capn2 = np.array(keys.CAPN2)
caps2 = np.array(keys.CAPS2)
The error in the value of CAPN2 and CAPS2 at any point in time is defined as the standard deviation of that quantity over a 30-day window (beginning 15 days before the time of interest and ending 15 days after it):
err_capn2 = np.zeros(keys.index.stop)
err_caps2 = np.zeros(keys.index.stop)
for i in range(keys.index.stop):
    # For i < 30 the slice start is negative, so the window is empty
    # and np.std returns NaN; the first 30 points have no error estimate.
    err_capn2[i] = np.std(capn2[i-30:i+30])
    err_caps2[i] = np.std(caps2[i-30:i+30])
Now we'll smooth the data by taking the running average of CAPN2 and CAPS2 over the same 30-day window:
avg_capn2 = np.zeros(keys.index.stop)
avg_caps2 = np.zeros(keys.index.stop)
for i in range(keys.index.stop):
    # As above, the first 30 windows are empty and yield NaN.
    avg_capn2[i] = np.mean(capn2[i-30:i+30])
    avg_caps2[i] = np.mean(caps2[i-30:i+30])
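As an aside, the same centered statistics can be computed without explicit loops using pandas' rolling windows. This is only a sketch (the function name rolling_stats is ours), and the edge handling differs slightly from the loops above: min_periods=1 fills the first points using shorter windows instead of NaN.

```python
import pandas as pd

def rolling_stats(values, window=60):
    # Centered rolling mean and standard deviation; a window of 60 samples
    # corresponds to 30 days at the 12-hour cadence.  min_periods=1 fills
    # the edges with shorter windows instead of NaN.
    roll = pd.Series(values).rolling(window=window, center=True, min_periods=1)
    return roll.mean().to_numpy(), roll.std(ddof=0).to_numpy()  # ddof=0 matches np.std
```

Then avg_capn2_alt, err_capn2_alt = rolling_stats(capn2) would mirror the loop results away from the edges.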
And convert these T_REC values into pandas Timestamps:
def parse_tai_string(tstr):
    # T_REC strings take the form 'YYYY.MM.DD_HH:MM:SS_TAI'
    year = int(tstr[:4])
    month = int(tstr[5:7])
    day = int(tstr[8:10])
    hour = int(tstr[11:13])
    minute = int(tstr[14:16])
    return pd.Timestamp(year=year, month=month, day=day, hour=hour, minute=minute)
x = np.array([parse_tai_string(keys.T_REC[i]) for i in range(keys.index.stop)])
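As a hedged alternative, pandas can parse these strings directly with a format string once the trailing '_TAI' suffix (the last four characters) is dropped; the helper name parse_tai_string_pd is ours:

```python
import pandas as pd

def parse_tai_string_pd(tstr):
    # Drop the '_TAI' suffix and let pandas parse the rest.
    return pd.to_datetime(tstr[:-4], format='%Y.%m.%d_%H:%M:%S')
```

For example, parse_tai_string_pd('2010.05.01_00:00:00_TAI') returns the same Timestamp as the helper above.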
Altair expects all the data to be in a pandas DataFrame, so we'll construct one to hold everything:
data = {'time': list(x), 'Average CAPN2': list(avg_capn2), 'CAPN2 plus error': list(avg_capn2+err_capn2),
'CAPN2 minus error': list(avg_capn2-err_capn2), 'Average CAPS2': list(avg_caps2),
'CAPS2 plus error': list(avg_caps2+err_caps2), 'CAPS2 minus error': list(avg_caps2-err_caps2)}
df = pd.DataFrame(data, index=np.arange(keys.index.stop))
df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 7095 entries, 0 to 7094
Data columns (total 7 columns):
 #   Column             Non-Null Count  Dtype
---  ------             --------------  -----
 0   time               7095 non-null   datetime64[ns]
 1   Average CAPN2      7065 non-null   float64
 2   CAPN2 plus error   7065 non-null   float64
 3   CAPN2 minus error  7065 non-null   float64
 4   Average CAPS2      7065 non-null   float64
 5   CAPS2 plus error   7065 non-null   float64
 6   CAPS2 minus error  7065 non-null   float64
dtypes: datetime64[ns](1), float64(6)
memory usage: 443.4 KB
Altair caps charts at 5,000 rows by default, since the interactive elements can get slow with more. Because we have more points than that, we'll disable this default maximum:
alt.data_transformers.disable_max_rows()
DataTransformerRegistry.enable('default')
Now for some interactive plotting! Hover over any of the shaded regions to activate a tooltip that displays detailed information about that data point. You can also zoom in and out of the plot for more detail, or drag it along the axes. (Note: if you're using JupyterLab to view this notebook, the interactive elements will show up by default. If you're using Jupyter Notebook, you must install ipyvega to see the interactive elements in the notebook.)
# Create the base plot
base = alt.Chart(df).encode(
alt.X('time', axis=alt.Axis(title='Time', grid=False))
).properties(
title='Mean Polar Field',
width=800,
height=300
)
# Plot CAPN2
line = base.mark_line(color='#6394ED').encode(
    y='Average CAPN2'
)
# Plot CAPS2
line2 = base.mark_line(color='#FF4500').encode(
    y='Average CAPS2'
)
# Plot the error bars for CAPN2
area = base.mark_area(opacity=0.3, color='#6394ED').encode(
alt.Y('CAPN2 plus error', axis=alt.Axis(title='Mean Radial Field Strength (Gauss)', grid=False)),
alt.Y2('CAPN2 minus error'),
tooltip=['Average CAPN2', 'time']
).interactive()
# Plot the error bars for CAPS2
area2 = base.mark_area(opacity=0.3, color='#FF4500').encode(
alt.Y('CAPS2 plus error', axis=alt.Axis(title='Mean Radial Field Strength (Gauss)')),
alt.Y2('CAPS2 minus error'),
tooltip=['Average CAPS2','time']
).interactive()
# Create a bunch of information for the legend
source2 = pd.DataFrame([{"Location": "North Pole", "Colors": "#6394ED"}, {"Location": "South Pole", "Colors": "#FF4500"}])
range_=['#6394ED', '#FF4500']
legend = alt.Chart(source2).mark_line().encode(
color=alt.Color('Location:N', scale=alt.Scale(range=range_, clamp=True),
legend=alt.Legend(title="", symbolSize=400, symbolStrokeWidth=4))
)
# Bind everything together
alt.layer(line, line2, area, area2, legend).resolve_scale(
    y='shared'
).configure_view(
    strokeWidth=0
)
To see a daily-updated and interactive version of this plot, check out the HMI Polar Field site. To create your own HTML file, embed the JavaScript from the 'View Source' option under the three dots to the right of the plot.