mlcourse.ai – Open Machine Learning Course

Author: Irina Knyazeva, ODS Slack nickname: iknyazeva

Solar flares forecasting (ML approach)

Research plan

 - Dataset and features description
 - Exploratory data analysis
 - Visual analysis of the features
 - Patterns, insights, peculiarities of data
 - Data preprocessing
 - Feature engineering and description
 - Cross-validation, hyperparameter tuning
 - Validation and learning curves
 - Prediction for hold-out and test samples
 - Model evaluation with metrics description
 - Conclusions

Part 1. What are solar flares and why do we need to forecast them?

The Sun produces solar flares, great bursts of electromagnetic energy and particles that can affect the Earth and the near-Earth environment. These flares can blow out transformers on power grids and disrupt satellite systems. Predicting such events in order to minimize their negative impact is a long-standing problem, and a difficult one because the events are rare. Forecast quality has not changed significantly over the last 60 years. This was actually the topic of my Ph.D. research, which I did without any machine learning. Even in the era of big data there has been no big success in this task. The most common approach is described in the paper by Bobra et al., 2014. Its main drawback is that it ignores the time dependence of the features. Here I try to apply what is known about working with time series to the features. All data for this project can be downloaded from the [link]

Picture of the sun

Below is a picture of our Sun in one of its spectral lines, $H_\alpha$ (the most beautiful one). You can see bright regions on the Sun's surface. These are called solar active regions, and in most cases solar flares originate from such regions. There is a great site (https://solarmonitor.org/) that aggregates information about the Sun. Let's look at the Sun and its active regions.

In [1]:
import matplotlib.image as mpimg
import wget
import os
import warnings
warnings.filterwarnings("ignore")
import matplotlib.pyplot as plt
%matplotlib inline
# let's look at the annotated image from solarmonitor
file_url = 'https://solarmonitor.org/data/2014/05/14/pngs/bbso/bbso_halph_fd_20140514_053834.png'
DOWNLOAD = True
IMG_PATH = '../../img/'
file_name = file_url.split(sep='/')[-1]

if DOWNLOAD:
    file_name = wget.download(file_url, out = os.path.join(IMG_PATH, file_name))
    img=mpimg.imread(file_name)
else:    
    img=mpimg.imread(os.path.join(IMG_PATH, file_name))
plt.figure(figsize = (12,12))
imgplot = plt.imshow(img)

Active regions

Active regions are called active because of the strength of the magnetic field in these regions. The strength of the magnetic field can be obtained from so-called solar magnetograms. A magnetogram of the full solar disk for the same day is shown in the next picture.

In [2]:
# let's look at the annotated image from solarmonitor
file_url = 'https://solarmonitor.org/data/2014/05/14/pngs/shmi/shmi_maglc_fd_20140514_224622.png'
DOWNLOAD = True
IMG_PATH = '../../img/'
file_name = file_url.split(sep='/')[-1]

if DOWNLOAD:
    file_name = wget.download(file_url, out = os.path.join(IMG_PATH, file_name))
    img=mpimg.imread(file_name)
else:    
    img=mpimg.imread(os.path.join(IMG_PATH, file_name))
plt.figure(figsize = (12,12))
imgplot = plt.imshow(img)
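
The two image cells above repeat the same download-or-read logic. A minimal sketch of how it could be factored into one helper is given below. The name cached_download and its download parameter are hypothetical (they are not part of the notebook), and the standard library's urllib.request.urlretrieve is used here instead of wget.

```python
import os
import urllib.request


def cached_download(url, out_dir, download=urllib.request.urlretrieve):
    """Return a local path for `url`, downloading it only when the file is missing.

    `download` is any callable taking (url, path); it is a parameter so the
    helper can be exercised without network access.
    """
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, url.split("/")[-1])
    if not os.path.isfile(path):
        download(url, path)
    return path


# Possible usage in the cells above (not executed here, needs network access):
# img = mpimg.imread(cached_download(file_url, IMG_PATH))
```

With such a helper the DOWNLOAD flag is no longer needed: the file is fetched on the first run and read from disk afterwards.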

A solar flare occurs when magnetic energy that has built up in the solar atmosphere is suddenly released. Solar flares occur frequently when the Sun is active, in the years around solar maximum. Many solar flares can occur in a single day during this period! Around solar minimum, solar flares might occur less than once per week.

The classification of solar flares

Solar flares are classified as A (smallest), B, C, M or X (strongest) according to the peak flux (in watts per square metre, W/m²) of 1 to 8 Ångström X-rays near Earth, as measured by the XRS instrument on board the GOES-15 satellite, which is in a geostationary orbit over the Pacific Ocean. Some (mostly stronger) solar flares can launch huge clouds of solar plasma into space, which we call coronal mass ejections. When a coronal mass ejection arrives at Earth, it can cause a geomagnetic storm and intense auroral displays.
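
To make the scale concrete: each class letter corresponds to a factor of 10 in peak flux, and the number after the letter is a multiplier, so M2.0 means 2.0 × 1e-5 W/m². The sketch below illustrates this mapping in plain Python. The names GOES_BASE_FLUX, goes_class_to_flux and is_major_flare are hypothetical helpers written for this illustration; later in the notebook the conversion is done with the flareclass_to_flux function from sunpy.

```python
# Peak flux (W/m^2) that corresponds to "x1.0" for each GOES class letter.
GOES_BASE_FLUX = {"A": 1e-8, "B": 1e-7, "C": 1e-6, "M": 1e-5, "X": 1e-4}


def goes_class_to_flux(goes_class):
    """Convert a GOES class string such as 'M2.0' to peak flux in W/m^2."""
    letter = goes_class[0].upper()
    multiplier = float(goes_class[1:])
    return GOES_BASE_FLUX[letter] * multiplier


def is_major_flare(goes_class, threshold="M1.0"):
    """True when the flare is at or above the threshold class."""
    return goes_class_to_flux(goes_class) >= goes_class_to_flux(threshold)


for cls in ["C1.0", "C6.1", "M1.0", "M2.0", "X1.5"]:
    print(cls, goes_class_to_flux(cls), is_major_flare(cls))
```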

Solar flares prediction

Most techniques are based on the complexity of the photospheric magnetic field of the Sun's active regions. A large number of different characteristics can be used to describe magnetic field complexity. Because many empirical assumptions go into their calculation, they are hard to reproduce. For HMI/SDO vector magnetograms there is an automated active region tracking system called Spaceweather HMI Active Region Patch (SHARP). For each active region, key features called SHARP parameters are calculated and made available online. These features are computed from SDO vector magnetograms.

One example of an active region is shown in the picture below:

Data description

To define a solar flare, we only consider flares with a Geostationary Operational Environmental Satellite (GOES) X-ray flux peak magnitude above the M1.0 level. This allows us to focus only on major flares. For the purposes of this study, we define a positive event to be an active region that flares with a peak magnitude above the M1.0 level, as defined by the GOES database. A negative event is an active region that does not have such an event within a 24-hour time span. To collect active regions for the negative class, we also gather information about all regions where the X-ray flux peak magnitude is above the C1.0 level. So in our training set the same active region can be positive at one time and negative at another. At each moment in time, the target value is 1 if an event with flux above the M1.0 level occurs in the next 24 hours and 0 otherwise.
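
The target definition above can be sketched as follows. This is only an illustration under stated assumptions, not the labeling code used later: the function label_next_24h and its parameters are hypothetical, it expects the observation times of one active region and the start times of the flares at or above M1.0 from that same region, and it treats a flare exactly 24 hours ahead as positive.

```python
import numpy as np
import pandas as pd


def label_next_24h(sample_times, flare_times, horizon_hours=24):
    """Return 1 for each sample time followed by a flare within the horizon, else 0."""
    samples = pd.to_datetime(pd.Series(sample_times)).to_numpy()
    flares = np.sort(pd.to_datetime(pd.Series(flare_times)).to_numpy())
    horizon = np.timedelta64(horizon_hours, "h")
    # position of the first flare at or after each sample time
    pos = np.searchsorted(flares, samples, side="left")
    target = np.zeros(len(samples), dtype=int)
    has_next = pos < len(flares)
    target[has_next] = (flares[pos[has_next]] - samples[has_next]) <= horizon
    return pd.Series(target, index=pd.DatetimeIndex(samples), name="target")


# toy example: four observation times and one M-class flare
samples = ["2010-06-11 00:00", "2010-06-11 12:00",
           "2010-06-12 00:00", "2010-06-12 12:00"]
m_flares = ["2010-06-12 00:30"]
labels = label_next_24h(samples, m_flares)
print(labels)
```

In the toy example the first observation is more than 24 hours before the flare and the last one is after it, so only the two middle observations are labeled 1.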

To do that, we need to describe the complexity of each active region with features. I (like most other researchers) did it with the so-called SHARP features.

The Solar Dynamics Observatory's Helioseismic and Magnetic Imager is the first instrument to continuously map the vector magnetic field of the Sun. SDO takes the most data of any NASA satellite in history, approximately 2 terabytes per day, making it an ideal dataset for such a problem. Using this data, we can characterize active regions on the Sun. For the time frame from May 2010 to December 2018, we focus on 18 parameters calculated from the SHARP vector magnetic field data. They characterize various physical and geometrical properties of the active region.

  1. USFLUX is the total unsigned flux.
  2. MEANGAM is the mean angle of field from radial.
  3. MEANGBT is the mean gradient of total field.
  4. MEANGBZ is the mean gradient of vertical field.
  5. MEANGBH is the mean gradient of the horizontal field.
  6. MEANJZD is the mean vertical current density.
  7. TOTUSJZ is the total unsigned vertical current.
  8. MEANALP is the mean characteristic twist parameter.
  9. MEANJZH is the mean current helicity.
  10. TOTUSJH is the total unsigned current helicity.
  11. ABSNJZH is the absolute value of the net current helicity.
  12. SAVNCPP is the sum of the modulus of the net current per polarity.
  13. MEANPOT is the mean photospheric magnetic free energy.
  14. TOTPOT is the total photospheric magnetic free energy density.
  15. MEANSHR is the mean shear angle.
  16. SHRGT45 is the fraction of area with shear greater than 45 degrees.
  17. R_VALUE is the sum of flux near polarity inversion line.
  18. AREA_ACR is the area of strong field pixels in the active region.

The following section of code initializes the start and end dates of the data set used in this study and also fetches the set of possible positive events and the mapper from NOAA active region numbers to the HARPNUMs used in our database.

Get all information about solar flares from GOES

This part contains the code for gathering solar data. Data from two instruments are collected here: GOES and SDO. There is a special package, sunpy, for handling solar data.

In [3]:
#pip install sunpy
#pip install suds-py3
#pip install drms
from datetime import timedelta
import datetime
import sunpy
from sunpy.time import TimeRange
from sunpy.instr import goes
import numpy as np 
import pandas as pd
In [4]:
DOWNLOAD = False
DATA_PATH = '../../data/solar_flares'
if DOWNLOAD:
    time_range = TimeRange('2010/06/01 00:10', '2018/12/01 00:20')
    #time_range = TimeRange(t_start,t_end)
    goes_events = goes.get_goes_event_list(time_range,'C1')
    goes_events = pd.DataFrame(goes_events)
else:
    goes_events = pd.read_csv(os.path.join(DATA_PATH,'goes_events_2010_2018.csv'), index_col=[0])
goes_events['noaa_active_region'] = goes_events['noaa_active_region'].replace(0,np.nan)
goes_events.dropna(inplace=True)    
goes_events.drop(['goes_location','event_date','end_time','peak_time'], axis=1, inplace=True)
goes_events.start_time = goes_events.start_time.astype('datetime64[ns]')
In [5]:
goes_events.head()
Out[5]:
goes_class noaa_active_region start_time
0 M2.0 11081.0 2010-06-12 00:30:00
1 C1.0 11080.0 2010-06-12 03:57:00
2 C6.1 11081.0 2010-06-12 09:02:00
3 M1.0 11079.0 2010-06-13 05:30:00
4 C1.2 11081.0 2010-06-13 06:08:00

Active region detection

There are different approaches to active region detection. One of them involves manual correction and is carried out each day at NOAA. Active regions in this catalog have NOAA numbers. The SDO team has its own fully automated system of AR detection, and its regions are called HARPs. The team also computes plenty of parameters of magnetic field complexity. So I used HARP regions with their features, but information about GOES flux is available only for NOAA regions. HARP and NOAA regions do not coincide, but there is a mapping between the two catalogs. Below is the code for mapping between HARP and NOAA regions.

In [6]:
#download mapper NOAA
if os.path.isfile(os.path.join(DATA_PATH,'all_harps_with_noaa_ars.txt')):
    num_mapper = pd.read_csv(os.path.join(DATA_PATH,'all_harps_with_noaa_ars.txt'), sep=' ',index_col=[0])
else:
    num_mapper = pd.read_csv('http://jsoc.stanford.edu/doc/data/hmi/harpnum_to_noaa/all_harps_with_noaa_ars.txt',sep=' ')
    num_mapper.to_csv(os.path.join(DATA_PATH,'all_harps_with_noaa_ars.txt'), sep=' ')
In [7]:
num_mapper.tail()
Out[7]:
HARPNUM NOAA_ARS
1314 7304 12721
1315 7305 12722
1316 7310 12723
1317 7312 12724
1318 7313 12725
In [8]:
def convert_noaa_to_harpnum(noaa_ar):
    """
    Converts a NOAA active region number to a HARPNUM.

    Args:
        noaa_ar: NOAA active region number (int or float).

    Returns:
        The first matching HARPNUM as int, or None if there is no match.
    """
    idx = num_mapper[num_mapper['NOAA_ARS'].str.contains(str(int(noaa_ar)))]
    return None if idx.empty else int(idx.HARPNUM.values[0])
goes_events['harp_number'] = goes_events['noaa_active_region'].apply(convert_noaa_to_harpnum)
goes_events.dropna(inplace=True)
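
Note that str.contains performs substring matching. A possible alternative is sketched below: it builds a dictionary keyed by exact NOAA numbers, which also avoids scanning the mapper for every event. It assumes that one NOAA_ARS entry may hold several comma-separated NOAA numbers. The function build_noaa_to_harp and the toy mapper (including the number 11082) are hypothetical and only illustrate that assumed layout.

```python
import pandas as pd


def build_noaa_to_harp(mapper):
    """Map each NOAA number to the first HARPNUM whose NOAA_ARS entry lists it."""
    noaa_to_harp = {}
    for harpnum, noaa_ars in zip(mapper["HARPNUM"], mapper["NOAA_ARS"]):
        # assumption: several NOAA numbers may be joined by commas
        for noaa in str(noaa_ars).split(","):
            noaa_to_harp.setdefault(int(noaa), int(harpnum))
    return noaa_to_harp


# toy mapper in the assumed layout of all_harps_with_noaa_ars.txt
toy_mapper = pd.DataFrame({
    "HARPNUM": [49, 51, 54],
    "NOAA_ARS": ["11079", "11080", "11081,11082"],
})
lookup = build_noaa_to_harp(toy_mapper)
print(lookup.get(11081), lookup.get(11082), lookup.get(99999))
```

With the real mapper, the harp_number column could then be filled with goes_events['noaa_active_region'].astype(int).map(lookup), which leaves NaN where no HARP region matches.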

The event class can be mapped to flux, which is continuous. This can be done with the flareclass_to_flux method from goes.

In [9]:
# GOES flare classes are better converted to flux values with flareclass_to_flux; the factor 1e6 expresses the flux in units of 1e-6 W/m^2
goes_events['flux'] =  goes_events['goes_class'].apply(lambda x: 1e06*goes.flareclass_to_flux(x).value)
goes_events.head()
Out[9]:
goes_class noaa_active_region start_time harp_number flux
0 M2.0 11081.0 2010-06-12 00:30:00 54.0 20.0
1 C1.0 11080.0 2010-06-12 03:57:00 51.0 1.0
2 C6.1 11081.0 2010-06-12 09:02:00 54.0 6.1
3 M1.0 11079.0 2010-06-13 05:30:00 49.0 10.0
4 C1.2 11081.0 2010-06-13 06:08:00 54.0 1.2

A single region can produce many flares of different classes. We have more than 1300 events. Let's look at the count plot for harp_number.

In [15]:
import seaborn as sns
sns.countplot(x='harp_number', data = goes_events)
Out[15]:
<matplotlib.axes._subplots.AxesSubplot at 0x1194a39b0>
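
With several hundred regions on the x axis a count plot is hard to read, so a numeric summary may be a useful complement. The sketch below counts flares per region with value_counts on a toy frame that copies the first five rows shown in Out[9]. The variable names are hypothetical, and on the real data goes_events would be used instead of the toy frame.

```python
import pandas as pd

# toy copy of the first rows of goes_events shown above
events = pd.DataFrame({
    "goes_class": ["M2.0", "C1.0", "C6.1", "M1.0", "C1.2"],
    "harp_number": [54, 51, 54, 49, 54],
})

# number of flares of any class per HARP region, most active regions first
flares_per_region = events["harp_number"].value_counts()

# the same count restricted to major (M and X class) flares
is_major = events["goes_class"].str[0].isin(["M", "X"])
major_per_region = events[is_major]["harp_number"].value_counts()

print(flares_per_region)
print(major_per_region)
```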

Loading data

Data with the main features of active regions can be taken from the SDO database. There is a special package, drms, for accessing the data. We will use drms to download the meta information with all the keywords.

In [16]:
# List of keywords we want to download. The keywords are computed for HARP regions.
# We walk through all HARP regions, download the features and save them to disk (this is very time consuming)
import drms
c = drms.Client()
list_keywords = ['T_REC,CRVAL1,CRLN_OBS,USFLUX,MEANGBT,MEANJZH,MEANPOT,SHRGT45,TOTUSJH,MEANGBH,MEANALP,MEANGAM,MEANGBZ,MEANJZD,TOTUSJZ,SAVNCPP,TOTPOT,MEANSHR,AREA_ACR,R_VALUE,ABSNJZH']
harp_list = pd.unique(goes_events.harp_number)
for harp in harp_list:
    str_query = f'hmi.sharp_cea_720s[{str(int(harp))}]'

    if os.path.isfile(os.path.join(DATA_PATH+'/keys_regions',str_query+'.csv')):
        print(f'Harp number {harp} already exist\n')
    else:
        print(f'load region with Harp number {harp}')
        keys = c.query(str_query, key=list_keywords)
        keys.to_csv(os.path.join(DATA_PATH+'/keys_regions',str_query+'.csv'))
    
    
Harp number 54.0 already exist

Harp number 51.0 already exist

Harp number 49.0 already exist

...
[output truncated: the same message is printed for each of the remaining HARP numbers]

Part 2. Exploratory data analysis

Now there is a CSV file for each HARP region, and we can analyze how the different parameters evolve over time. It is believed that the complexity of the magnetic field changes before a flare and that there are characteristic patterns in the evolution of the features.

In [17]:
def plot_harp_features_flares(harp, goes_events = goes_events, DATA_PATH=DATA_PATH, feature_key = 'R_VALUE'):
    """Plot one SHARP feature of a HARP region over time together with its flares."""
    str_query = f'hmi.sharp_cea_720s[{str(int(harp))}]'
    df = pd.read_csv(os.path.join(DATA_PATH+'/keys_regions',str_query+'.csv'), index_col=[0])
    df['T_REC']  = drms.to_datetime(df.T_REC)
    df.set_index('T_REC', inplace=True)

    # keep only records where the region is within 60 degrees of longitude from the observer
    is_visible =  abs(df['CRVAL1']-df['CRLN_OBS'])<60
    df = df[is_visible]

    # start times and fluxes of all flares attributed to this region
    flux = goes_events[goes_events['harp_number']==harp][['start_time','flux']].set_index('start_time')
    fig, ax1 = plt.subplots(figsize=(15,5))
    # daily ticks for two weeks, starting from the first visible record
    # (df.index[0] replaces Index.get_values(), which was removed in pandas 1.0)
    first_date = df.index[0]
    dates_to_show = pd.date_range(pd.Timestamp(first_date).strftime('%m/%d/%Y'), periods=14, freq='d')
    labels = dates_to_show.strftime('%b %d')
    color = 'tab:green'
    ax1.plot(df.index, df[feature_key], color=color)
    # flares are drawn as bars on a secondary y axis
    ax2 = ax1.twinx()
    ax2.bar(flux.index, flux.flux, width=0.05, facecolor='indianred')
    plt.setp(ax1, xticks=dates_to_show, xticklabels=labels);
In [18]:
harp = 6327
plot_harp_features_flares(harp, feature_key = 'R_VALUE')
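
The plotting function keeps only records where the difference between CRVAL1 and CRLN_OBS is below 60 degrees. The sketch below isolates that filter as a small function. It differs from the notebook code in one respect, which is an assumption on my side: since Carrington longitudes are given in the range 0 to 360 degrees, the difference is wrapped into the range from -180 to 180 before taking the absolute value, so that a region at 5 degrees seen by an observer at 355 degrees counts as 10 degrees away rather than 350. The name keep_near_disk_center and the toy values are hypothetical.

```python
import pandas as pd


def keep_near_disk_center(df, max_offset_deg=60):
    """Keep rows where the patch is within max_offset_deg of the observer's longitude."""
    diff = df["CRVAL1"] - df["CRLN_OBS"]
    # wrap the longitude difference into [-180, 180) before comparing
    offset = ((diff + 180) % 360 - 180).abs()
    return df[offset < max_offset_deg]


toy = pd.DataFrame({
    "CRVAL1": [100.0, 100.0, 100.0, 5.0],
    "CRLN_OBS": [30.0, 70.0, 165.0, 355.0],
    "R_VALUE": [3.1, 3.4, 3.0, 2.8],
})
filtered = keep_near_disk_center(toy)
print(filtered)
```

Only the second and the fourth toy rows survive: their offsets are 30 and 10 degrees, while the other two are 70 and 65 degrees away.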