# Replicate Vanderburg & Johnson 2014 K2SFF Method¶

In this notebook we will replicate the K2SFF method from Vanderburg and Johnson 2014. The paper introduces a method for "Self Flat Fielding", by tracking how the lightcurve changes with motion of the spacecraft.

In [1]:
from pyke import KeplerTargetPixelFile
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
import numpy as np
import matplotlib.pyplot as plt

In [2]:
#from oktopus import UniformPrior, JointPrior
#from pyke import PRFPhotometry, SceneModel
#from pyke.utils import KeplerQualityFlags


## Get data¶

In [3]:
#! wget https://www.cfa.harvard.edu/~avanderb/k2/ep60021426alldiagnostics.csv

In [4]:
import pandas as pd

In [5]:
df = pd.read_csv('ep60021426alldiagnostics.csv',index_col=False)

In [6]:
df.head()

Out[6]:
BJD - 2454833 Raw Flux Corrected Flux X-centroid Y-centroid arclength Correction Thrusters On
0 1862.502368 0.995119 0.995985 25.135097 24.661074 2.327480 0.999130 1.0
1 1862.522801 0.997313 0.996767 25.289752 24.418689 1.175322 1.000548 1.0
2 1862.543235 0.996713 0.996136 25.288052 24.429406 1.214627 1.000580 0.0
3 1862.563668 0.996930 0.996277 25.275216 24.448405 1.306617 1.000656 0.0
4 1862.584102 0.996862 0.996228 25.253864 24.480184 1.460259 1.000636 0.0

Let's use the provided $x-y$ centroids, but we could compute these on our own too.

In [7]:
col = df[' X-centroid'].values
col = col - np.mean(col)
row = df[' Y-centroid'].values
row = row - np.mean(row)

In [8]:
def _get_eigen_vectors(centroid_col, centroid_row):
centroids = np.array([centroid_col, centroid_row])
eig_val, eig_vec = np.linalg.eigh(np.cov(centroids))
return eig_val, eig_vec

In [9]:
def _rotate(eig_vec, centroid_col, centroid_row):
centroids = np.array([centroid_col, centroid_row])
return np.dot(eig_vec, centroids)

In [10]:
eig_val, eig_vec = _get_eigen_vectors(col, row)

In [11]:
v1, v2 = eig_vec


The major axis is the last one.

In [12]:
plt.figure(figsize=(5, 6))
plt.plot(col*4.0, row*4.0, 'ko', ms=4)
plt.plot(col*4.0, row*4.0, 'ro', ms=1)
plt.xticks([-2, -1,0, 1, 2])
plt.yticks([-2, -1,0, 1, 2])
plt.xlabel('X position [arcseconds]')
plt.ylabel('Y position [arcseconds]')
plt.xlim(-2, 2)
plt.ylim(-2, 2)
plt.plot([0, v1[0]], [0, v1[1]], color='blue', lw=3)
plt.plot([0, v2[0]], [0, v2[1]], color='blue', lw=3);


Following the form of Figure 2 of Vanderburg & Johsnon 2014.

In [13]:
rot_colp, rot_rowp = _rotate(eig_vec, col, row)


You can rotate into the new reference frame.

In [14]:
plt.figure(figsize=(5, 6))
plt.plot(rot_rowp*4.0, rot_colp*4.0, 'ko', ms=4)
plt.plot(rot_rowp*4.0, rot_colp*4.0, 'ro', ms=1)
plt.xticks([-2, -1,0, 1, 2])
plt.yticks([-2, -1,0, 1, 2])
plt.xlabel("X' position [arcseconds]")
plt.ylabel("Y' position [arcseconds]")
plt.xlim(-2, 2)
plt.ylim(-2, 2)
plt.plot([0, 1], [0, 0], color='blue')
plt.plot([0, 0], [0, 1], color='blue');


We need to calculate the arclength using: $$s= \int_{x'_0}^{x'_1}\sqrt{1+\left( \frac{dy'_p}{dx'}\right)^2} dx'$$

where $x^\prime_0$ is the transformed $x$ coordinate of the point with the smallest $x^\prime$ position, and $y^\prime_p$ is the best--fit polynomial function.

In [15]:
z = np.polyfit(rot_rowp, rot_colp, 5)
p5 = np.poly1d(z)
p5_deriv = p5.deriv()

In [16]:
x0_prime = np.min(rot_rowp)
xmax_prime = np.max(rot_rowp)

In [17]:
x_dense = np.linspace(x0_prime, xmax_prime, 2000)

In [18]:
plt.plot(rot_rowp, rot_colp, '.')
plt.plot(x_dense, p5(x_dense));

In [19]:
@np.vectorize
def arclength(x):
'''Input x1_prime, get out arclength'''
gi = x_dense <x
s_integrand = np.sqrt(1 + p5_deriv(x_dense[gi]) ** 2)
s = np.trapz(s_integrand, x=x_dense[gi])
return s

In [20]:
plt.plot(df[' arclength'], arclength(rot_rowp)*4.0, '.')
plt.plot([0, 4], [0, 4], 'k--');


It works!

Now we apply a high-pass filter. We follow the original paper by using BSplines with 1.5 day breakpoints.

In [21]:
from scipy.interpolate import BSpline
from scipy import interpolate

In [22]:
tt, ff = df['BJD - 2454833'].values, df[' Raw Flux'].values
tt = tt - tt[0]

In [23]:
knots = np.arange(0, tt[-1], 1.5)

In [24]:
t,c,k = interpolate.splrep(tt, ff, s=0, task=-1, t=knots[1:])

In [25]:
bspl = BSpline(t,c,k)

plt.plot(tt, ff, '.')
plt.plot(tt, bspl(tt))

Out[25]:
[<matplotlib.lines.Line2D at 0xa1fddacf8>]

Spline fit looks good, so normalize the flux by the long-term trend.
Plot the normalized flux versus arclength to see the position-dependent flux.

In [26]:
norm_ff = ff/bspl(tt)


Mask the data by keeping only the good samples.

In [27]:
bi = df[' Thrusters On'].values == 1.0
gi = df[' Thrusters On'].values == 0.0
al, gff = arclength(rot_rowp[gi])*4.0, norm_ff[gi]

In [28]:
sorted_inds = np.argsort(al)


We will follow the paper by interpolating 15 bins of means. This is a piecewise linear fit.

In [29]:
knots = np.array([np.min(al)]+
[np.median(splt) for splt in np.array_split(al[sorted_inds], 15)]+
[np.max(al)])

In [30]:
bin_means = np.array([gff[sorted_inds][0]]+
[np.mean(splt) for splt in np.array_split(gff[sorted_inds], 15)]+
[gff[sorted_inds][-1]])

In [31]:
zz = np.polyfit(al, gff,6)
sff = np.poly1d(zz)
al_dense = np.linspace(0, 4, 1000)
interp_func = interpolate.interp1d(knots, bin_means)

In [32]:
plt.figure(figsize=(5, 6))
plt.plot(arclength(rot_rowp)*4.0, norm_ff, 'ko', ms=4)
plt.plot(arclength(rot_rowp)*4.0, norm_ff, 'o', color='#3498db', ms=3)
plt.plot(arclength(rot_rowp[bi])*4.0, norm_ff[bi], 'o', color='r', ms=3)
#plt.plot(al_dense, sff(al_dense), '-', color='#e67e22')
#plt.plot(knots, bin_means, '-', color='#e67e22')
plt.plot(np.sort(al), interp_func(np.sort(al)), '-', color='#e67e22')

plt.xticks([0, 1,2, 3, 4])
plt.xlabel('Arclength [arcseconds]')
plt.ylabel('Relative Brightness')
plt.title('EPIC 60021426, Kp =10.3')
plt.xlim(0,4)
plt.ylim(0.997, 1.002);


Following Figure 4 of Vanderburg & Johnson 2014.

Apply the Self Flat Field (SFF) correction:

In [33]:
corr_flux = gff / interp_func(al)

In [34]:
plt.figure(figsize=(10,6))

dy = 0.004
plt.plot(df['BJD - 2454833'], df[' Raw Flux']+dy, 'ko', ms=4)
plt.plot(df['BJD - 2454833'], df[' Raw Flux']+dy, 'o', color='#3498db', ms=3)
plt.plot(df['BJD - 2454833'][bi], df[' Raw Flux'][bi]+dy, 'o', color='r', ms=3)

plt.plot(df['BJD - 2454833'][gi], corr_flux*bspl(tt[gi]), 'o', color='k', ms = 4)
plt.plot(df['BJD - 2454833'][gi], corr_flux*bspl(tt[gi]), 'o', color='#e67e22', ms = 3)

#plt.plot(df['BJD - 2454833'][gi], df[' Corrected Flux'][gi], 'o', color='#00ff00', ms = 4)

plt.xlabel('BJD - 2454833')
plt.ylabel('Relative Brightness')

plt.xlim(1862, 1870)
plt.ylim(0.994, 1.008);


Following Figure 5 of Vanderburg & Johnson 2015.

Let's compute the CDPP:

In [35]:
from pyke import LightCurve

In [36]:
#lc = LightCurve(time=df['BJD - 2454833'][gi], flux=corr_flux*bspl(tt[gi]))
lc = LightCurve(time=df['BJD - 2454833'][gi], flux=df[' Corrected Flux'][gi])

In [37]:
lc.cdpp(savgol_window=201)*1.4

Out[37]:
38.872509891106382

The end.