Walkthrough demonstrating how to simulate and fit a damped random walk (DRW) light curve, using the DRW model kernel in celerite:
k(tnm)=aexp(−c tnm)which corresponds to the structure function:
SF2=2σ2(1−e−tnm/τDRW)where τDRW=1c and σ=√a/2.
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from astropy import units as u
plt.rcParams['axes.linewidth'] = 1.5
Import package methods to simulate and fit DRW
from taufit.taufit import simulate_drw
from taufit.taufit import fit_drw, fit_carma
#num_seasons = 6
#season_duration = 120 # days
cadence = 7 # days
#x = [np.arange(365*i, 365*i + season_duration, cadence) for i in range(num_seasons)]
#x = np.array(x).flatten()
x = np.arange(0, 2000, cadence)
Define function to simulate DRW light curves with similar amplitude fluctuations at a given timescale, typical mean magnitude, and error bars, with Gaussian white noise.
def simulate_and_fit_drw(x, log_tau_drw, sigma=0.05, ymean=20, ysigma=0.02, plot=False, verbose=False):
tau = 10**log_tau_drw
# Simulate DRW
#sigma_hat = sigma/np.sqrt(tau/250)
y = simulate_drw(x, tau=tau, sigma=sigma, ymean=ymean, size=1)
y = y[0, :] # The first dimension is each sample
yerr = np.full(np.shape(y), ysigma)
# Add RMS fluctuations
y += np.random.normal(0, ysigma, len(y))
# Fit DRW
# Note the inputs to fit_drw must have astropy units
gp, samples, fig = fit_drw(x*u.day, y*u.mag, yerr*u.mag, plot=plot, verbose=verbose)
log_tau_drw_recovered = np.log10(1/np.exp(np.median(samples[:,1])))
return x, y, yerr, gp, samples
Simulate τDRW=100 days:
x, y, yerr, gp, samples = simulate_and_fit_drw(x, np.log10(100), plot=True, verbose=True)
Initial log-likelihood: 555.3549572672989 Final log-likelihood: 562.58745590685 Running burn-in...
/home/colinjb2/.local/lib/python3.6/site-packages/emcee/moves/red_blue.py:99: RuntimeWarning: invalid value encountered in double_scalars lnpdiff = f + nlp - state.log_prob[j]
Running production...
Shown above: The power spectrum (top left), light curve and posterior model prediction 1σ error ellipses (top right), τDRW posterior (bottom left) with 16th and 84th percentiles marked as dashed lines, and the autocorrelation function of the squared residuals (ACF χ2; bottom right). The ACF χ2 should located within the grey region (consistent with white noise).
Important: The grey shaded region in the τDRW posterior plot (bottom left panel) corresponds to > 20% of the light curve baseline. If the distribution peaks in this region, the light curve is not long enough and the result is unreliable (see Kozlowski 2017).
Use corner to plot the posteriors with celerite's native variables or the equilivlant structure-function-convention variable:
import corner
# Note these are natural logs
fig = corner.corner(samples[:,:2], quantiles=[0.16,0.84], show_titles=True,
labels=[r"$\ln\ a$", r"$\ln\ c$"],
label_kwargs=dict(fontsize=18), title_kwargs=dict(fontsize=16));
plt.show()
# Structure function variables
# sigma = sqrt(a/2), tau_DRW = 1/c, SFinf = sigma_hat*np.sqrt(tau/250)
samples_sf = [np.log(np.sqrt(np.exp(samples[:,0]/2))), np.log(1/np.exp(samples[:,1]))]
samples_sf = np.array(samples_sf).T
fig = corner.corner(samples_sf, quantiles=[0.16,0.84], show_titles=True,
labels=[r"$\ln\ \sigma$", r"$\ln\ \tau_{\rm{DRW}}$"],
label_kwargs=dict(fontsize=18), title_kwargs=dict(fontsize=16));
Celerite has been shown to be ∼10× faster than codes such as carma_pack, while having equivilent model PSDs to CARMA (Foreman-Mackey et al. 2017).
Posterior prediction at future times:
t = np.arange(1750, 2500, 1)
# Use just as you would with celerite
s = np.median(samples, axis=0)
gp.set_parameter_vector(s)
mu, var = gp.predict(y, t, return_var=True)
std = np.sqrt(var)
# Plot
fig, ax = plt.subplots(1,1, figsize=(12,5))
ax.errorbar(x, y, yerr=yerr, c='k', fmt='.', alpha=0.75, elinewidth=1)
ax.fill_between(t, mu+std, mu-std, color="#ff7f0e", alpha=0.3, label='posterior prediction')
ax.set_xlabel('Time (MJD)',fontsize=18)
ax.set_ylabel(r'Magnitude',fontsize=18)
ax.tick_params(labelsize=18)
ax.set_ylim(np.max(y) + .1, np.min(y) - .1)
ax.set_xlim(np.min(t), np.max(t))
ax.legend(fontsize=16, loc=1)
<matplotlib.legend.Legend at 0x7fb1598f9128>
The kernel function for a general CARMA(p,q) model is
k(tnm)=p∑j=1Ajexp(rjtnm).In celerite, this corresponds to a summation of ComplexTerms
:
where each CARMA term corresponds to a celerite term with aj=2 Re(Aj), bj=2 Im(Aj), cj=−Re(rj), dj=−Im(rj) (Foreman-Mackey et al. 2017; Footnote 17). Here, CARMA(p,q) the order is p=J and q=p−1. Note, there is no way to explicitly constrain q to be less than p−1 using the celerite solver, so we must adopt this general form.
Summing to J, we can easily see that if p=1,q=0, we recover the DRW kernel (after noting that the imaginary terms bj=dj=0):
=1∑j=112[(aj+ibj)e−(cj+idj)tnm+(aj−ibj)e−(cj−idj)tnm]Note there is a typo in the published version of the paper where the summations are shown incorrectly to sum to 2J.
# Note we have four parameters (a_j, b_j, c_j, d_j) for each p term
gp, samples, _ = fit_carma(x*u.day, y*u.mag, yerr*u.mag, p=2, init=[-4,-6,-5,-4,-4,-6,-5,-4,-5])
Initial log-likelihood: -420.12008849480895 Running burn-in... Running production...
corner.corner(samples, quantiles=[0.16,0.84], show_titles=True)
We recommend you set the bounds
and init
arguments yourself for better results. Note that the parameters of higher order CARMA-like models quickly become difficult to physically interpret.
Equivilance to CARMA parameters (coming soon?)
Generate random DRW light curves with a range of input τDRW, both seasonally-gapped and not, and see how well we can recover the input for each.
You may togggle whether to plot the output or not by changing the default arugments in simulate_and_fit_drw
above.
# Modify the function above to create gaps in the light curve
def simulate_gapped(x, log_tau_drw, sigma=0.2, ymean=20, ysigma=0.02, plot=False, verbose=False):
tau = 10**log_tau_drw
# Simulate DRW
y = simulate_drw(x, tau=tau, sigma=sigma, ymean=ymean, size=1)
y = y[0, :] # The first dimension is each sample
yerr = np.full(np.shape(y), ysigma)
# Add RMS fluctuations
y += np.random.normal(0, ysigma, len(y))
# Note the inputs to fit_drw must have astropy units
gp, samples = fit_drw(x*u.day, y*u.mag, yerr*u.mag, plot=plot, verbose=verbose, supress_warn=True)
log_tau_drw_recovered = np.log10(1/np.exp(np.median(samples[:,1])))
# Gapped light curve
x_gap = []; y_gap = []; yerr_gap = []
days = 0
for i in range(6):
ind = (days<x) & (x<days+120)
x_gap.append(x[ind])
y_gap.append(y[ind])
yerr_gap.append(yerr[ind])
days += 365
x_gap = np.concatenate(x_gap)
y_gap = np.concatenate(y_gap)
yerr_gap = np.concatenate(yerr_gap)
# Note the inputs to fit_drw must have astropy units
gp, samples = fit_drw(x_gap*u.day, y_gap*u.mag, yerr_gap*u.mag, plot=plot, verbose=verbose, nburn=50, nsamp=200)
log_tau_drw_recovered2 = np.log10(1/np.exp(np.median(samples[:,1])))
#
return log_tau_drw, log_tau_drw_recovered, log_tau_drw_recovered2
from multiprocessing import Pool
# Generate input taus
x = np.arange(0,2000,7)
log_tau_drw_true = np.linspace(0.5, 5, 500)
# Use multiprocessing to pool each fitting to a simulated light curve
pool = Pool(20)
args = zip([x]*len(log_tau_drw_true), log_tau_drw_true)
taus = pool.starmap(simulate_gapped, args)
pool.close()
pool.join()
taus = np.array(taus)
Plot input τDRW vs. output τDRW. Note the confirmation of the bias reported by Kozlowski (2017) when the input τDRW approaches 10% of the light curve baseline.
fig,ax = plt.subplots(1,1,figsize=(6,5), sharey=True)
ax.scatter(taus[:,0], taus[:,1], s=10, c='k', alpha=0.7)
ax.set_xlim(0.5,5)
ax.set_ylim(0.5,5)
ax.plot([0.5,5],[0.5,5], lw=2, color='b', zorder=-1, linestyle='dashed')
ax.tick_params('both',labelsize=16)
ax.set_xlabel(r'True $\log_{10}\ \tau_{\rm{DRW}}$',fontsize=18)
ax.set_ylabel(r'Recovered $\log_{10}\ \tau_{\rm{DRW}}$',fontsize=18)
baseline = np.max(x) - np.min(x)
ax.vlines(np.log10(cadence),0.5,5, color='grey', lw=3, zorder=-1)
ax.hlines(np.log10(cadence),0.5,5, color='grey', lw=3, zorder=-1)
ax.vlines(np.log10(0.1*baseline),0.5,5, color='grey', lw=3, zorder=-1)
ax.hlines(np.log10(0.1*baseline),0.5,5, color='grey', lw=3, zorder=-1)
ax.text(0.6, 4.7, 'non-gapped', fontsize=16)
fig.tight_layout()
fig,ax = plt.subplots(1,1,figsize=(6,5), sharey=True)
ax.scatter(taus[:,0], taus[:,2], s=10, c='k', alpha=0.7)
ax.set_xlim(0.5,5)
ax.set_ylim(0.5,5)
ax.plot([0.5,5],[0.5,5], lw=2, color='b', zorder=-1, linestyle='dashed')
ax.tick_params('both',labelsize=16)
ax.set_xlabel(r'True $\log_{10}\ \tau_{\rm{DRW}}$',fontsize=18)
ax.set_ylabel(r'Recovered $\log_{10}\ \tau_{\rm{DRW}}$',fontsize=18)
baseline = np.max(x) - np.min(x)
ax.vlines(np.log10(cadence),0.5,5, color='grey', lw=3, zorder=-1)
ax.hlines(np.log10(cadence),0.5,5, color='grey', lw=3, zorder=-1)
ax.vlines(np.log10(0.1*baseline),0.5,5, color='grey', lw=3, zorder=-1)
ax.hlines(np.log10(0.1*baseline),0.5,5, color='grey', lw=3, zorder=-1)
ax.text(0.6, 4.7, 'gapped', fontsize=16)
fig.tight_layout()