This notebook describes how to use the astrobase modules checkplot and lcproc to post-process a collection of light curves. We'll go through all the various features one can add to checkplots to make it easier to visualize and classify objects in a light curve collection, especially when the checkplotserver webapp is used.
We'll be using a bunch of text light curves I've collected from the HATPI prototype's observations of the Galactic Center in 2015, and more specifically of the open cluster IC4725. I'll go through the following topics:
# let's look at our light curves
!ls
IC4725-catalog.pkl                        notebook.tex
IC4725-hatpi-observations-catalog.txt     output_53_0.png
checkplots                                output_55_0.png
csvlcs                                    period-finding
lc-collection-work.ipynb                  pklcs
lsp-plot.png                              xmatch-cats
magseries-phased-plot.png
# to make lists of light curve files
import glob
# we'll use the CSV light curves here
lclist = glob.glob('csvlcs/*.csv')
lclist[:10]
['csvlcs/HAT-529-0001333-hplc.csv', 'csvlcs/HAT-529-0002995-hplc.csv', 'csvlcs/HAT-529-0003275-hplc.csv', 'csvlcs/HAT-529-0003381-hplc.csv', 'csvlcs/HAT-529-0004688-hplc.csv', 'csvlcs/HAT-529-0004752-hplc.csv', 'csvlcs/HAT-529-0005160-hplc.csv', 'csvlcs/HAT-529-0006870-hplc.csv', 'csvlcs/HAT-529-0007026-hplc.csv', 'csvlcs/HAT-529-0009125-hplc.csv']
!head csvlcs/HAT-529-0004752-hplc.csv
rjd frk xcc ycc irm ire irq iep itf
57170.64095 1-419607a_8 707.401 1606.373 -0.01519 0.01102 G -0.01483 -0.00417
57170.64139 1-419607b_8 707.407 1606.332 0.00788 0.01118 G 0.00852 0.01224
57170.64194 1-419607c_8 707.304 1606.379 -0.01655 0.01106 G -0.00913 0.00846
57170.64246 1-419607d_8 707.387 1606.375 0.00946 0.01125 G 0.01078 0.02185
57170.64298 1-419607e_8 707.353 1606.401 0.00038 0.01101 G 0.00412 0.01697
57170.64347 1-419607f_8 707.336 1606.446 -0.03331 0.01119 G -0.02867 -0.00639
57170.64511 1-419608a_8 700.444 1606.361 -0.00102 0.01196 G -0.00372 0.00029
57170.64557 1-419608b_8 700.431 1606.381 0.00768 0.01198 G 0.00382 0.01680
57170.64610 1-419608c_8 700.429 1606.422 0.04160 0.01173 G 0.03669 0.04098
# This looks like a straightforward text format light curve.
# the columns are:
# rjd - reduced Julian date (JD - 2400000.0)
# frk - a frame key to uniquely identify each observation
# xcc - x coordinate of the object on the CCD
# ycc - y coordinate of the object on the CCD
# irm - the magnitude of the object (this appears to be normalized to zero here)
# ire - the associated error in the magnitude
# irq - some sort of measurement flag ('G' appears to mean all good)
# iep - the EPD magnitude
# itf - the TFA magnitude
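To check that this column layout parses the way we expect, here's a quick sketch using numpy.genfromtxt on a couple of rows copied from the !head output above (an illustration only; the reader function we write below reads the whole file):

```python
import io
import numpy as np

# the header line plus two data rows copied from the !head output above
sample = """rjd frk xcc ycc irm ire irq iep itf
57170.64095 1-419607a_8 707.401 1606.373 -0.01519 0.01102 G -0.01483 -0.00417
57170.64139 1-419607b_8 707.407 1606.332 0.00788 0.01118 G 0.00852 0.01224
"""

# parse with explicit names and dtypes; skip_header drops the header line
recarr = np.genfromtxt(io.StringIO(sample),
                       names=['rjd','frk','xcc','ycc','irm','ire','irq','iep','itf'],
                       dtype='f8,U20,f8,f8,f8,f8,U2,f8,f8',
                       skip_header=1)

print(recarr['rjd'], recarr['iep'], recarr['irq'])
```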
# Let's write a function to read the light curve into a Python dict,
# which should be readable by all lcproc and checkplot functions
# we'll use numpy.genfromtxt to read these light curves
import numpy as np
# Let's first look at how to get custom LC formats into lcproc
from astrobase import lcproc
# we use the lcproc.register_lcformat function for this
help(lcproc.register_lcformat)
Help on function register_lcformat in module astrobase.lcproc:

register_lcformat(formatkey, fileglob, timecols, magcols, errcols, readerfunc_module, readerfunc, readerfunc_kwargs=None, normfunc_module=None, normfunc=None, normfunc_kwargs=None, magsarefluxes=False, overwrite_existing=False, lcformat_dir='~/.astrobase/lcformat-jsons')
    This adds a new LC format to the astrobase LC format registry.

    Allows handling of custom format light curves for astrobase lcproc
    drivers. Once the format is successfully registered, light curves should
    work transparently with all of the functions in this module, by simply
    calling them with the `formatkey` in the `lcformat` keyword argument.

    LC format specifications are generated as JSON files. astrobase comes
    with several of these in `<astrobase install path>/data/lcformats`. LC
    formats you add by using this function will have their specifiers
    written to the `~/.astrobase/lcformat-jsons` directory in your home
    directory.

    Parameters
    ----------
    formatkey : str
        A str used as the unique ID of this LC format for all lcproc
        functions and can be used to look it up later and import the correct
        functions needed to support it for lcproc operations. For example,
        we use 'kep-fits' as the specifier for Kepler FITS light curves,
        which can be read by the `astrobase.astrokep.read_kepler_fitslc`
        function as specified by the
        `<astrobase install path>/data/lcformats/kep-fits.json` LC format
        specification JSON produced by `register_lcformat`.

    fileglob : str
        The default UNIX fileglob to use to search for light curve files in
        this LC format. This is a string like '*-whatever-???-*.*??-.lc'.

    timecols,magcols,errcols : list of str
        These are all lists of strings indicating which keys in the lcdict
        produced by your `lcreader_func` will be extracted and used by
        lcproc functions for processing. The lists must all have the same
        dimensions, e.g. if timecols = ['timecol1','timecol2'], then magcols
        must be something like ['magcol1','magcol2'] and errcols must be
        something like ['errcol1','errcol2']. This allows you to process
        multiple apertures or multiple types of measurements in one go.

        Each element in these lists can be a simple key, e.g. 'time' (which
        would correspond to lcdict['time']), or a composite key, e.g.
        'aperture1.times.rjd' (which would correspond to
        lcdict['aperture1']['times']['rjd']). See the examples in the
        lcformat specification JSON files in
        `<astrobase install path>/data/lcformats`.

    readerfunc_module : str
        This is either:

        - a Python module import path, e.g. 'astrobase.lcproc.catalogs' or
        - a path to a Python file, e.g. '/astrobase/hatsurveys/hatlc.py'

        that contains the Python module that contains functions used to
        open (and optionally normalize) a custom LC format that's not
        natively supported by astrobase.

    readerfunc : str
        This is the function name in `readerfunc_module` to use to read
        light curves in the custom format. This MUST always return a
        dictionary (the 'lcdict') with the following signature (the keys
        listed below are required, but others are allowed)::

            {'objectid': this object's identifier as a string,
             'objectinfo':{'ra': this object's right ascension in decimal deg,
                           'decl': this object's declination in decimal deg,
                           'ndet': the number of observations in this LC,
                           'objectid': the object ID again for legacy reasons},
             ...other time columns, mag columns go in as their own keys}

    readerfunc_kwargs : dict or None
        This is a dictionary containing any kwargs to pass through to the
        light curve reader function.

    normfunc_module : str or None
        This is either:

        - a Python module import path, e.g. 'astrobase.lcproc.catalogs' or
        - a path to a Python file, e.g. '/astrobase/hatsurveys/hatlc.py' or
        - None, in which case we'll use default normalization

        that contains the Python module that contains functions used to
        normalize a custom LC format that's not natively supported by
        astrobase.

    normfunc : str or None
        This is the function name in `normfunc_module` to use to normalize
        light curves in the custom format. If None, the default
        normalization method used by lcproc is to find gaps in the
        time-series, normalize measurements grouped by these gaps to zero,
        then normalize the entire magnitude time series to the global time
        series median using the `astrobase.lcmath.normalize_magseries`
        function.

        If this is provided, the normalization function should take and
        return an lcdict of the same form as that produced by `readerfunc`
        above. For an example of a specific normalization function, see
        `normalize_lcdict_by_inst` in the `astrobase.hatsurveys.hatlc`
        module.

    normfunc_kwargs : dict or None
        This is a dictionary containing any kwargs to pass through to the
        light curve normalization function.

    magsarefluxes : bool
        If this is True, then all lcproc functions will treat the
        measurement columns in the lcdict produced by your `readerfunc` as
        flux instead of mags, so things like default normalization and
        sigma-clipping will be done correctly. If this is False, magnitudes
        will be treated as magnitudes.

    overwrite_existing : bool
        If this is True, this function will overwrite any existing LC
        format specification JSON with the same name as that provided in
        the `formatkey` arg. This can be used to update LC format
        specifications while keeping the `formatkey` the same.

    lcformat_dir : str
        This specifies the directory where the LC format specification
        JSON produced by this function will be written. By default, this
        goes to the `.astrobase/lcformat-jsons` directory in your home
        directory.

    Returns
    -------
    str
        Returns the file path to the generated LC format specification JSON
        file.
# So it looks like we need something like the following to provide
# to the register_lcformat function
# we'll call these 'hplc' format light curves
formatkey = 'hplc'
# the file glob used to recognize these light curves in a directory listing
fileglob = 'HAT-*-hplc.csv'
# the module to get the readerfunc from - we'll generate this here
# readerfunc_module = 'lcreadermodule.py'
# this is the readerfunc we'll make below, let's skip this one for now
# readerfunc = read_csv_lightcurve
# the list of time columns in the light curve dict we want to process
# this is just 'rjd' for this LC format
timecols = ['rjd']
# the list of magnitude/flux columns in the light curve we want to process
# let's use the EPD magnitudes only
magcols = ['iep']
# the list of err columns associated with each mag column
# this is 'ire' for this LC format
errcols = ['ire']
# we're using magnitudes so set this to False
magsarefluxes = False
# we're not going to normalize our light curve in some special way
# so set this to None
normfunc_module, normfunc, normfunc_kwargs = None, None, None
# Now, let's make the function to read in a light curve.
# This will return a dict as expected by lcproc functions.
# The dict will be barebones at first, but we'll use the catalog
# file to add in extra information for each object, such as RA, Dec,
# and magnitudes in various bandpasses. This information can then be
# transparently used by checkplot functions.
import os.path
# First, the LC reader function
def read_csv_lightcurve(lcfile):
    '''This reads the light curve lcfile.
    '''

    # read the LC into a numpy recarray
    recarr = np.genfromtxt(lcfile,
                           usecols=(0,1,2,3,4,5,6,7,8),
                           names=['rjd','frk','xcc','ycc','irm','ire','irq','iep','itf'],
                           dtype='f8,U20,f8,f8,f8,f8,S20,f8,f8',
                           skip_header=1)

    # generate the objectid
    # here, we're basically stripping the filename to form the objectid
    objectid = os.path.splitext(os.path.basename(lcfile))[0].rstrip('-hplc')

    # generate a barebones objectinfo
    objectinfo = {'objectid': objectid}

    # this is the lcdict we need
    lcdict = {'objectid': objectid,
              'objectinfo': objectinfo,
              'rjd': recarr['rjd'],
              'iep': recarr['iep'],
              'ire': recarr['ire']}

    return lcdict
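One caveat about the objectid line above: str.rstrip strips a *set of characters* rather than an exact suffix, so it only behaves here because these object IDs end in digits. A small illustration (the second ID is a made-up example ending in one of the suffix characters):

```python
def strip_suffix(name, suffix='-hplc'):
    '''Removes an exact trailing suffix, unlike str.rstrip,
    which strips any run of the given characters from the end.'''
    return name[:-len(suffix)] if name.endswith(suffix) else name

# rstrip works for IDs that end in a digit...
print('HAT-529-0004752-hplc'.rstrip('-hplc'))  # -> 'HAT-529-0004752'

# ...but over-strips a (hypothetical) ID ending in 'c'
print('HAT-529-000475c-hplc'.rstrip('-hplc'))  # -> 'HAT-529-000475'
print(strip_suffix('HAT-529-000475c-hplc'))    # -> 'HAT-529-000475c'
```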
# Let's test this function out
lcdict = read_csv_lightcurve('csvlcs/HAT-529-0004752-hplc.csv')
# Looks good
lcdict
{'iep': array([-0.01483, 0.00852, -0.00913, ..., 0.01659, -0.05152, -0.02914]), 'ire': array([ 0.01102, 0.01118, 0.01106, ..., 0.01081, 0.01076, 0.01018]), 'objectid': 'HAT-529-0004752', 'objectinfo': {'objectid': 'HAT-529-0004752'}, 'rjd': array([ 57170.64095, 57170.64139, 57170.64194, ..., 57274.63173, 57274.63222, 57274.63355])}
# Let's write the LC reader function to a module so we can import it
%%writefile ./lcreadermodule.py
#!/usr/bin/env python

import os.path
# numpy is needed for np.genfromtxt below; the module must import it
# itself because lcproc will load this file standalone
import numpy as np

# First, the LC reader function
def read_csv_lightcurve(lcfile):
    '''This reads the light curve lcfile.
    '''

    # read the LC into a numpy recarray
    recarr = np.genfromtxt(lcfile,
                           usecols=(0,1,2,3,4,5,6,7,8),
                           names=['rjd','frk','xcc','ycc','irm','ire','irq','iep','itf'],
                           dtype='f8,U20,f8,f8,f8,f8,S20,f8,f8',
                           skip_header=1)

    # generate the objectid
    # here, we're basically stripping the filename to form the objectid
    objectid = os.path.splitext(os.path.basename(lcfile))[0].rstrip('-hplc')

    # generate a barebones objectinfo
    objectinfo = {'objectid': objectid}

    # this is the lcdict we need
    lcdict = {'objectid': objectid,
              'objectinfo': objectinfo,
              'rjd': recarr['rjd'],
              'iep': recarr['iep'],
              'ire': recarr['ire']}

    return lcdict
# Let's register this LC format now
lcproc.register_lcformat(formatkey, fileglob,
                         timecols, magcols, errcols,
                         './lcreadermodule.py', 'read_csv_lightcurve',
                         magsarefluxes=magsarefluxes)
[2018-03-20T00:31:19Z - INFO] added hplc to registry
# Let's make sure it registered fine
lcproc.get_lcformat('hplc')
['HAT-*-hplc.csv', <function __main__.read_csv_lightcurve>, ['rjd'], ['iep'], ['ire'], False, None]
# Many checkplot and lcproc functions require at least an
# objectid, RA, Dec, and some sort of magnitude.
# We have a catalog file in this directory, which we can
# use to add this info to the light curve
!ls *.txt
IC4725-hatpi-observations-catalog.txt
!head IC4725-hatpi-observations-catalog.txt
ndet| hatid| network| stations| objectid| ra| decl| jmag| jmag_err| hmag| hmag_err| kmag| kmag_err| bmag| vmag| rmag| imag| sdssu| sdssg| sdssr| sdssi| sdssz
15507| HAT-529-0001333| HP| ['HP1']| HAT-529-0001333| 278.06984| -18.74239| 7.26500| 0.03000| 6.40500| 0.02400| 6.10700| 0.02300| 12.41700| 10.66700| 9.67500| 8.69300| 14.15000| 11.31000| 9.83400| 9.27200| 8.81600
15487| HAT-529-0002995| HP| ['HP1']| HAT-529-0002995| 277.41904| -18.67429| 8.30000| 0.02100| 7.67200| 0.03400| 7.48700| 0.02300| 11.98200| 10.69900| 9.99900| 9.30800| 13.71700| 11.32600| 10.25000| 9.88100| 9.58000
15510| HAT-529-0003275| HP| ['HP1']| HAT-529-0003275| 278.31415| -18.68220| 9.13300| 0.02900| 9.06400| 0.02600| 8.99400| 0.02300| 9.94400| 9.69400| 9.53300| 9.36500| 11.58800| 10.04600| 9.81300| 9.85100| 9.88700
15510| HAT-529-0003381| HP| ['HP1']| HAT-529-0003381| 277.75128| -18.50968| 7.58500| 0.01800| 6.56800| 0.03100| 6.13300| 0.02400| 13.99300| 11.91200| 10.64100| 9.38600| 15.40200| 12.37300| 10.61400| 9.88600| 9.33700
15513| HAT-529-0004688| HP| ['HP1']| HAT-529-0004688| 278.18946| -18.66950| 7.96700| 0.02700| 7.00100| 0.03100| 6.62800| 0.02400| 13.89300| 11.93300| 10.76800| 9.62100| 15.36600| 12.46200| 10.81400| 10.14400| 9.63300
15511| HAT-529-0004752| HP| ['HP1']| HAT-529-0004752| 278.08537| -18.45812| 8.22800| 0.02400| 7.42300| 0.03400| 7.10100| 0.02300| 13.25100| 11.59700| 10.60700| 9.63200| 14.73700| 12.08300| 10.68900| 10.14400| 9.73200
15503| HAT-529-0005160| HP| ['HP1']| HAT-529-0005160| 277.90962| -18.62271| 8.40000| 0.03500| 7.60900| 0.04600| 7.35000| 0.02400| 13.09400| 11.49400| 10.58700| 9.69500| 14.69700| 12.09400| 10.75800| 10.24900| 9.85100
15512| HAT-529-0006870| HP| ['HP1']| HAT-529-0006870| 278.19234| -18.50681| 8.91900| 0.02100| 8.27700| 0.03600| 8.11300| 0.02100| 12.57300| 11.28100| 10.58800| 9.91100| 14.25600| 11.92800| 10.85800| 10.48400| 10.18500
15510| HAT-529-0007026| HP| ['HP1']| HAT-529-0007026| 278.09684| -18.70412| 9.71800| 0.02300| 9.62300| 0.02100| 9.58500| 0.02100| 10.50400| 10.23000| 10.08000| 9.93000| 12.13400| 10.63700| 10.39900| 10.43000| 10.46200
# Let's look at the catalog entry for the object
# we just read in the LC for
!grep HAT-529-0004752 IC4725-hatpi-observations-catalog.txt
15511| HAT-529-0004752| HP| ['HP1']| HAT-529-0004752| 278.08537| -18.45812| 8.22800| 0.02400| 7.42300| 0.03400| 7.10100| 0.02300| 13.25100| 11.59700| 10.60700| 9.63200| 14.73700| 12.08300| 10.68900| 10.14400| 9.73200
# Now, we can write a function to read in the catalog
# and convert it into a fast lookup table to add each
# light curve object's information into the light curve
# dict we generate using read_csv_lightcurve
def read_object_catalog(catalogfile):
    '''
    This reads the catalog into a recarray.
    '''

    catarr = np.genfromtxt(catalogfile,
                           names=True,
                           delimiter='|',
                           dtype='i8,U20,U20,U20,U20,f8,f8,f8,f8,f8,f8,f8,f8,f8,f8,f8,f8,f8,f8,f8,f8,f8',
                           autostrip=True)

    return catarr
# See if it works fine
objcat = read_object_catalog('IC4725-hatpi-observations-catalog.txt')
# see if our object's in there
'HAT-529-0004752' in objcat['objectid']
True
# Look up our object
objcat_ind = objcat['objectid'] == 'HAT-529-0004752'
objcat[objcat_ind]
array([ (15511, 'HAT-529-0004752', 'HP', "['HP1']", 'HAT-529-0004752', 278.08537, -18.45812, 8.228, 0.024, 7.423, 0.034, 7.101, 0.023, 13.251, 11.597, 10.607, 9.632, 14.737, 12.083, 10.689, 10.144, 9.732)], dtype=[('ndet', '<i8'), ('hatid', '<U20'), ('network', '<U20'), ('stations', '<U20'), ('objectid', '<U20'), ('ra', '<f8'), ('decl', '<f8'), ('jmag', '<f8'), ('jmag_err', '<f8'), ('hmag', '<f8'), ('hmag_err', '<f8'), ('kmag', '<f8'), ('kmag_err', '<f8'), ('bmag', '<f8'), ('vmag', '<f8'), ('rmag', '<f8'), ('imag', '<f8'), ('sdssu', '<f8'), ('sdssg', '<f8'), ('sdssr', '<f8'), ('sdssi', '<f8'), ('sdssz', '<f8')])
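Boolean-mask lookups like the one above scan the whole catalog on every query. If we're going to look up many objects, a dict built once from the objectid column gives O(1) lookups instead. A minimal sketch on a stand-in two-row catalog (values copied from the rows above):

```python
import numpy as np

# a tiny stand-in for the catalog recarray
catarr = np.array([(15511, 'HAT-529-0004752', 278.08537, -18.45812),
                   (15507, 'HAT-529-0001333', 278.06984, -18.74239)],
                  dtype=[('ndet','i8'), ('objectid','U20'),
                         ('ra','f8'), ('decl','f8')])

# build the objectid -> row index map once; each lookup is then O(1)
catindex = {objid: i for i, objid in enumerate(catarr['objectid'])}

row = catarr[catindex['HAT-529-0004752']]
print(row['ra'], row['decl'])
```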
%%writefile ./lcreadermodule.py
#!/usr/bin/env python

import os.path
import numpy as np

# the catalog reader from before; the module must be self-contained
# because lcproc imports it by path
def read_object_catalog(catalogfile):
    '''
    This reads the catalog into a recarray.
    '''

    catarr = np.genfromtxt(catalogfile,
                           names=True,
                           delimiter='|',
                           dtype='i8,U20,U20,U20,U20,f8,f8,f8,f8,f8,f8,f8,f8,f8,f8,f8,f8,f8,f8,f8,f8,f8',
                           autostrip=True)

    return catarr

# Now, we can amend read_csv_lightcurve to require
# this loaded catalog and automatically add extra
# object info to the lcdict it produces
def read_csv_lightcurve(lcfile,
                        catfile='IC4725-hatpi-observations-catalog.txt',
                        infokeys=['objectid','ra','decl','jmag','hmag','kmag',
                                  'bmag','vmag','rmag','imag',
                                  'sdssu','sdssg','sdssr','sdssi','sdssz']):
    '''This reads the light curve lcfile.
    '''

    # read the LC into a numpy recarray
    recarr = np.genfromtxt(lcfile,
                           usecols=(0,1,2,3,4,5,6,7,8),
                           names=['rjd','frk','xcc','ycc','irm','ire','irq','iep','itf'],
                           dtype='f8,U20,f8,f8,f8,f8,S20,f8,f8',
                           skip_header=1)

    # generate the objectid
    # here, we're basically stripping the filename to form the objectid
    objectid = os.path.splitext(os.path.basename(lcfile))[0].rstrip('-hplc')

    # read in the catalog
    catalog_recarray = read_object_catalog(catfile)

    # look up this object in the catalog
    objind = catalog_recarray['objectid'] == objectid

    # if we find this object, add its info
    # (objind is a boolean mask, so check .any(); its .size is always
    # the catalog length, whether or not there was a match)
    if objind.any():
        objectinfo = {x: catalog_recarray[x][objind].item() for x in infokeys}
    # otherwise, generate a barebones objectinfo
    else:
        objectinfo = {'objectid': objectid}

    # this is the lcdict we need
    lcdict = {'objectid': objectid,
              'objectinfo': objectinfo,
              'rjd': recarr['rjd'],
              'iep': recarr['iep'],
              'ire': recarr['ire']}

    return lcdict
# test this new reader function
lcdict = read_csv_lightcurve('csvlcs/HAT-529-0004752-hplc.csv')
lcdict
{'iep': array([-0.01483, 0.00852, -0.00913, ..., 0.01659, -0.05152, -0.02914]), 'ire': array([ 0.01102, 0.01118, 0.01106, ..., 0.01081, 0.01076, 0.01018]), 'objectid': 'HAT-529-0004752', 'objectinfo': {'bmag': 13.251, 'decl': -18.45812, 'hmag': 7.423, 'imag': 9.632, 'jmag': 8.228, 'kmag': 7.101, 'objectid': 'HAT-529-0004752', 'ra': 278.08537, 'rmag': 10.607, 'sdssg': 12.083, 'sdssi': 10.144, 'sdssr': 10.689, 'sdssu': 14.737, 'sdssz': 9.732, 'vmag': 11.597}, 'rjd': array([ 57170.64095, 57170.64139, 57170.64194, ..., 57274.63173, 57274.63222, 57274.63355])}
# now we have all of the info we need
lcdict['objectinfo']
{'bmag': 13.251, 'decl': -18.45812, 'hmag': 7.423, 'imag': 9.632, 'jmag': 8.228, 'kmag': 7.101, 'objectid': 'HAT-529-0004752', 'ra': 278.08537, 'rmag': 10.607, 'sdssg': 12.083, 'sdssi': 10.144, 'sdssr': 10.689, 'sdssu': 14.737, 'sdssz': 9.732, 'vmag': 11.597}
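Having the catalog magnitudes in objectinfo means broadband colors can be computed on the fly, which is the kind of information checkplots can use to characterize objects. A quick manual check, using values copied from the output above:

```python
# a subset of the objectinfo dict from the output above
objectinfo = {'jmag': 8.228, 'kmag': 7.101,
              'sdssg': 12.083, 'sdssi': 10.144}

# broadband colors are just magnitude differences
jk_color = objectinfo['jmag'] - objectinfo['kmag']
gi_color = objectinfo['sdssg'] - objectinfo['sdssi']
print(jk_color, gi_color)
```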
# let's re-register our readerfunc to make sure it's up to date
# here, we use readerfunc_kwargs to add in the kwargs the readerfunc needs to work properly
# we also set overwrite_existing = True to overwrite the previous LC format definition
lcproc.register_lcformat(formatkey, fileglob,
                         timecols, magcols, errcols,
                         './lcreadermodule.py', 'read_csv_lightcurve',
                         readerfunc_kwargs={'catfile':'IC4725-hatpi-observations-catalog.txt',
                                            'infokeys':['objectid','ra','decl','jmag','hmag','kmag',
                                                        'bmag','vmag','rmag','imag',
                                                        'sdssu','sdssg','sdssr','sdssi','sdssz']},
                         magsarefluxes=magsarefluxes,
                         overwrite_existing=True)
[2018-03-20T00:31:35Z - INFO] added hplc to registry
Next, we'll go through some of the LC collection processing functions available in lcproc, in preparation for running period-finding and checkplot operations on large numbers of light curves. We'll be able to use the read_csv_lightcurve function we wrote above transparently to read in and extract information for the light curves.
# To help with neighbor lookups for our objects,
# we can generate a catalog pickle for our light curve
# collection
# we'll use the light curves in our collection and
# the lcproc.catalogs.make_lclist function to generate a catalog
# pickle that has a kdtree for fast cone searching
# make_lclist can optionally also generate a finder chart
# using a FITS image associated with the object catalog
# if the FITS image has WCS information associated with it
from astrobase.lcproc.catalogs import make_lclist
help(make_lclist)
Help on function make_lclist in module astrobase.lcproc.catalogs:

make_lclist(basedir, outfile, use_list_of_filenames=None, lcformat='hat-sql', lcformatdir=None, fileglob=None, recursive=True, columns=['objectid', 'objectinfo.ra', 'objectinfo.decl', 'objectinfo.ndet'], makecoordindex=('objectinfo.ra', 'objectinfo.decl'), field_fitsfile=None, field_wcsfrom=None, field_scale=<astropy.visualization.interval.ZScaleInterval object at 0x1099622b0>, field_stretch=<astropy.visualization.stretch.LinearStretch object at 0x109a14470>, field_colormap=<matplotlib.colors.LinearSegmentedColormap object at 0x1119af4a8>, field_findersize=None, field_pltopts={'marker': 'o', 'markersize': 10.0, 'markerfacecolor': 'none', 'markeredgewidth': 2.0, 'markeredgecolor': 'red'}, field_grid=False, field_gridcolor='k', field_zoomcontain=True, maxlcs=None, nworkers=8)
    This generates a light curve catalog for all light curves in a directory.

    Given a base directory where all the files are, and a light curve
    format, this will find all light curves, pull out the keys in each
    lcdict requested in the `columns` kwarg for each object, and write them
    to the requested output pickle file. These keys should be pointers to
    scalar values (i.e. something like `objectinfo.ra` is OK, but something
    like 'times' won't work because it's a vector).

    Generally, this works with light curve reading functions that produce
    lcdicts as detailed in the docstring for `lcproc.register_lcformat`.
    Once you've registered your light curve reader functions using the
    `lcproc.register_lcformat` function, pass in the `formatkey` associated
    with your light curve format, and this function will be able to read
    all light curves in that format as well as the object information
    stored in their `objectinfo` dict.

    Parameters
    ----------
    basedir : str or list of str
        If this is a str, points to a single directory to search for light
        curves. If this is a list of str, it must be a list of directories
        to search for light curves. All of these will be searched to find
        light curve files matching either your light curve format's default
        fileglob (when you registered your LC format), or a specific
        fileglob that you can pass in using the `fileglob` kwarg here. If
        the `recursive` kwarg is set, the provided directories will be
        searched recursively.

        If `use_list_of_filenames` is not None, it will override this
        argument and the function will take those light curves as the list
        of files it must process instead of whatever is specified in
        `basedir`.

    outfile : str
        This is the name of the output file to write. This will be a
        pickle file, so a good convention to use for this name is something
        like 'my-lightcurve-catalog.pkl'.

    use_list_of_filenames : list of str or None
        Use this kwarg to override whatever is provided in `basedir` and
        directly pass in a list of light curve files to process. This can
        speed up this function by a lot because no searches on disk will be
        performed to find light curve files matching `basedir` and
        `fileglob`.

    lcformat : str
        This is the `formatkey` associated with your light curve format,
        which you previously passed in to the `lcproc.register_lcformat`
        function. This will be used to look up how to find and read the
        light curves specified in `basedir` or `use_list_of_filenames`.

    lcformatdir : str or None
        If this is provided, gives the path to a directory where you've
        stored your lcformat description JSONs, other than the usual
        directories lcproc knows to search for them in. Use this along with
        `lcformat` to specify an LC format JSON file that's not currently
        registered with lcproc.

    fileglob : str or None
        If provided, is a string that is a valid UNIX filename glob. Used
        to override the default fileglob for this LC format when searching
        for light curve files in `basedir`.

    recursive : bool
        If True, the directories specified in `basedir` will be searched
        recursively for all light curve files that match the default
        fileglob for this LC format or a specific one provided in
        `fileglob`.

    columns : list of str
        This is a list of keys in the lcdict produced by your light curve
        reader function that contain object information, which will be
        extracted and put into the output light curve catalog. It's highly
        recommended that your LC reader function produce a lcdict that
        contains at least the default keys shown here.

        The lcdict keys to extract are specified by using an address
        scheme:

        - First level dict keys can be specified directly:
          e.g., 'objectid' will extract lcdict['objectid']
        - Keys at other levels can be specified by using a period to
          indicate the level:

          - e.g., 'objectinfo.ra' will extract lcdict['objectinfo']['ra']
          - e.g., 'objectinfo.varinfo.features.stetsonj' will extract
            lcdict['objectinfo']['varinfo']['features']['stetsonj']

    makecoordindex : list of two str or None
        This is used to specify which lcdict keys contain the right
        ascension and declination coordinates for this object. If these are
        provided, the output light curve catalog will have a kdtree built
        on all object coordinates, which enables fast spatial searches and
        cross-matching to external catalogs by `checkplot` and `lcproc`
        functions.

    field_fitsfile : str or None
        If this is not None, it should be the path to a FITS image
        containing the objects these light curves are for. If this is
        provided, `make_lclist` will use the WCS information in the FITS
        itself if `field_wcsfrom` is None (or from a WCS header file
        pointed to by `field_wcsfrom`) to obtain x and y pixel coordinates
        for all of the objects in the field. A finder chart will also be
        made using `astrobase.plotbase.fits_finder_chart` using the
        corresponding `field_scale`, `_stretch`, `_colormap`,
        `_findersize`, `_pltopts`, `_grid`, and `_gridcolors` kwargs for
        that function, reproduced here to enable customization of the
        finder chart plot.

    field_wcsfrom : str or None
        If `wcsfrom` is None, the WCS to transform the RA/Dec to pixel x/y
        will be taken from the FITS header of `fitsfile`. If this is not
        None, it must be a FITS or similar file that contains a WCS header
        in its first extension.

    field_scale : astropy.visualization.Interval object
        `scale` sets the normalization for the FITS pixel values. This is
        an astropy.visualization Interval object. See
        http://docs.astropy.org/en/stable/visualization/normalization.html
        for details on `scale` and `stretch` objects.

    field_stretch : astropy.visualization.Stretch object
        `stretch` sets the stretch function for mapping FITS pixel values
        to output pixel values. This is an astropy.visualization Stretch
        object. See
        http://docs.astropy.org/en/stable/visualization/normalization.html
        for details on `scale` and `stretch` objects.

    field_colormap : matplotlib Colormap object
        `colormap` is a matplotlib color map object to use for the output
        image.

    field_findersize : None or tuple of two ints
        If `findersize` is None, the output image size will be set by the
        NAXIS1 and NAXIS2 keywords in the input `fitsfile` FITS header.
        Otherwise, `findersize` must be a tuple with the intended x and y
        size of the image in inches (all output images will use a
        DPI = 100).

    field_pltopts : dict
        `field_pltopts` controls how the overlay points will be plotted.
        This is a dict with standard matplotlib marker, etc. kwargs as
        key-val pairs, e.g. 'markersize', 'markerfacecolor', etc. The
        default options make red outline circles at the location of each
        object in the overlay.

    field_grid : bool
        `grid` sets if a grid will be made on the output image.

    field_gridcolor : str
        `gridcolor` sets the color of the grid lines. This is a usual
        matplotlib color spec string.

    field_zoomcontain : bool
        `field_zoomcontain` controls if the finder chart will be zoomed to
        just contain the overlayed points. Everything outside the footprint
        of these points will be discarded.

    maxlcs : int or None
        This sets how many light curves to process in the input LC list
        generated by searching for LCs in `basedir` or in the list provided
        as `use_list_of_filenames`.

    nworkers : int
        This sets the number of parallel workers to launch to collect
        information from the light curves.

    Returns
    -------
    str
        Returns the path to the generated light curve catalog pickle file.
# Since the HATPI image will be hopelessly crowded,
# let's use a Digitized Sky Survey finder chart for this field
# we'll use astrobase.services.skyview to get a FITS from the
# NASA SkyView service
from astrobase.services import skyview
help(skyview.get_stamp)
Help on function get_stamp in module astrobase.services.skyview:

get_stamp(ra, decl, survey='DSS2 Red', scaling='Linear', sizepix=300, forcefetch=False, cachedir='~/.astrobase/stamp-cache', timeout=10.0, verbose=True)
    This gets a FITS cutout from the NASA GSFC SkyView service.

    ra, decl are decimal equatorial coordinates for the cutout center.

    survey is the name of the survey to get the stamp for. This is
    'DSS2 Red' by default.

    scaling is the type of pixel value scaling to apply to the cutout.
    This is 'Linear' by default.

    sizepix is the size of the cutout in pixels.

    cachedir points to the astrobase stamp-cache directory.

    timeout is the amount of time in seconds to wait for a response from
    the service.
# let's get a DSS image for the center of the cluster
# sizepix is set to 2048, and DSS is about 1 arcsec/pix, so this is about 0.6 deg wide
skyview.get_stamp(277.9,-19.25, sizepix=2048, timeout=30.0)
{'fitsfile': '/Users/waqasbhatti/.astrobase/stamp-cache/04b609c8019b3ccb42f670d8ee0606e3eb77fd6d78f5efee0e39713de6cc138b.fits.gz', 'params': {'decl': -19.25, 'ra': 277.9, 'scaling': 'Linear', 'sizepix': 2048, 'survey': 'DSS2 Red'}, 'provenance': 'new download'}
# now generate our full catalog and a nice annotated finder chart along with it
lclist = make_lclist(
'csvlcs', # directory to look in
'IC4725-catalog.pkl', # output catalog pickle to make
lcformat='hplc', # specifying our custom LC format key
columns=['objectid','objectinfo.ra','objectinfo.decl',
'objectinfo.bmag','objectinfo.vmag','objectinfo.rmag',
'objectinfo.sdssu','objectinfo.sdssg','objectinfo.sdssr',
'objectinfo.sdssi','objectinfo.sdssz'], # specifying catalog columns to extract
nworkers=4, # number of parallel workers to launch
field_fitsfile='/Users/waqasbhatti/.astrobase/stamp-cache/04b609c8019b3ccb42f670d8ee0606e3eb77fd6d78f5efee0e39713de6cc138b.fits.gz',
field_findersize=(10,10), # to make the PNG a bit more tractable
)
[2018-03-20T00:53:38Z - INFO] searching for hplc light curves in csvlcs ...
[2018-03-20T00:53:38Z - INFO] found 605 light curves
[2018-03-20T00:53:38Z - INFO] collecting light curve info...
[2018-03-20T00:54:02Z - INFO] kdtree generated for (ra, decl): ['objectinfo.ra', 'objectinfo.decl']
[2018-03-20T00:54:03Z - INFO] generated a finder PNG with an object position overlay for this LC list: IC4725-catalog.png
[2018-03-20T00:54:03Z - INFO] done. LC info -> IC4725-catalog.pkl
# Let's look at this PNG
from IPython.display import Image
Image('IC4725-catalog.png')
The objects in the catalog pickle are highlighted on the finder chart. We can also look at the catalog pickle directly by reading it in.
import pickle
# we can read the catalog in like any other pickle
with open(lclist,'rb') as infd:
    objcat = pickle.load(infd)
# this is a dict
objcat.keys()
dict_keys(['basedir', 'lcformat', 'fileglob', 'recursive', 'columns', 'makecoordindex', 'nfiles', 'objects', 'kdtree'])
# all the objects are in the 'objects' key
objcat['objects'].keys()
dict_keys(['lcfname', 'ndet_iep', 'objectid', 'ra', 'decl', 'bmag', 'vmag', 'rmag', 'sdssu', 'sdssg', 'sdssr', 'sdssi', 'sdssz'])
# this is our object from before
ind = objcat['objects']['objectid'] == 'HAT-529-0004752'
# here's how to pull out info for an object from the catalog
objcat['objects']['lcfname'][ind], objcat['objects']['ra'][ind], objcat['objects']['decl'][ind],
(array(['csvlcs/HAT-529-0004752-hplc.csv'], dtype='<U31'), array([ 278.08537]), array([-18.45812]))
# The object catalog pickle will be used to collect neighbor information for
# objects when we make checkplots for them. This can then be used to check
# if any observed variability also appears in the neighbors, i.e. if the
# object is actually a blend.
# We can also filter this catalog pickle according to various criteria,
# using the lcproc.filter_lclist function.
# This function can also take a FITS image, with arguments identical to
# those of make_lclist. This can be used to quickly visualize the effects
# of filtering the object catalog based on the criteria passed to filter_lclist.
from astrobase.lcproc.catalogs import filter_lclist
help(filter_lclist)
Help on function filter_lclist in module astrobase.lcproc.catalogs: filter_lclist(lc_catalog, objectidcol='objectid', racol='ra', declcol='decl', xmatchexternal=None, xmatchdistarcsec=3.0, externalcolnums=(0, 1, 2), externalcolnames=['objectid', 'ra', 'decl'], externalcoldtypes='U20,f8,f8', externalcolsep=None, externalcommentchar='#', conesearch=None, conesearchworkers=1, columnfilters=None, field_fitsfile=None, field_wcsfrom=None, field_scale=<astropy.visualization.interval.ZScaleInterval object at 0x1121505f8>, field_stretch=<astropy.visualization.stretch.LinearStretch object at 0x1172da160>, field_colormap=<matplotlib.colors.LinearSegmentedColormap object at 0x1119af4a8>, field_findersize=None, field_pltopts={'marker': 'o', 'markersize': 10.0, 'markerfacecolor': 'none', 'markeredgewidth': 2.0, 'markeredgecolor': 'red'}, field_grid=False, field_gridcolor='k', field_zoomcontain=True, copylcsto=None) This is used to perform cone-search, cross-match, and column-filter operations on a light curve catalog generated by `make_lclist`. Uses the output of `make_lclist` above. This function returns a list of light curves matching various criteria specified by the `xmatchexternal`, `conesearch`, and `columnfilters kwargs`. Use this function to generate input lists for other lcproc functions, e.g. `lcproc.lcvfeatures.parallel_varfeatures`, `lcproc.periodfinding.parallel_pf`, and `lcproc.lcbin.parallel_timebin`, among others. The operations are applied in this order if more than one is specified: `xmatchexternal` -> `conesearch` -> `columnfilters`. All results from these operations are joined using a logical AND operation. Parameters ---------- objectidcol : str This is the name of the object ID column in the light curve catalog. racol : str This is the name of the RA column in the light curve catalog. declcol : str This is the name of the Dec column in the light curve catalog. 
xmatchexternal : str or None If provided, this is the filename of a text file containing objectids, ras and decs to match the objects in the light curve catalog to by their positions. xmatchdistarcsec : float This is the distance in arcseconds to use when cross-matching to the external catalog in `xmatchexternal`. externalcolnums : sequence of int This a list of the zero-indexed column numbers of columns to extract from the external catalog file. externalcolnames : sequence of str This is a list of names of columns that will be extracted from the external catalog file. This is the same length as `externalcolnums`. These must contain the names provided as the `objectid`, `ra`, and `decl` column names so this function knows which column numbers correspond to those columns and can use them to set up the cross-match. externalcoldtypes : str This is a CSV string containing numpy dtype definitions for all columns listed to extract from the external catalog file. The number of dtype definitions should be equal to the number of columns to extract. externalcolsep : str or None The column separator to use when extracting columns from the external catalog file. If None, any whitespace between columns is used as the separator. externalcommentchar : str The character indicating that a line in the external catalog file is to be ignored. conesearch : list of float This is used to specify cone-search parameters. It should be a three element list: [center_ra_deg, center_decl_deg, search_radius_deg] conesearchworkers : int The number of parallel workers to launch for the cone-search operation. columnfilters : list of str This is a list of strings indicating any filters to apply on each column in the light curve catalog. All column filters are applied in the specified sequence and are combined with a logical AND operator. 
The format of each filter string should be: '<lc_catalog column>|<operator>|<operand>' where: - <lc_catalog column> is a column in the lc_catalog pickle file - <operator> is one of: 'lt', 'gt', 'le', 'ge', 'eq', 'ne', which correspond to the usual operators: <, >, <=, >=, ==, != respectively. - <operand> is a float, int, or string. field_fitsfile : str or None If this is not None, it should be the path to a FITS image containing the objects these light curves are for. If this is provided, `make_lclist` will use the WCS information in the FITS itself if `field_wcsfrom` is None (or from a WCS header file pointed to by `field_wcsfrom`) to obtain x and y pixel coordinates for all of the objects in the field. A finder chart will also be made using `astrobase.plotbase.fits_finder_chart` using the corresponding `field_scale`, `_stretch`, `_colormap`, `_findersize`, `_pltopts`, `_grid`, and `_gridcolors` kwargs for that function, reproduced here to enable customization of the finder chart plot. field_wcsfrom : str or None If `wcsfrom` is None, the WCS to transform the RA/Dec to pixel x/y will be taken from the FITS header of `fitsfile`. If this is not None, it must be a FITS or similar file that contains a WCS header in its first extension. field_scale : astropy.visualization.Interval object `scale` sets the normalization for the FITS pixel values. This is an astropy.visualization Interval object. See http://docs.astropy.org/en/stable/visualization/normalization.html for details on `scale` and `stretch` objects. field_stretch : astropy.visualization.Stretch object `stretch` sets the stretch function for mapping FITS pixel values to output pixel values. This is an astropy.visualization Stretch object. See http://docs.astropy.org/en/stable/visualization/normalization.html for details on `scale` and `stretch` objects. field_colormap : matplotlib Colormap object `colormap` is a matplotlib color map object to use for the output image. 
field_findersize : None or tuple of two ints If `findersize` is None, the output image size will be set by the NAXIS1 and NAXIS2 keywords in the input `fitsfile` FITS header. Otherwise, `findersize` must be a tuple with the intended x and y size of the image in inches (all output images will use a DPI = 100). field_pltopts : dict `field_pltopts` controls how the overlay points will be plotted. This a dict with standard matplotlib marker, etc. kwargs as key-val pairs, e.g. 'markersize', 'markerfacecolor', etc. The default options make red outline circles at the location of each object in the overlay. field_grid : bool `grid` sets if a grid will be made on the output image. field_gridcolor : str `gridcolor` sets the color of the grid lines. This is a usual matplotib color spec string. field_zoomcontain : bool `field_zoomcontain` controls if the finder chart will be zoomed to just contain the overlayed points. Everything outside the footprint of these points will be discarded. copylcsto : str If this is provided, it is interpreted as a directory target to copy all the light curves that match the specified conditions. Returns ------- tuple Returns a two elem tuple: (matching_object_lcfiles, matching_objectids) if conesearch and/or column filters are used. If `xmatchexternal` is also used, a three-elem tuple is returned: (matching_object_lcfiles, matching_objectids, extcat_matched_objectids).
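To make the columnfilters format concrete, here's a small standalone sketch that parses and applies '<lc_catalog column>|<operator>|<operand>' strings to a toy catalog dict. The mini-catalog and helper below are hypothetical illustrations, not filter_lclist's actual internals:

```python
import numpy as np
import operator

# map the filter-string operator names to Python comparison functions
OPERATORS = {'lt': operator.lt, 'gt': operator.gt, 'le': operator.le,
             'ge': operator.ge, 'eq': operator.eq, 'ne': operator.ne}

def apply_column_filters(catalog, columnfilters):
    """AND together filters written as '<column>|<operator>|<operand>'.
    Operands are treated as floats here; string-valued columns would
    need type-aware conversion."""
    keep = np.ones(catalog['objectid'].size, dtype=bool)
    for filterstr in columnfilters:
        col, op, operand = filterstr.split('|')
        keep &= OPERATORS[op](catalog[col], float(operand))
    return keep

# a hypothetical mini-catalog in the same spirit as objcat['objects']
cat = {'objectid': np.array(['a', 'b', 'c']),
       'sdssr': np.array([11.2, 12.7, 10.9])}
mask = apply_column_filters(cat, ['sdssr|lt|12.0'])
```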
# For example, if we want only the light curves for objects that are:
# - within 10 arcminutes of the cluster center at (277.9, -19.25), and
# - brighter than SDSS r < 12.0
# We'll also pass in our existing FITS image to see if the filtering works as expected
filtered_lcs, filtered_objectids = filter_lclist(
lclist, # the object catalog pickle we created with make_lclist
conesearch=[277.9, -19.25, 10.0/60.0], # this is a cone-search filter specification
columnfilters=['sdssr|lt|12.0'], # this is a list of filters on LC objectinfo conditions
field_fitsfile='/Users/waqasbhatti/.astrobase/stamp-cache/04b609c8019b3ccb42f670d8ee0606e3eb77fd6d78f5efee0e39713de6cc138b.fits.gz',
field_findersize=(10,10), # to make the PNG a bit more tractable
)
[2018-03-20T01:00:11Z - INFO] cone search: objects within 0.1667 deg of (277.900, -19.250): 31 [2018-03-20T01:00:11Z - INFO] filter: sdssr|lt|12.0 -> objects matching: 239 [2018-03-20T01:00:12Z - INFO] generated a finder PNG with an object position overlay for this filtered LC list: IC4725-catalog-conesearch_RA277.900_DEC-19.250_RAD0.16667_filter_sdssr_lt_12.0.png [2018-03-20T01:00:12Z - INFO] done. objects matching all filters: 20
# Let's look at the finder to see if this did what was expected
Image('IC4725-catalog-conesearch_RA277.900_DEC-19.250_RAD0.16667_filter_sdssr_lt_12.0.png')
# Looks about as expected.
# We can now operate on this subset of the LC collection directly by turning it into a list
# these lists can be passed directly to various lcproc functions like parallel_pf, etc.
filtered_lcs.tolist()
['csvlcs/HAT-577-0000044-hplc.csv', 'csvlcs/HAT-577-0000221-hplc.csv', 'csvlcs/HAT-577-0000705-hplc.csv', 'csvlcs/HAT-577-0001779-hplc.csv', 'csvlcs/HAT-577-0001797-hplc.csv', 'csvlcs/HAT-577-0001928-hplc.csv', 'csvlcs/HAT-577-0002340-hplc.csv', 'csvlcs/HAT-577-0002857-hplc.csv', 'csvlcs/HAT-577-0003467-hplc.csv', 'csvlcs/HAT-577-0004295-hplc.csv', 'csvlcs/HAT-577-0005172-hplc.csv', 'csvlcs/HAT-577-0005901-hplc.csv', 'csvlcs/HAT-577-0007232-hplc.csv', 'csvlcs/HAT-577-0007519-hplc.csv', 'csvlcs/HAT-577-0007645-hplc.csv', 'csvlcs/HAT-577-0008322-hplc.csv', 'csvlcs/HAT-577-0018039-hplc.csv', 'csvlcs/HAT-577-0019056-hplc.csv', 'csvlcs/HAT-577-0035505-hplc.csv', 'csvlcs/HAT-577-1792911-hplc.csv']
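The cone-search part of this filtering is conceptually just an angular-separation cut. Here's a pure-numpy sketch of that operation on hypothetical coordinates; filter_lclist itself uses the catalog's kdtree for speed:

```python
import numpy as np

def angsep_deg(ra1, decl1, ra2, decl2):
    """Great-circle separation in degrees between (ra1, decl1) and
    arrays (ra2, decl2), all in degrees, via the Vincenty formula."""
    r1, d1, r2, d2 = map(np.radians, (ra1, decl1, ra2, decl2))
    dra = r2 - r1
    num = np.hypot(np.cos(d2) * np.sin(dra),
                   np.cos(d1) * np.sin(d2) -
                   np.sin(d1) * np.cos(d2) * np.cos(dra))
    den = np.sin(d1) * np.sin(d2) + np.cos(d1) * np.cos(d2) * np.cos(dra)
    return np.degrees(np.arctan2(num, den))

# hypothetical catalog positions (degrees)
ra = np.array([278.085, 277.900, 279.500])
decl = np.array([-18.458, -19.250, -17.000])

# cone search: objects within 1 deg of the first object
in_cone = angsep_deg(ra[0], decl[0], ra, decl) < 1.0
```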
# The next thing we're going to do is to collect various external catalogs to cross-match
# our objects against. This will add more information to the checkplots and further help
# us classify variability.
from astrobase.checkplot.pkl_xmatch import load_xmatch_external_catalogs
# To do this, we need some external catalogs in a specific format. This is described in the
# docstring of the checkplot.pkl_xmatch.load_xmatch_external_catalogs function.
help(load_xmatch_external_catalogs)
Help on function load_xmatch_external_catalogs in module astrobase.checkplot.pkl_xmatch: load_xmatch_external_catalogs(xmatchto, xmatchkeys, outfile=None) This loads the external xmatch catalogs into a dict for use in an xmatch. Parameters ---------- xmatchto : list of str This is a list of paths to all the catalog text files that will be loaded. The text files must be 'CSVs' that use the '|' character as the separator betwen columns. These files should all begin with a header in JSON format on lines starting with the '#' character. this header will define the catalog and contains the name of the catalog and the column definitions. Column definitions must have the column name and the numpy dtype of the columns (in the same format as that expected for the numpy.genfromtxt function). Any line that does not begin with '#' is assumed to be part of the columns in the catalog. An example is shown below:: # {"name":"NSVS catalog of variable stars", # "columns":[ # {"key":"objectid", "dtype":"U20", "name":"Object ID", "unit": null}, # {"key":"ra", "dtype":"f8", "name":"RA", "unit":"deg"}, # {"key":"decl","dtype":"f8", "name": "Declination", "unit":"deg"}, # {"key":"sdssr","dtype":"f8","name":"SDSS r", "unit":"mag"}, # {"key":"vartype","dtype":"U20","name":"Variable type", "unit":null} # ], # "colra":"ra", # "coldec":"decl", # "description":"Contains variable stars from the NSVS catalog"} objectid1 | 45.0 | -20.0 | 12.0 | detached EB objectid2 | 145.0 | 23.0 | 10.0 | RRab objectid3 | 12.0 | 11.0 | 14.0 | Cepheid . . . xmatchkeys : list of lists This is the list of lists of column names (as str) to get out of each `xmatchto` catalog. This should be the same length as `xmatchto` and each element here will apply to the respective file in `xmatchto`. outfile : str or None If this is not None, set this to the name of the pickle to write the collected xmatch catalogs to. 
this pickle can then be loaded transparently by the :py:func:`astrobase.checkplot.pkl.checkplot_dict`, :py:func:`astrobase.checkplot.pkl.checkplot_pickle` functions to provide xmatch info to the :py:func:`astrobase.checkplot.pkl_xmatch.xmatch_external_catalogs` function below. If this is None, will return the loaded xmatch catalogs directly. This will be a huge dict, so make sure you have enough RAM. Returns ------- str or dict Based on the `outfile` kwarg, will either return the path to a collected xmatch pickle file or the collected xmatch dict.
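The '#'-prefixed JSON header format described above is easy to parse by hand. Here's a sketch under the assumption that the header lines form valid JSON once the comment characters are stripped; the toy catalog below is made up for illustration:

```python
import io
import json

def read_xmatch_header(lines):
    """Collect the leading '#' lines of an xmatch catalog and parse the
    JSON column-definition header they contain."""
    headerlines = []
    for line in lines:
        if not line.startswith('#'):
            break
        headerlines.append(line.lstrip('#').strip())
    return json.loads(' '.join(headerlines))

# a tiny hypothetical catalog in the documented format
catalog_text = '''\
# {"name":"toy catalog",
#  "columns":[{"key":"objectid","dtype":"U20","name":"Object ID","unit":null},
#             {"key":"ra","dtype":"f8","name":"RA","unit":"deg"}],
#  "colra":"ra"}
objectid1 | 45.0
'''
header = read_xmatch_header(io.StringIO(catalog_text))
```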
# I've already created some useful catalogs to cross-match against.
# These are available at https://wbhatti.org/abfiles/catalogs/
# Let's use the AAVSO VSX catalog and GAIA-TGAS catalog as our
# external xmatch catalogs
!ls xmatch-cats
IC4725-xmatch-catalogs.pkl gaia-tgas-sources.csv.gz fixed-aavso-vsx-20180115.csv.gz
# Let's load these into a pickle that can then be used by lcproc to xmatch
# LC collection members against external catalogs
# these are the columns to get from the GAIA-TGAS catalog
tgascols = ['sourceid','ra','decl','pmra','pmdecl','parallax_mas','parallax_err_mas',
'distance_pc_naive','ref_epoch']
# these are the columns to get from the VSX catalog
aavsocols = ['vsxid','objectname','varflag','vartype','period','ra','decl']
# load them into a pickle
xmd = load_xmatch_external_catalogs(
['xmatch-cats/gaia-tgas-sources.csv.gz',
'xmatch-cats/fixed-aavso-vsx-20180115.csv.gz'],
[tgascols, aavsocols],
outfile='xmatch-cats/IC4725-xmatch-catalogs.pkl'
)
That concludes our preparatory work. Next, we'll see how to run period-finding on our light curves.
We'll use the lcproc.periodsearch.runpf
and lcproc.periodsearch.parallel_pf
functions to run period-finding operations on our light curves. Since the LC format has been registered and is active, we just provide an lcformat='hplc'
keyword argument to these functions to make them work.
# For a single light curve, we use the lcproc.periodsearch.runpf function
from astrobase.lcproc.periodsearch import runpf, parallel_pf_lcdir, PFMETHODS
help(runpf)
Help on function runpf in module astrobase.lcproc.periodsearch: runpf(lcfile, outdir, timecols=None, magcols=None, errcols=None, lcformat='hat-sql', lcformatdir=None, pfmethods=('gls', 'pdm', 'mav', 'win'), pfkwargs=({}, {}, {}, {}), sigclip=10.0, getblssnr=False, nworkers=8, minobservations=500, excludeprocessed=False, raiseonfail=False) This runs the period-finding for a single LC. Parameters ---------- lcfile : str The light curve file to run period-finding on. outdir : str The output directory where the result pickle will go. timecols : list of str or None The timecol keys to use from the lcdict in calculating the features. magcols : list of str or None The magcol keys to use from the lcdict in calculating the features. errcols : list of str or None The errcol keys to use from the lcdict in calculating the features. lcformat : str This is the `formatkey` associated with your light curve format, which you previously passed in to the `lcproc.register_lcformat` function. This will be used to look up how to find and read the light curves specified in `basedir` or `use_list_of_filenames`. lcformatdir : str or None If this is provided, gives the path to a directory when you've stored your lcformat description JSONs, other than the usual directories lcproc knows to search for them in. Use this along with `lcformat` to specify an LC format JSON file that's not currently registered with lcproc. pfmethods : list of str This is a list of period finding methods to run. Each element is a string matching the keys of the `PFMETHODS` dict above. By default, this runs GLS, PDM, AoVMH, and the spectral window Lomb-Scargle periodogram. pfkwargs : list of dicts This is used to provide any special kwargs as dicts to each period-finding method function specified in `pfmethods`. 
sigclip : float or int or sequence of two floats/ints or None If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series. If a list of two ints/floats is provided, the function will perform an 'asymmetric' sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of "dimming" and "brightening" is set by *physics* (not the magnitude system), which is why the `magsarefluxes` kwarg must be correctly set. If `sigclip` is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output. getblssnr : bool If this is True and BLS is one of the methods specified in `pfmethods`, will also calculate the stats for each best period in the BLS results: transit depth, duration, ingress duration, refit period and epoch, and the SNR of the transit. nworkers : int The number of parallel period-finding workers to launch. minobservations : int The minimum number of finite LC points required to process a light curve. excludeprocessed : bool If this is True, light curves that have existing period-finding result pickles in `outdir` will not be processed. FIXME: currently, this uses a dumb method of excluding already-processed files. 
A smarter way to do this is to (i) generate a SHA512 cachekey based on a repr of `{'lcfile', 'timecols', 'magcols', 'errcols', 'lcformat', 'pfmethods', 'sigclip', 'getblssnr', 'pfkwargs'}`, (ii) make sure all list kwargs in the dict are sorted, (iii) check if the output file has the same cachekey in its filename (last 8 chars of cachekey should work), so the result was processed in exactly the same way as specifed in the input to this function, and can therefore be ignored. Will implement this later. raiseonfail : bool If something fails and this is True, will raise an Exception instead of returning None at the end. Returns ------- str The path to the output period-finding result pickle.
# these are the period-finder methods currently implemented in astrobase
PFMETHODS
{'bls': <function astrobase.periodbase.kbls.bls_parallel_pfind(times, mags, errs, magsarefluxes=False, startp=0.1, endp=100.0, stepsize=0.0001, mintransitduration=0.01, maxtransitduration=0.4, nphasebins=200, autofreq=True, nbestpeaks=5, periodepsilon=0.1, sigclip=10.0, verbose=True, nworkers=None)>, 'gls': <function astrobase.periodbase.zgls.pgen_lsp(times, mags, errs, magsarefluxes=False, startp=None, endp=None, stepsize=0.0001, autofreq=True, nbestpeaks=5, periodepsilon=0.1, sigclip=10.0, nworkers=None, workchunksize=None, glspfunc=<function _glsp_worker at 0x117541730>, verbose=True)>, 'aov': <function astrobase.periodbase.saov.aov_periodfind(times, mags, errs, magsarefluxes=False, startp=None, endp=None, stepsize=0.0001, autofreq=True, normalize=True, phasebinsize=0.05, mindetperbin=9, nbestpeaks=5, periodepsilon=0.1, sigclip=10.0, nworkers=None, verbose=True)>, 'mav': <function astrobase.periodbase.smav.aovhm_periodfind(times, mags, errs, magsarefluxes=False, startp=None, endp=None, stepsize=0.0001, autofreq=True, normalize=True, nharmonics=6, nbestpeaks=5, periodepsilon=0.1, sigclip=10.0, nworkers=None, verbose=True)>, 'pdm': <function astrobase.periodbase.spdm.stellingwerf_pdm(times, mags, errs, magsarefluxes=False, startp=None, endp=None, stepsize=0.0001, autofreq=True, normalize=False, phasebinsize=0.05, mindetperbin=9, nbestpeaks=5, periodepsilon=0.1, sigclip=10.0, nworkers=None, verbose=True)>, 'acf': <function astrobase.periodbase.macf.macf_period_find(times, mags, errs, fillgaps=0.0, filterwindow=11, forcetimebin=None, maxlags=None, maxacfpeaks=10, smoothacf=21, smoothfunc=<function _smooth_acf_savgol at 0x117540400>, smoothfunckwargs=None, magsarefluxes=False, sigclip=3.0, verbose=True, periodepsilon=0.1, nworkers=None, startp=None, endp=None, autofreq=None, stepsize=None)>, 'win': <function astrobase.periodbase.zgls.specwindow_lsp(times, mags, errs, magsarefluxes=False, startp=None, endp=None, stepsize=0.0001, autofreq=True, nbestpeaks=5, 
periodepsilon=0.1, sigclip=10.0, nworkers=None, glspfunc=<function _glsp_worker_specwindow at 0x1175417b8>, verbose=True)>}
# let's use GLS, PDM, and BLS to cover all of our bases
pf = runpf('csvlcs/HAT-529-0004752-hplc.csv', # the input LC
'period-finding', # the output directory
lcformat='hplc', # specifying our custom LC format
pfmethods=['gls','pdm','bls'], # a list of period-finders to run
pfkwargs=[{}, {}, {}], # optional keyword arguments, one dict per period-finder
sigclip=10.0, # the sigclip to apply to the LC
getblssnr=True, # optionally get the BLS model SNR
nworkers=8) # use 8 workers on this laptop
# see what's inside this pickle
with open('period-finding/periodfinding-HAT-529-0004752.pkl','rb') as infd:
pfd = pickle.load(infd)
# this contains the results for all of the period-finder methods
# and light curve columns
pfd
{'iep': {'0-gls': {'bestlspval': 0.21473579487435882, 'bestperiod': 5.1481485148513757, 'kwargs': {'autofreq': True, 'endp': None, 'nbestpeaks': 5, 'periodepsilon': 0.1, 'sigclip': 10.0, 'startp': None, 'stepsize': 0.0001}, 'lspvals': array([ 0.00032977, 0.00038313, 0.00075949, ..., 0.00190141, 0.00162813, 0.00109708]), 'method': 'gls', 'nbestlspvals': [0.21473579487435882, 0.17700570001798493, 0.14126582033100382, 0.081988451875924406, 0.078509675529540371], 'nbestpeaks': 5, 'nbestperiods': [5.1481485148513757, 0.83595337620576982, 1.2380071428571162, 0.45491076115484591, 4.9996442307691229], 'omegas': array([ 6.04195424e-02, 7.25034509e-02, 8.45873594e-02, ..., 6.28000724e+01, 6.28121563e+01, 6.28242402e+01]), 'periods': array([ 1.03992600e+02, 8.66605000e+01, 7.42804286e+01, ..., 1.00050606e-01, 1.00031358e-01, 1.00012118e-01])}, '1-pdm': {'bestlspval': 0.7389789287602877, 'bestperiod': 5.1481485148513748, 'kwargs': {'autofreq': True, 'endp': None, 'mindetperbin': 9, 'nbestpeaks': 5, 'normalize': False, 'periodepsilon': 0.1, 'phasebinsize': 0.05, 'sigclip': 10.0, 'startp': None, 'stepsize': 0.0001}, 'lspvals': array([ 0.94595389, 0.95845235, 0.94386395, ..., 0.9969419 , 0.99690833, 0.99827216]), 'method': 'pdm', 'nbestlspvals': [0.7389789287602877, 0.77427993694310937, 0.80199063830228878, 0.80212147925318633, 0.80285745258187746], 'nbestpeaks': 5, 'nbestperiods': [5.1481485148513748, 10.399259999999776, 1.6719067524115396, 0.83595337620576982, 30.586058823528756], 'periods': array([ 1.03992600e+02, 8.66605000e+01, 7.42804286e+01, ..., 1.00050606e-01, 1.00031358e-01, 1.00012118e-01])}, '2-bls': {'altsnr': [0.92586414154286711, 0.8672005067623374, 0.92586414154286711, 0.92613667967535718, 0.74818557371945138], 'bestlspval': 0.012316806792635112, 'bestperiod': 5.1500795397243442, 'blsresult': [{'bestperiod': 0.9922018263804309, 'bestpower': inf, 'power': array([ 0.0045659 , 0.00442832, 0.00442723, ..., 0.00257948, 0.00254567, 0.00252592]), 'transdepth': inf, 
'transduration': 1.0, 'transegressbin': 162, 'transingressbin': 1}, {'bestperiod': 0.49806691742734854, 'bestpower': inf, 'power': array([ 0.0025507 , 0.00254377, 0.00253122, ..., 0.0019834 , 0.00200273, 0.00200783]), 'transdepth': inf, 'transduration': 1.0, 'transegressbin': 162, 'transingressbin': 1}, {'bestperiod': 0.3122549956292233, 'bestpower': 0.0032502341146095435, 'power': array([ 0.00198665, 0.00202378, 0.00201561, ..., 0.00091857, 0.00088462, 0.00087088]), 'transdepth': 0.0065663065986843305, 'transduration': 0.4293727032428599, 'transegressbin': 170, 'transingressbin': 86}, {'bestperiod': 0.2075856283880216, 'bestpower': 0.003438092304432587, 'power': array([ 0.0009365 , 0.00090954, 0.0009132 , ..., 0.00134948, 0.00132371, 0.00132733]), 'transdepth': 0.007074515008879915, 'transduration': 0.38243826961511185, 'transegressbin': 23, 'transingressbin': 148}, {'bestperiod': 0.19198784151957732, 'bestpower': 0.0026283018187057937, 'power': array([ 0.00135153, 0.00133701, 0.00133775, ..., 0.0008563 , 0.00086144, 0.00086383]), 'transdepth': 0.005727824828014367, 'transduration': 0.3013990071562117, 'transegressbin': 125, 'transingressbin': 67}, {'bestperiod': 0.15339557311886448, 'bestpower': 0.0020585612782315385, 'power': array([ 0.00089416, 0.0009161 , 0.00091397, ..., 0.00101389, 0.0010044 , 0.00099606]), 'transdepth': 0.004168380324585065, 'transduration': 0.42182966926697185, 'transegressbin': 87, 'transingressbin': 4}, {'bestperiod': 0.12613229590288053, 'bestpower': 0.001963524837409177, 'power': array([ 0.00100108, 0.00100291, 0.0010272 , ..., 0.00103095, 0.0010123 , 0.00100209]), 'transdepth': 0.00426827165177351, 'transduration': 0.695893237057572, 'transegressbin': 154, 'transingressbin': 14}, {'bestperiod': 0.10879025844803525, 'bestpower': 0.0020538240883662883, 'power': array([ 0.00102326, 0.00101145, 0.00099875, ..., 0.0012171 , 0.00123766, 0.00122574]), 'transdepth': 0.004108065175070268, 'transduration': 0.5071239765327832, 'transegressbin': 
39, 'transingressbin': 139}], 'frequencies': array([ 0.01 , 0.01002404, 0.01004808, ..., 9.99994159, 9.99996563, 9.99998967]), 'kwargs': {'autofreq': True, 'endp': 100.0, 'maxtransitduration': 0.8, 'mintransitduration': 0.01, 'nbestpeaks': 5, 'nphasebins': 200, 'periodepsilon': 0.1, 'sigclip': 10.0, 'startp': 0.1, 'stepsize': 2.4040172089168399e-05}, 'lspvals': array([ 0.0045659 , 0.00442832, 0.00442723, ..., 0.0012171 , 0.00123766, 0.00122574]), 'maxtransitduration': 0.8, 'method': 'bls', 'mintransitduration': 0.01, 'nbestlspvals': [0.012316806792635112, 0.010683724712070771, 0.010221237995478983, 0.0099840712365885445, 0.0093200912201696576], 'nbestpeaks': 5, 'nbestperiods': [5.1500795397243442, 0.8356012280547892, 5.2356444273221143, 5.0660321464560578, 1.2371613663258947], 'nfreq': 415555, 'nphasebins': 200, 'periods': array([ 100. , 99.76017482, 99.52149722, ..., 0.10000058, 0.10000034, 0.1000001 ]), 'snr': [0.92586414154286711, 0.67907987308711004, 0.92586414154286711, 0.92613667967535729, 0.60893425071561291], 'stepsize': 2.4040172089168399e-05, 'transitdepth': [0.024638644760793468, 0.021892077413929612, 0.024638644760793468, 0.024645897412590746, 0.018860282555760662], 'transitduration': [0.48243182257752565, 0.6086648185158919, 0.48243182257752565, 0.4824962929533879, 0.42634259557733223]}, 'pfmethods': ['0-gls', '1-pdm', '2-bls']}, 'kwargs': {'errcols': ['ire'], 'getblssnr': True, 'lcformat': 'hplc', 'magcols': ['iep'], 'pfkwargs': [{'magsarefluxes': False, 'nworkers': 8, 'sigclip': 10.0, 'verbose': False}, {'magsarefluxes': False, 'nworkers': 8, 'sigclip': 10.0, 'verbose': False}, {'magsarefluxes': False, 'nworkers': 8, 'sigclip': 10.0, 'verbose': False}, {}], 'pfmethods': ['gls', 'pdm', 'bls'], 'sigclip': 10.0, 'timecols': ['rjd']}, 'lcfbasename': 'HAT-529-0004752-hplc.csv', 'objectid': 'HAT-529-0004752'}
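A common follow-up is to pull out just the best period and peak value for each period-finder from such a results dict. Here's a sketch, using a stand-in dict with the same layout as pfd['iep'] above:

```python
def best_periods(pfresults):
    """Return {method: (bestperiod, bestlspval)} for each period-finder
    listed in the 'pfmethods' key of a per-magcol results dict."""
    return {key: (pfresults[key]['bestperiod'], pfresults[key]['bestlspval'])
            for key in pfresults['pfmethods']}

# a stand-in with the same structure as pfd['iep'] (values made up)
fake = {'pfmethods': ['0-gls', '2-bls'],
        '0-gls': {'bestperiod': 5.148, 'bestlspval': 0.215},
        '2-bls': {'bestperiod': 5.150, 'bestlspval': 0.012}}
summary = best_periods(fake)
```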
# we can quickly visualize any specific period-finder result with the plotbase module
from astrobase import plotbase
from IPython.display import Image
# let's plot the BLS spectrum
plotbase.plot_periodbase_lsp(pfd['iep']['2-bls'])
[2018-03-16T04:48:23Z - WRN!] no output file specified and no $DISPLAY set, saving to lsp-plot.png in current directory
'/Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/lsp-plot.png'
Image('lsp-plot.png')
# This looks promising, let's plot the phased LC using the best period
# first, we need to read in the light curve
lcdict = read_csv_lightcurve('csvlcs/HAT-529-0004752-hplc.csv')
# next, use times, mags and the best period from the period-finding
# result to make a phased LC plot
plotbase.plot_phased_mag_series(lcdict['rjd'],
lcdict['iep'],
pfd['iep']['2-bls']['bestperiod'],
phasebin=0.002)
[2018-03-16T04:48:31Z - INFO] spline fit done. nknots = 30, chisq = 316674.13137, reduced chisq = 20.45699 [2018-03-16T04:48:32Z - INFO] using period: 5.150080 d and epoch: 57184.623230 [2018-03-16T04:48:32Z - WRN!] no output file specified and no $DISPLAY set, saving to magseries-phased-plot.png in current directory
(5.1500795397243442, array([ 57184.62323]), '/Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/magseries-phased-plot.png')
Image('magseries-phased-plot.png')
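For reference, the phase-folding that plot_phased_mag_series performs is straightforward to reproduce. A minimal numpy sketch on synthetic times (not the notebook's actual data):

```python
import numpy as np

def phase_fold(times, period, epoch):
    """Fold times at the given period; phase 0.0 corresponds to the epoch."""
    return ((times - epoch) / period) % 1.0

# synthetic times: one point at the epoch, one a full period later,
# and one half a period later
times = np.array([0.0, 5.15, 2.575])
phases = phase_fold(times, 5.15, epoch=0.0)
```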
Not likely to be a transiting system, then. But BLS finds anything that's vaguely box-shaped, so this is fine. We'll now run period-finding on a bunch of light curves.
target_lcs = glob.glob('csvlcs/HAT*.csv')
target_lcs = sorted(target_lcs)
print(len(target_lcs))
605
# We will use the lcproc.periodsearch.parallel_pf function to operate on a list of LCs
from astrobase.lcproc.periodsearch import parallel_pf
help(parallel_pf)
Help on function parallel_pf in module astrobase.lcproc.periodsearch: parallel_pf(lclist, outdir, timecols=None, magcols=None, errcols=None, lcformat='hat-sql', lcformatdir=None, pfmethods=('gls', 'pdm', 'mav', 'win'), pfkwargs=({}, {}, {}, {}), sigclip=10.0, getblssnr=False, nperiodworkers=8, ncontrolworkers=1, liststartindex=None, listmaxobjects=None, minobservations=500, excludeprocessed=True) This drives the overall parallel period processing for a list of LCs. As a rough benchmark, 25000 HATNet light curves with up to 50000 points per LC take about 26 days in total for an invocation of this function using GLS+PDM+BLS, 10 periodworkers, and 4 controlworkers (so all 40 'cores') on a 2 x Xeon E5-2660v3 machine. Parameters ---------- lclist : list of str The list of light curve file to process. outdir : str The output directory where the period-finding result pickles will go. timecols : list of str or None The timecol keys to use from the lcdict in calculating the features. magcols : list of str or None The magcol keys to use from the lcdict in calculating the features. errcols : list of str or None The errcol keys to use from the lcdict in calculating the features. lcformat : str This is the `formatkey` associated with your light curve format, which you previously passed in to the `lcproc.register_lcformat` function. This will be used to look up how to find and read the light curves specified in `basedir` or `use_list_of_filenames`. lcformatdir : str or None If this is provided, gives the path to a directory when you've stored your lcformat description JSONs, other than the usual directories lcproc knows to search for them in. Use this along with `lcformat` to specify an LC format JSON file that's not currently registered with lcproc. pfmethods : list of str This is a list of period finding methods to run. Each element is a string matching the keys of the `PFMETHODS` dict above. 
By default, this runs GLS, PDM, AoVMH, and the spectral window Lomb-Scargle periodogram. pfkwargs : list of dicts This is used to provide any special kwargs as dicts to each period-finding method function specified in `pfmethods`. sigclip : float or int or sequence of two floats/ints or None If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series. If a list of two ints/floats is provided, the function will perform an 'asymmetric' sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of "dimming" and "brightening" is set by *physics* (not the magnitude system), which is why the `magsarefluxes` kwarg must be correctly set. If `sigclip` is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output. getblssnr : bool If this is True and BLS is one of the methods specified in `pfmethods`, will also calculate the stats for each best period in the BLS results: transit depth, duration, ingress duration, refit period and epoch, and the SNR of the transit. nperiodworkers : int The number of parallel period-finding workers to launch per object task. ncontrolworkers : int The number of controlling processes to launch. This effectively sets how many objects from `lclist` will be processed in parallel. liststartindex : int or None This sets the index from where to start in `lclist`. listmaxobjects : int or None This sets the maximum number of objects in `lclist` to run period-finding for in this invocation. 
Together with `liststartindex`, `listmaxobjects` can be used to distribute processing over several independent machines if the number of light curves is very large. minobservations : int The minimum number of finite LC points required to process a light curve. excludeprocessed : bool If this is True, light curves that have existing period-finding result pickles in `outdir` will not be processed. FIXME: currently, this uses a dumb method of excluding already-processed files. A smarter way to do this is to (i) generate a SHA512 cachekey based on a repr of `{'lcfile', 'timecols', 'magcols', 'errcols', 'lcformat', 'pfmethods', 'sigclip', 'getblssnr', 'pfkwargs'}`, (ii) make sure all list kwargs in the dict are sorted, (iii) check if the output file has the same cachekey in its filename (last 8 chars of cachekey should work), so the result was processed in exactly the same way as specifed in the input to this function, and can therefore be ignored. Will implement this later. Returns ------- list of str A list of the period-finding pickles created for all of input LCs processed.
# the arguments for this can mostly be copied over from our
# earlier call to lcproc.runpf.
pfl = parallel_pf(target_lcs, # operate on this LC list
outdir='period-finding', # write output to this directory
lcformat='hplc', # use our custom LC format
pfmethods=['gls','pdm','bls'], # a list of period-finders to run
pfkwargs=[{}, {}, {}], # optional kwargs for each period-finder: one dict per entry in pfmethods
sigclip=10.0, # the sigclip to apply to the LC
getblssnr=False, # let's not get the SNR to save time
nperiodworkers=8, # use 8 period workers on this laptop
ncontrolworkers=1) # use 1 control worker for managing processes
[2018-03-16T04:51:27Z - WRN!] periodfinding result for csvlcs/HAT-529-0004752-hplc.csv already exists at period-finding/periodfinding-HAT-529-0004752.pkl, skipping because excludeprocessed=True [2018-03-16T05:19:43Z - WRN!] periodfinding result for csvlcs/HAT-577-0000044-hplc.csv already exists at period-finding/periodfinding-HAT-577-0000044.pkl, skipping because excludeprocessed=True [2018-03-16T05:20:43Z - WRN!] periodfinding result for csvlcs/HAT-577-0000221-hplc.csv already exists at period-finding/periodfinding-HAT-577-0000221.pkl, skipping because excludeprocessed=True [2018-03-16T05:21:45Z - WRN!] periodfinding result for csvlcs/HAT-577-0000705-hplc.csv already exists at period-finding/periodfinding-HAT-577-0000705.pkl, skipping because excludeprocessed=True [2018-03-16T05:25:30Z - WRN!] periodfinding result for csvlcs/HAT-577-0001779-hplc.csv already exists at period-finding/periodfinding-HAT-577-0001779.pkl, skipping because excludeprocessed=True [2018-03-16T05:25:31Z - WRN!] periodfinding result for csvlcs/HAT-577-0001797-hplc.csv already exists at period-finding/periodfinding-HAT-577-0001797.pkl, skipping because excludeprocessed=True [2018-03-16T05:26:12Z - WRN!] periodfinding result for csvlcs/HAT-577-0001928-hplc.csv already exists at period-finding/periodfinding-HAT-577-0001928.pkl, skipping because excludeprocessed=True [2018-03-16T05:27:33Z - WRN!] periodfinding result for csvlcs/HAT-577-0002340-hplc.csv already exists at period-finding/periodfinding-HAT-577-0002340.pkl, skipping because excludeprocessed=True [2018-03-16T05:30:17Z - WRN!] periodfinding result for csvlcs/HAT-577-0002857-hplc.csv already exists at period-finding/periodfinding-HAT-577-0002857.pkl, skipping because excludeprocessed=True [2018-03-16T05:31:39Z - WRN!] periodfinding result for csvlcs/HAT-577-0003467-hplc.csv already exists at period-finding/periodfinding-HAT-577-0003467.pkl, skipping because excludeprocessed=True [2018-03-16T05:34:44Z - WRN!] 
periodfinding result for csvlcs/HAT-577-0004295-hplc.csv already exists at period-finding/periodfinding-HAT-577-0004295.pkl, skipping because excludeprocessed=True [2018-03-16T05:36:06Z - WRN!] periodfinding result for csvlcs/HAT-577-0005172-hplc.csv already exists at period-finding/periodfinding-HAT-577-0005172.pkl, skipping because excludeprocessed=True [2018-03-16T05:38:30Z - WRN!] periodfinding result for csvlcs/HAT-577-0005901-hplc.csv already exists at period-finding/periodfinding-HAT-577-0005901.pkl, skipping because excludeprocessed=True [2018-03-16T05:42:36Z - WRN!] periodfinding result for csvlcs/HAT-577-0007232-hplc.csv already exists at period-finding/periodfinding-HAT-577-0007232.pkl, skipping because excludeprocessed=True [2018-03-16T05:43:17Z - WRN!] periodfinding result for csvlcs/HAT-577-0007519-hplc.csv already exists at period-finding/periodfinding-HAT-577-0007519.pkl, skipping because excludeprocessed=True [2018-03-16T05:44:19Z - WRN!] periodfinding result for csvlcs/HAT-577-0007645-hplc.csv already exists at period-finding/periodfinding-HAT-577-0007645.pkl, skipping because excludeprocessed=True [2018-03-16T05:46:02Z - WRN!] periodfinding result for csvlcs/HAT-577-0008322-hplc.csv already exists at period-finding/periodfinding-HAT-577-0008322.pkl, skipping because excludeprocessed=True [2018-03-16T06:03:49Z - WRN!] periodfinding result for csvlcs/HAT-577-0018039-hplc.csv already exists at period-finding/periodfinding-HAT-577-0018039.pkl, skipping because excludeprocessed=True [2018-03-16T06:06:13Z - WRN!] periodfinding result for csvlcs/HAT-577-0019056-hplc.csv already exists at period-finding/periodfinding-HAT-577-0019056.pkl, skipping because excludeprocessed=True [2018-03-16T06:28:24Z - WRN!] periodfinding result for csvlcs/HAT-577-0035505-hplc.csv already exists at period-finding/periodfinding-HAT-577-0035505.pkl, skipping because excludeprocessed=True [2018-03-16T07:58:34Z - WRN!] 
periodfinding result for csvlcs/HAT-577-1792911-hplc.csv already exists at period-finding/periodfinding-HAT-577-1792911.pkl, skipping because excludeprocessed=True
# Looks like we're done. This took about 4 hours on a 2015 MacBook Pro (4-core/8-thread 2.2 GHz Intel i7)
!ls period-finding | wc -l
605
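As the `parallel_pf` docstring above notes, `liststartindex` and `listmaxobjects` can be used together to distribute period-finding over several independent machines. A minimal sketch of computing the per-machine chunks (the `split_indices` helper is ours, not part of astrobase):

```python
def split_indices(n_lcs, n_machines):
    """Compute (liststartindex, listmaxobjects) pairs for distributing
    parallel_pf invocations over several machines."""
    base, extra = divmod(n_lcs, n_machines)
    chunks, start = [], 0
    for i in range(n_machines):
        size = base + (1 if i < extra else 0)  # spread the remainder evenly
        chunks.append((start, size))
        start += size
    return chunks

# e.g. for our 605 LCs over 3 machines, machine m would then call:
# parallel_pf(lclist, 'period-finding', lcformat='hplc',
#             liststartindex=start, listmaxobjects=size)
```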
Now that we're done with period-finding, we can go ahead and start making checkplots.
Checkplots contain all the information necessary to characterize an object's variability. At a minimum, they contain all the information from the object's light curve, the object's magnitudes, colors, and GAIA information, and phased light curves generated using results from period-finding. We can add more optional information to checkplots, including: neighbor light curves to check for blending, cross-matches against external catalogs, and color-magnitude diagrams of the light curve collection.
Below, we'll first make a checkplot for a single object and visualize it as a PNG.
# To make a checkplot complete with neighbor and xmatch info,
# we'll use lcproc.runcp
from astrobase.lcproc.checkplotgen import runcp
help(runcp)
Help on function runcp in module astrobase.lcproc.checkplotgen: runcp(pfpickle, outdir, lcbasedir, lcfname=None, cprenorm=False, lclistpkl=None, nbrradiusarcsec=60.0, maxnumneighbors=5, makeneighborlcs=True, fast_mode=False, gaia_max_timeout=60.0, gaia_mirror=None, xmatchinfo=None, xmatchradiusarcsec=3.0, minobservations=99, sigclip=10.0, lcformat='hat-sql', lcformatdir=None, timecols=None, magcols=None, errcols=None, skipdone=False, done_callback=None, done_callback_args=None, done_callback_kwargs=None) This makes a checkplot pickle for the given period-finding result pickle produced by `lcproc.periodfinding.runpf`. Parameters ---------- pfpickle : str or None This is the filename of the period-finding result pickle file created by `lcproc.periodfinding.runpf`. If this is None, the checkplot will be made anyway, but no phased LC information will be collected into the output checkplot pickle. This can be useful for just collecting GAIA and other external information and making LC plots for an object. outdir : str This is the directory to which the output checkplot pickle will be written. lcbasedir : str The base directory where this function will look for the light curve file associated with the object in the input period-finding result pickle file. lcfname : str or None This is usually None because we'll get the path to the light curve associated with this period-finding pickle from the pickle itself. If `pfpickle` is None, however, this function will use `lcfname` to look up the light curve file instead. If both are provided, the value of `lcfname` takes precedence. Providing the light curve file name in this kwarg is useful when you're making checkplots directly from light curve files and not including period-finder results (perhaps because period-finding takes a long time for large collections of LCs). cprenorm : bool Set this to True if the light curves should be renormalized by `checkplot.checkplot_pickle`. 
This is set to False by default because we do our own normalization in this function using the light curve's registered normalization function and pass the normalized times, mags, errs to the `checkplot.checkplot_pickle` function. lclistpkl : str or dict This is either the filename of a pickle or the actual dict produced by lcproc.make_lclist. This is used to gather neighbor information. nbrradiusarcsec : float The radius in arcseconds to use for a search conducted around the coordinates of this object to look for any potential confusion and blending of variability amplitude caused by their proximity. maxnumneighbors : int The maximum number of neighbors that will have their light curves and magnitudes noted in this checkplot as potential blends with the target object. makeneighborlcs : bool If True, will make light curve and phased light curve plots for all neighbors to the current object found in the catalog passed in using `lclistpkl`. fast_mode : bool or float This runs the external catalog operations in a "fast" mode, with short timeouts and not trying to hit external catalogs that take a long time to respond. If this is set to True, the default settings for the external requests will then become:: skyview_lookup = False skyview_timeout = 10.0 skyview_retry_failed = False dust_timeout = 10.0 gaia_submit_timeout = 7.0 gaia_max_timeout = 10.0 gaia_submit_tries = 2 complete_query_later = False search_simbad = False If this is a float, will run in "fast" mode with the provided timeout value in seconds and the following settings:: skyview_lookup = True skyview_timeout = fast_mode skyview_retry_failed = False dust_timeout = fast_mode gaia_submit_timeout = 0.66*fast_mode gaia_max_timeout = fast_mode gaia_submit_tries = 2 complete_query_later = False search_simbad = False gaia_max_timeout : float Sets the timeout in seconds to use when waiting for the GAIA service to respond to our request for the object's information. Note that if `fast_mode` is set, this is ignored. 
gaia_mirror : str or None This sets the GAIA mirror to use. This is a key in the `services.gaia.GAIA_URLS` dict which defines the URLs to hit for each mirror. xmatchinfo : str or dict This is either the xmatch dict produced by the function `load_xmatch_external_catalogs` above, or the path to the xmatch info pickle file produced by that function. xmatchradiusarcsec : float This is the cross-matching radius to use in arcseconds. minobservations : int The minimum of observations the input object's mag/flux time-series must have for this function to plot its light curve and phased light curve. If the object has less than this number, no light curves will be plotted, but the checkplotdict will still contain all of the other information. sigclip : float or int or sequence of two floats/ints or None If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series. If a list of two ints/floats is provided, the function will perform an 'asymmetric' sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of "dimming" and "brightening" is set by *physics* (not the magnitude system), which is why the `magsarefluxes` kwarg must be correctly set. If `sigclip` is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output. lcformat : str This is the `formatkey` associated with your light curve format, which you previously passed in to the `lcproc.register_lcformat` function. This will be used to look up how to find and read the light curves specified in `basedir` or `use_list_of_filenames`. 
lcformatdir : str or None If this is provided, gives the path to a directory when you've stored your lcformat description JSONs, other than the usual directories lcproc knows to search for them in. Use this along with `lcformat` to specify an LC format JSON file that's not currently registered with lcproc. timecols : list of str or None The timecol keys to use from the lcdict in generating this checkplot. magcols : list of str or None The magcol keys to use from the lcdict in generating this checkplot. errcols : list of str or None The errcol keys to use from the lcdict in generating this checkplot. skipdone : bool This indicates if this function will skip creating checkplots that already exist corresponding to the current `objectid` and `magcol`. If `skipdone` is set to True, this will be done. done_callback : Python function or None This is used to provide a function to execute after the checkplot pickles are generated. This is useful if you want to stream the results of checkplot making to some other process, e.g. directly running an ingestion into an LCC-Server collection. The function will always get the list of the generated checkplot pickles as its first arg, and all of the kwargs for runcp in the kwargs dict. Additional args and kwargs can be provided by giving a list in the `done_callbacks_args` kwarg and a dict in the `done_callbacks_kwargs` kwarg. NOTE: the function you pass in here should be pickleable by normal Python if you want to use it with the parallel_cp and parallel_cp_lcdir functions below. done_callback_args : tuple or None If not None, contains any args to pass into the `done_callback` function. done_callback_kwargs : dict or None If not None, contains any kwargs to pass into the `done_callback` function. Returns ------- list of str This returns a list of checkplot pickle filenames with one element for each (timecol, magcol, errcol) combination provided in the default lcformat config or in the timecols, magcols, errcols kwargs.
# Let's make a checkplot for HAT-577-0001797
cpf = runcp(
'period-finding/periodfinding-HAT-577-0001797.pkl', # the period-finding result to use
'checkplots', # output directory to write checkplot pickles to
'csvlcs', # the directory containing the light curves
lclistpkl='IC4725-catalog.pkl', # the object catalog pickle we made with neighbor info
nbrradiusarcsec=60.0, # maximum distance in arcsec to look for neighbors
xmatchinfo='xmatch-cats/IC4725-xmatch-catalogs.pkl', # xmatch pickle we made
xmatchradiusarcsec=3.0, # maximum cross-match distance to use in arcsec
sigclip=10.0, # sigclip to apply to LC
lcformat='hplc' # our custom LC format
)
[2018-03-16T15:29:00Z - WRN!] one or more of the 'pmra', 'pmdecl', 'jmag' keys are missing from the input objectinfo dict, can't get proper motion features [2018-03-16T15:29:00Z - WRN!] failed: no GAIA objects found within 3.0 of object position (277.796, -19.330), closest object is at 10.685 arcsec away [2018-03-16T15:29:01Z - ERR!] spline fit failed, trying SavGol fit
/Users/waqasbhatti/py36-venv/lib/python3.6/site-packages/scipy/linalg/basic.py:1226: RuntimeWarning: internal gelsd driver lwork query error, required iwork dimension not returned. This is likely the result of LAPACK bug 0038, fixed in LAPACK 3.2.2 (released July 21, 2010). Falling back to 'gelss' driver. warnings.warn(mesg, RuntimeWarning)
[2018-03-16T15:29:05Z - ERR!] spline fit failed, trying SavGol fit [2018-03-16T15:29:13Z - ERR!] no neighbors for HAT-577-0001797, not updating... [2018-03-16T15:29:14Z - INFO] done with HAT-577-0001797 -> ['/Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/checkplots/checkplot-HAT-577-0001797-iep.pkl']
# we can visualize a checkplot pickle by converting it to PNG
checkplot.cp2png(cpf[0])
[2018-03-16T15:33:17Z - INFO] checkplot pickle -> checkplot PNG: /Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/checkplots/checkplot-HAT-577-0001797-iep.png OK
'/Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/checkplots/checkplot-HAT-577-0001797-iep.png'
Image('/Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/checkplots/checkplot-HAT-577-0001797-iep.png')
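If you'd rather inspect a checkplot's contents programmatically than as a PNG, the pickle can be loaded directly. A minimal sketch (checkplot pickles may be gzipped depending on how they were written; `load_checkplot` is our helper, not an astrobase function):

```python
import gzip
import pickle

def load_checkplot(cpfile):
    """Load a checkplot pickle into a dict; handles optional gzip compression."""
    opener = gzip.open if cpfile.endswith('.gz') else open
    with opener(cpfile, 'rb') as f:
        return pickle.load(f)

# cpd = load_checkplot(cpf[0])
# keys like cpd['objectinfo'] and cpd['varinfo'] are then directly accessible
```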
The canonical way of looking at checkplot pickles, especially large collections of them, is to use the checkplotserver webapp. We'll get into this later on. For now, let's make checkplots for the rest of our light curves. After that, we'll see how to add color-mag-diagram information to the checkplots in final preparation for browsing through them with the checkplotserver.
# We'll use lcproc.parallel_cp_pfdir and copy over our args
# from the previous call to lcproc.runcp
from astrobase.lcproc.checkplotgen import parallel_cp_pfdir
help(parallel_cp_pfdir)
Help on function parallel_cp_pfdir in module astrobase.lcproc.checkplotgen: parallel_cp_pfdir(pfpickledir, outdir, lcbasedir, pfpickleglob='periodfinding-*.pkl*', lclistpkl=None, cprenorm=False, nbrradiusarcsec=60.0, maxnumneighbors=5, makeneighborlcs=True, fast_mode=False, gaia_max_timeout=60.0, gaia_mirror=None, xmatchinfo=None, xmatchradiusarcsec=3.0, minobservations=99, sigclip=10.0, lcformat='hat-sql', lcformatdir=None, timecols=None, magcols=None, errcols=None, skipdone=False, done_callback=None, done_callback_args=None, done_callback_kwargs=None, maxobjects=None, nworkers=32) This drives the parallel execution of `runcp` for a directory of periodfinding pickles. Parameters ---------- pfpickledir : str This is the directory containing all of the period-finding pickles to process. outdir : str The directory the checkplot pickles will be written to. lcbasedir : str The base directory that this function will look in to find the light curves pointed to by the period-finding result files. If you're using `lcfnamelist` to provide a list of light curve filenames directly, this arg is ignored. pkpickleglob : str This is a UNIX file glob to select period-finding result pickles in the specified `pfpickledir`. lclistpkl : str or dict This is either the filename of a pickle or the actual dict produced by lcproc.make_lclist. This is used to gather neighbor information. cprenorm : bool Set this to True if the light curves should be renormalized by `checkplot.checkplot_pickle`. This is set to False by default because we do our own normalization in this function using the light curve's registered normalization function and pass the normalized times, mags, errs to the `checkplot.checkplot_pickle` function. nbrradiusarcsec : float The radius in arcseconds to use for a search conducted around the coordinates of this object to look for any potential confusion and blending of variability amplitude caused by their proximity. 
maxnumneighbors : int The maximum number of neighbors that will have their light curves and magnitudes noted in this checkplot as potential blends with the target object. makeneighborlcs : bool If True, will make light curve and phased light curve plots for all neighbors found in the object collection for each input object. fast_mode : bool or float This runs the external catalog operations in a "fast" mode, with short timeouts and not trying to hit external catalogs that take a long time to respond. If this is set to True, the default settings for the external requests will then become:: skyview_lookup = False skyview_timeout = 10.0 skyview_retry_failed = False dust_timeout = 10.0 gaia_submit_timeout = 7.0 gaia_max_timeout = 10.0 gaia_submit_tries = 2 complete_query_later = False search_simbad = False If this is a float, will run in "fast" mode with the provided timeout value in seconds and the following settings:: skyview_lookup = True skyview_timeout = fast_mode skyview_retry_failed = False dust_timeout = fast_mode gaia_submit_timeout = 0.66*fast_mode gaia_max_timeout = fast_mode gaia_submit_tries = 2 complete_query_later = False search_simbad = False gaia_max_timeout : float Sets the timeout in seconds to use when waiting for the GAIA service to respond to our request for the object's information. Note that if `fast_mode` is set, this is ignored. gaia_mirror : str or None This sets the GAIA mirror to use. This is a key in the `services.gaia.GAIA_URLS` dict which defines the URLs to hit for each mirror. xmatchinfo : str or dict This is either the xmatch dict produced by the function `load_xmatch_external_catalogs` above, or the path to the xmatch info pickle file produced by that function. xmatchradiusarcsec : float This is the cross-matching radius to use in arcseconds. minobservations : int The minimum of observations the input object's mag/flux time-series must have for this function to plot its light curve and phased light curve. 
If the object has less than this number, no light curves will be plotted, but the checkplotdict will still contain all of the other information. sigclip : float or int or sequence of two floats/ints or None If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series. If a list of two ints/floats is provided, the function will perform an 'asymmetric' sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, `sigclip=[10., 3.]`, will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of "dimming" and "brightening" is set by *physics* (not the magnitude system), which is why the `magsarefluxes` kwarg must be correctly set. If `sigclip` is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output. lcformat : str This is the `formatkey` associated with your light curve format, which you previously passed in to the `lcproc.register_lcformat` function. This will be used to look up how to find and read the light curves specified in `basedir` or `use_list_of_filenames`. lcformatdir : str or None If this is provided, gives the path to a directory when you've stored your lcformat description JSONs, other than the usual directories lcproc knows to search for them in. Use this along with `lcformat` to specify an LC format JSON file that's not currently registered with lcproc. timecols : list of str or None The timecol keys to use from the lcdict in generating this checkplot. magcols : list of str or None The magcol keys to use from the lcdict in generating this checkplot. errcols : list of str or None The errcol keys to use from the lcdict in generating this checkplot. 
skipdone : bool This indicates if this function will skip creating checkplots that already exist corresponding to the current `objectid` and `magcol`. If `skipdone` is set to True, this will be done. done_callback : Python function or None This is used to provide a function to execute after the checkplot pickles are generated. This is useful if you want to stream the results of checkplot making to some other process, e.g. directly running an ingestion into an LCC-Server collection. The function will always get the list of the generated checkplot pickles as its first arg, and all of the kwargs for runcp in the kwargs dict. Additional args and kwargs can be provided by giving a list in the `done_callbacks_args` kwarg and a dict in the `done_callbacks_kwargs` kwarg. NOTE: the function you pass in here should be pickleable by normal Python if you want to use it with the parallel_cp and parallel_cp_lcdir functions below. done_callback_args : tuple or None If not None, contains any args to pass into the `done_callback` function. done_callback_kwargs : dict or None If not None, contains any kwargs to pass into the `done_callback` function. maxobjects : int The maximum number of objects to process in this run. nworkers : int The number of parallel workers that will work on the checkplot generation process. Returns ------- dict This returns a dict with keys = input period-finding pickles and vals = list of the corresponding checkplot pickles produced.
cpl = parallel_cp_pfdir(
'period-finding', # the directory where all the period-finding result pickles are
'checkplots', # directory to where the output will go
'csvlcs', # the directory containing the light curves
lclistpkl='IC4725-catalog.pkl', # the object catalog pickle we made with neighbor info
nbrradiusarcsec=60.0, # maximum distance in arcsec to look for neighbors
xmatchinfo='xmatch-cats/IC4725-xmatch-catalogs.pkl', # xmatch pickle we made
xmatchradiusarcsec=3.0, # maximum cross-match distance to use in arcsec
sigclip=10.0, # sigclip to apply to LC
lcformat='hplc', # our custom LC format
nworkers=4 # number of parallel workers to launch
)
# This took about an hour on the same machine
# I ran it outside the notebook because it produces a lot of output
!ls checkplots/*.pkl | wc -l
605
This is an optional step, but it can be useful when looking at collections of objects for which we have distance information via parallaxes, or when the objects all belong to a stellar cluster (as is the case here).
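As a reminder of the quantities we'll use on the CMD y-axes below: `gaia_absmag` is an absolute magnitude computed from the GAIA parallax, and `rpmj` is the J-band reduced proper motion, a rough proxy for absolute magnitude when no parallax is available. A sketch of the standard formulas (helper names are ours, not astrobase's):

```python
import math

def absolute_mag(app_mag, parallax_mas):
    """M = m + 5*log10(parallax_arcsec) + 5, with the parallax in mas."""
    return app_mag + 5.0 * math.log10(parallax_mas / 1000.0) + 5.0

def reduced_pm(app_mag, pm_mas_yr):
    """H = m + 5*log10(mu_arcsec_per_yr) + 5; e.g. rpmj for m = 2MASS J."""
    return app_mag + 5.0 * math.log10(pm_mas_yr / 1000.0) + 5.0
```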
# We'll use lcproc.checkplotproc.colormagdiagram_cpdir, which has the same args
# as colormagdiagram_cplist, so let's look at that first
from astrobase.lcproc.checkplotproc import colormagdiagram_cplist, colormagdiagram_cpdir
help(colormagdiagram_cplist)
Help on function colormagdiagram_cplist in module astrobase.lcproc.checkplotproc: colormagdiagram_cplist(cplist, outpkl, color_mag1=['gaiamag', 'sdssg'], color_mag2=['kmag', 'kmag'], yaxis_mag=['gaia_absmag', 'rpmj']) This makes color-mag diagrams for all checkplot pickles in the provided list. Can make an arbitrary number of CMDs given lists of x-axis colors and y-axis mags to use. Parameters ---------- cplist : list of str This is the list of checkplot pickles to process. outpkl : str The filename of the output pickle that will contain the color-mag information for all objects in the checkplots specified in `cplist`. color_mag1 : list of str This a list of the keys in each checkplot's `objectinfo` dict that will be used as color_1 in the equation:: x-axis color = color_mag1 - color_mag2 color_mag2 : list of str This a list of the keys in each checkplot's `objectinfo` dict that will be used as color_2 in the equation:: x-axis color = color_mag1 - color_mag2 yaxis_mag : list of str This is a list of the keys in each checkplot's `objectinfo` dict that will be used as the (absolute) magnitude y-axis of the color-mag diagrams. Returns ------- str The path to the generated CMD pickle file for the collection of objects in the input checkplot list. Notes ----- This can make many CMDs in one go. For example, the default kwargs for `color_mag`, `color_mag2`, and `yaxis_mag` result in two CMDs generated and written to the output pickle file: - CMD1 -> gaiamag - kmag on the x-axis vs gaia_absmag on the y-axis - CMD2 -> sdssg - kmag on the x-axis vs rpmj (J reduced PM) on the y-axis
# First, we'll generate a color-mag diagram for the entire light curve collection
# This operates on a directory of checkplots
cmd = colormagdiagram_cpdir(
'checkplots', # the directory where the checkplots are
'IC4725-colormagdiagram.pkl', # the output pickle to generate
# CMDs are generated using color_mag1[i] - color_mag2[i] as color and yaxis_mag[i] as mag
# the following will generate three CMDs:
# (1) gaiamag-kmag / gaia_absmag, (2) sdssg-kmag / rpmj, (3) gaiamag-kmag / gaiamag
# rpmj is the reduced proper motion and can be thought of as a proxy for absolute J mag
color_mag1=['gaiamag','sdssg','gaiamag'],
color_mag2=['kmag','kmag','kmag'],
yaxis_mag=['gaia_absmag','rpmj','gaiamag']
)
# cmd is a dict containing the colors and mags of the collection
# These can be plotted directly using the arrays
cmd.keys()
dict_keys(['objectids', 'mags', 'colors', 'color_mag1', 'color_mag2', 'yaxis_mag'])
cmd['mags']
array([[ nan, nan, 9.76958857], [ 1.52899762, nan, 10.27557022], [ nan, nan, 9.82191926], ..., [ nan, nan, nan], [ 4.35506667, nan, 8.91500279], [ -2.98332283, nan, 10.16701731]])
# Let's see how many objects have useful CMD info
import numpy as np
print(np.isfinite(cmd['mags'][:,0]).sum()) # GAIA absolute mag
print(np.isfinite(cmd['mags'][:,1]).sum()) # reduced proper motion
print(np.isfinite(cmd['mags'][:,2]).sum()) # GAIA mag
53 0 557
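When plotting a CMD from these arrays, it helps to keep only the rows where both the color and the mag are finite. A small sketch of that selection (the helper name is ours):

```python
import numpy as np

def finite_cmd_points(colors, mags, i):
    """Return the (color, mag) pairs for CMD index i with NaN rows dropped."""
    mask = np.isfinite(colors[:, i]) & np.isfinite(mags[:, i])
    return colors[mask, i], mags[mask, i]
```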
# let's make a gaiamag-kmag / gaiamag CMD then
import matplotlib.pyplot as plt
plt.scatter(cmd['colors'][:,2], cmd['mags'][:,2],marker='.',s=3)
plt.xlim((0,5))
plt.ylim((17,8))
plt.xlabel(r'$G - K$')
plt.ylabel(r'$G$')
Text(0,0.5,'$G$')
# That looks mostly like giant stars plus a sparse main sequence
# The next step is to add the CMD to each checkplot
# this will overplot the object's position in the CMD
# so it stands out easily
# We'll use lcproc.checkplotproc.add_cmds_cpdir for this
from astrobase.lcproc.checkplotproc import add_cmds_cpdir
help(add_cmds_cpdir)
Help on function add_cmds_cpdir in module astrobase.lcproc.checkplotproc: add_cmds_cpdir(cpdir, cmdpkl, cpfileglob='checkplot*.pkl*', require_cmd_magcolor=True, save_cmd_pngs=False) This adds CMDs for each object in cpdir. Parameters ---------- cpdir : list of str This is the directory to search for checkplot pickles. cmdpkl : str This is the filename of the CMD pickle created previously. cpfileglob : str The UNIX fileglob to use when searching for checkplot pickles to operate on. require_cmd_magcolor : bool If this is True, a CMD plot will not be made if the color and mag keys required by the CMD are not present or are nan in each checkplot's objectinfo dict. save_cmd_pngs : bool If this is True, then will save the CMD plots that were generated and added back to the checkplotdict as PNGs to the same directory as `cpx`. Returns ------- Nothing.
x = add_cmds_cpdir(
'checkplots', # the directory for the checkplots
'IC4725-colormagdiagram.pkl', # the CMD pickle we made above
save_cmd_pngs=True # let's save the CMD for each object as a PNG as well
)
# Let's see what these CMDs look like
# first, let's look at a GAIA G - 2MASS K vs. GAIA absolute G CMD
Image('checkplots/cmd-HAT-577-0002140-gaiamag-kmag.gaia_absmag.png')
# this looks like a giant star...
# then for the same object, let's look at a GAIA G - 2MASS K vs. GAIA G CMD
Image('checkplots/cmd-HAT-577-0002140-gaiamag-kmag.gaiamag.png')
At this point, we're done with preparing our checkplots for review. Next, we'll fire up the checkplotserver to do so.
Before we can start, we need to generate a checkplot list JSON file, which the checkplotserver uses to find the checkplots. This is done with the checkplotlist
script that comes with astrobase. It should be on your path, so you can invoke it as below.
!ls
IC4725-catalog.pkl magseries-phased-plot.png IC4725-colormagdiagram.pkl notebook.tex IC4725-hatpi-observations-catalog.txt output_53_0.png checkplots output_55_0.png csvlcs period-finding lc-collection-work.ipynb pklcs lsp-plot.png xmatch-cats
# This invocation will generate a checkplot list JSON and sort the checkplots
# so that objects with smallest variability index eta (i.e. very probable variables)
# are near the top of the list. It will also filter out anything that doesn't have
# neighboring stars (a contrived example to show how neighboring stars show up in the
# checkplotserver interface).
# Filtering and sorting the checkplots like this can help keep the number
# of objects that need to be manually reviewed small enough to manage. Use
# checkplotlist --help to see other examples and details on sorting/filtering checkplots
!checkplotlist pkl checkplots --sortby 'varinfo.eta_normal|asc' --filterby 'objectinfo.neighbors|gt@0'
searching for checkplots: checkplots/*checkplot*.pkl
found 605 checkplot files in dir: checkplots
sorting checkplot pickles by varinfo.eta_normal in order: asc
filtering checkplot pickles by ['objectinfo.neighbors'] using: ['gt@0']
retrieving checkplot info using 8 workers...
filters applied: ['objectinfo.neighbors|gt@0'] -> objects found: 6
checkplot file list written to /Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/objectinfo.neighbors_gt_0-varinfo.eta_normal_asc-checkplot-filelist.json
Looks like there are only 6 objects in this LC collection with neighbors (within 60 arcsec).
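If you want to work with the generated checkplot file list programmatically (e.g. to count or re-filter its entries before starting the server), it's plain JSON. Here's a minimal sketch that parses a mocked-up example in memory; it assumes the file has a top-level `"checkplots"` key holding the list of checkplot pickle paths, which you should confirm by inspecting your own `*-checkplot-filelist.json`:

```python
import json

# Mocked-up file list using the structure this sketch assumes: a top-level
# "checkplots" key with the list of checkplot pickle paths. Inspect the real
# *-checkplot-filelist.json to confirm the keys before relying on this.
example_json = '''
{
  "checkplots": [
    "checkplots/checkplot-HAT-577-0113042-iep.pkl",
    "checkplots/checkplot-HAT-577-0275323-iep.pkl"
  ]
}
'''

filelist = json.loads(example_json)
print(len(filelist['checkplots']))
# -> 2
```

For the real file, you'd use `json.load(open('objectinfo.neighbors_gt_0-varinfo.eta_normal_asc-checkplot-filelist.json'))` instead of parsing the in-memory string.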
# Now we can go ahead and start the checkplotserver
!checkplotserver --checkplotlist=objectinfo.neighbors_gt_0-varinfo.eta_normal_asc-checkplot-filelist.json
[I 180316 14:09:34 checkplotserver:152] using provided checkplot list file: objectinfo.neighbors_gt_0-varinfo.eta_normal_asc-checkplot-filelist.json
[I 180316 14:09:34 checkplotserver:304] started checkplotserver. listening on http://127.0.0.1:5225
[I 180316 14:09:38 web:2063] 200 GET / (127.0.0.1) 14.16ms
[I 180316 14:09:38 web:2063] 200 GET /list (127.0.0.1) 0.43ms
[I 180316 14:09:38 checkplotserver_handlers:502] loading /Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/checkplots/checkplot-HAT-577-0113042-iep.pkl...
[I 180316 14:09:38 checkplotserver_handlers:524] loaded /Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/checkplots/checkplot-HAT-577-0113042-iep.pkl
[I 180316 14:09:38 web:2063] 200 GET /cp/Y2hlY2twbG90cy9jaGVja3Bsb3QtSEFULTU3Ny0wMTEzMDQyLWllcC5wa2w= (127.0.0.1) 184.04ms
[I 180316 14:10:06 checkplotserver_handlers:924] loading /Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/checkplots/checkplot-HAT-577-0113042-iep.pkl...
[I 180316 14:10:06 checkplotserver_handlers:945] updated checkplot /Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/checkplots/checkplot-HAT-577-0113042-iep.pkl successfully
[I 180316 14:10:06 web:2063] 200 POST /cp/Y2hlY2twbG90cy9jaGVja3Bsb3QtSEFULTU3Ny0wMTEzMDQyLWllcC5wa2w= (127.0.0.1) 124.84ms
[I 180316 14:10:06 checkplotserver_handlers:1097] wrote all changes to the checkplot filelist from the frontend for object: HAT-577-0113042
[I 180316 14:10:06 web:2063] 200 POST /list (127.0.0.1) 15.41ms
[I 180316 14:10:06 checkplotserver_handlers:502] loading /Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/checkplots/checkplot-HAT-577-0275323-iep.pkl...
[I 180316 14:10:06 checkplotserver_handlers:524] loaded /Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/checkplots/checkplot-HAT-577-0275323-iep.pkl
[I 180316 14:10:06 web:2063] 200 GET /cp/Y2hlY2twbG90cy9jaGVja3Bsb3QtSEFULTU3Ny0wMjc1MzIzLWllcC5wa2w= (127.0.0.1) 132.83ms
[I 180316 14:10:11 checkplotserver_handlers:924] loading /Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/checkplots/checkplot-HAT-577-0275323-iep.pkl...
[I 180316 14:10:11 checkplotserver_handlers:945] updated checkplot /Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/checkplots/checkplot-HAT-577-0275323-iep.pkl successfully
[I 180316 14:10:11 web:2063] 200 POST /cp/Y2hlY2twbG90cy9jaGVja3Bsb3QtSEFULTU3Ny0wMjc1MzIzLWllcC5wa2w= (127.0.0.1) 86.16ms
[I 180316 14:10:11 checkplotserver_handlers:1097] wrote all changes to the checkplot filelist from the frontend for object: HAT-577-0275323
[I 180316 14:10:11 web:2063] 200 POST /list (127.0.0.1) 22.51ms
[I 180316 14:10:11 checkplotserver_handlers:502] loading /Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/checkplots/checkplot-HAT-577-0334488-iep.pkl...
[I 180316 14:10:11 checkplotserver_handlers:524] loaded /Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/checkplots/checkplot-HAT-577-0334488-iep.pkl
[I 180316 14:10:11 web:2063] 200 GET /cp/Y2hlY2twbG90cy9jaGVja3Bsb3QtSEFULTU3Ny0wMzM0NDg4LWllcC5wa2w= (127.0.0.1) 126.80ms
[I 180316 14:10:19 checkplotserver_handlers:924] loading /Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/checkplots/checkplot-HAT-577-0334488-iep.pkl...
[I 180316 14:10:19 checkplotserver_handlers:945] updated checkplot /Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/checkplots/checkplot-HAT-577-0334488-iep.pkl successfully
[I 180316 14:10:19 web:2063] 200 POST /cp/Y2hlY2twbG90cy9jaGVja3Bsb3QtSEFULTU3Ny0wMzM0NDg4LWllcC5wa2w= (127.0.0.1) 77.79ms
[I 180316 14:10:19 checkplotserver_handlers:1097] wrote all changes to the checkplot filelist from the frontend for object: HAT-577-0334488
[I 180316 14:10:19 web:2063] 200 POST /list (127.0.0.1) 26.08ms
[I 180316 14:10:19 checkplotserver_handlers:502] loading /Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/checkplots/checkplot-HAT-577-0002374-iep.pkl...
[I 180316 14:10:19 checkplotserver_handlers:524] loaded /Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/checkplots/checkplot-HAT-577-0002374-iep.pkl
[I 180316 14:10:19 web:2063] 200 GET /cp/Y2hlY2twbG90cy9jaGVja3Bsb3QtSEFULTU3Ny0wMDAyMzc0LWllcC5wa2w= (127.0.0.1) 141.63ms
[I 180316 14:14:15 checkplotserver_handlers:924] loading /Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/checkplots/checkplot-HAT-577-0002374-iep.pkl...
[I 180316 14:14:15 checkplotserver_handlers:945] updated checkplot /Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/checkplots/checkplot-HAT-577-0002374-iep.pkl successfully
[I 180316 14:14:15 web:2063] 200 POST /cp/Y2hlY2twbG90cy9jaGVja3Bsb3QtSEFULTU3Ny0wMDAyMzc0LWllcC5wa2w= (127.0.0.1) 94.76ms
[I 180316 14:14:15 checkplotserver_handlers:1097] wrote all changes to the checkplot filelist from the frontend for object: HAT-577-0002374
[I 180316 14:14:15 web:2063] 200 POST /list (127.0.0.1) 29.74ms
[I 180316 14:14:15 checkplotserver_handlers:502] loading /Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/checkplots/checkplot-HAT-577-0002740-iep.pkl...
[I 180316 14:14:15 checkplotserver_handlers:524] loaded /Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/checkplots/checkplot-HAT-577-0002740-iep.pkl
[I 180316 14:14:15 web:2063] 200 GET /cp/Y2hlY2twbG90cy9jaGVja3Bsb3QtSEFULTU3Ny0wMDAyNzQwLWllcC5wa2w= (127.0.0.1) 134.01ms
[I 180316 14:14:37 checkplotserver_handlers:924] loading /Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/checkplots/checkplot-HAT-577-0002740-iep.pkl...
[I 180316 14:14:37 checkplotserver_handlers:945] updated checkplot /Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/checkplots/checkplot-HAT-577-0002740-iep.pkl successfully
[I 180316 14:14:37 web:2063] 200 POST /cp/Y2hlY2twbG90cy9jaGVja3Bsb3QtSEFULTU3Ny0wMDAyNzQwLWllcC5wa2w= (127.0.0.1) 78.38ms
[I 180316 14:14:37 checkplotserver_handlers:1097] wrote all changes to the checkplot filelist from the frontend for object: HAT-577-0002740
[I 180316 14:14:37 web:2063] 200 POST /list (127.0.0.1) 34.19ms
[I 180316 14:14:37 checkplotserver_handlers:502] loading /Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/checkplots/checkplot-HAT-577-0002374-iep.pkl...
[I 180316 14:14:37 checkplotserver_handlers:524] loaded /Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/checkplots/checkplot-HAT-577-0002374-iep.pkl
[I 180316 14:14:37 web:2063] 200 GET /cp/Y2hlY2twbG90cy9jaGVja3Bsb3QtSEFULTU3Ny0wMDAyMzc0LWllcC5wa2w= (127.0.0.1) 121.51ms
[I 180316 14:15:34 checkplotserver_handlers:1294] xkwarg = startp, wbkwarg = None
[I 180316 14:15:34 checkplotserver_handlers:1294] xkwarg = endp, wbkwarg = None
[I 180316 14:15:34 checkplotserver_handlers:1294] xkwarg = magsarefluxes, wbkwarg = 'false'
[I 180316 14:15:34 checkplotserver_handlers:1294] xkwarg = autofreq, wbkwarg = 'true'
[I 180316 14:15:34 checkplotserver_handlers:1294] xkwarg = stepsize, wbkwarg = None
[I 180316 14:15:34 checkplotserver_handlers:1294] xkwarg = nbestpeaks, wbkwarg = '10'
[I 180316 14:15:34 checkplotserver_handlers:1294] xkwarg = sigclip[], wbkwarg = None
[I 180316 14:15:34 checkplotserver_handlers:1294] xkwarg = lctimefilters, wbkwarg = ''
[I 180316 14:15:34 checkplotserver_handlers:1294] xkwarg = lcmagfilters, wbkwarg = ''
[I 180316 14:15:34 checkplotserver_handlers:1294] xkwarg = periodepsilon, wbkwarg = '0.1'
[I 180316 14:15:34 checkplotserver_handlers:1294] xkwarg = nharmonics, wbkwarg = '6'
[I 180316 14:15:34 checkplotserver_handlers:1379] loading /Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/checkplots/checkplot-HAT-577-0002374-iep.pkl...
[I 180316 14:15:34 checkplotserver_handlers:1428] forcereload = True
[I 180316 14:15:34 checkplotserver_handlers:1496] psearch-mav
[I 180316 14:15:34 checkplotserver_handlers:1497] [array([ 57170.64095, 57170.64139, 57170.64194, ..., 57274.63173, 57274.63222, 57274.63355]), array([-0.019855, -0.028955, -0.029025, ..., -0.0078 , -0.0401 , 0.01488 ]), array([ 0.0067 , 0.0064 , 0.00661, ..., 0.00619, 0.00625, 0.0063 ])]
[I 180316 14:15:34 checkplotserver_handlers:1498] {'startp': None, 'endp': None, 'magsarefluxes': False, 'autofreq': True, 'stepsize': 0.0001, 'nbestpeaks': 10, 'sigclip': None, 'lctimefilters': '', 'lcmagfilters': '', 'periodepsilon': 0.1, 'nharmonics': 6}
[I 180316 14:15:34 smav:39] using autofreq with 5195 frequency points, start P = 0.100, end P = 103.993
[I 180316 14:15:34 smav:39] using 8 workers...
/Users/waqasbhatti/py36-venv/lib/python3.6/site-packages/numpy/core/numeric.py:531: ComplexWarning: Casting complex values to real discards the imaginary part
  return array(a, dtype, copy=False, order=order)
/Users/waqasbhatti/py36-venv/lib/python3.6/site-packages/matplotlib/text.py:1742: ComplexWarning: Casting complex values to real discards the imaginary part
  y = float(self.convert_yunits(y))
[I 180316 14:15:46 lcfit:62] spline fit done. nknots = 30, chisq = 76897.29376, reduced chisq = 4.96849
[I 180316 14:15:46 checkplot:86] plotting mav phased LC with period -1: 5.148149, epoch: 57200.83565
[E 180316 14:15:48 checkplot:95] spline fit failed, trying SavGol fit
[I 180316 14:15:48 lcfit:62] applying Savitzky-Golay filter with window length 51 and polynomial degree 2 to mag series with 15508 observations, using period 10.399260, folded at 57170.640950
/Users/waqasbhatti/py36-venv/lib/python3.6/site-packages/scipy/linalg/basic.py:1226: RuntimeWarning: internal gelsd driver lwork query error, required iwork dimension not returned. This is likely the result of LAPACK bug 0038, fixed in LAPACK 3.2.2 (released July 21, 2010). Falling back to 'gelss' driver.
  warnings.warn(mesg, RuntimeWarning)
[I 180316 14:15:48 lcfit:62] SG filter applied. chisq = 63435.02224, reduced chisq = -99.00000
[I 180316 14:15:48 checkplot:86] plotting mav phased LC with period -1: 10.399260, epoch: 57195.80004
[I 180316 14:15:49 lcfit:62] spline fit done. nknots = 30, chisq = 79607.69561, reduced chisq = 5.14361
[I 180316 14:15:49 checkplot:86] plotting mav phased LC with period -1: 15.293029, epoch: 57200.70458
[I 180316 14:15:51 checkplotserver_handlers:1898] saved temp results from psearch-mav to checkplot: /Users/waqasbhatti/astrowork/frames-and-lcs/hatpi-G577-projid12-IC4725-cluster-lcs/checkplots/checkplot-HAT-577-0002374-iep.pkl-cpserver-temp
[I 180316 14:15:51 web:2063] 200 GET /tools/Y2hlY2twbG90cy9jaGVja3Bsb3QtSEFULTU3Ny0wMDAyMzc0LWllcC5wa2w=?objectid=HAT-577-0002374&lctool=psearch-mav&forcereload=true&magsarefluxes=false&autofreq=true&sigclip=&nbestpeaks=10&lctimefilters=&lcmagfilters=&periodepsilon=0.1&nharmonics=6 (127.0.0.1) 16566.13ms
^C
Process Process-2:
Process Process-1:
[I 180316 14:17:24 checkplotserver:317] received Ctrl-C: shutting down...
Traceback (most recent call last):
Traceback (most recent call last):
  File "/usr/local/Cellar/python/3.6.4_4/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python/3.6.4_4/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/Cellar/python/3.6.4_4/Frameworks/Python.framework/Versions/3.6/lib/python3.6/concurrent/futures/process.py", line 169, in _process_worker
    call_item = call_queue.get(block=True)
  File "/usr/local/Cellar/python/3.6.4_4/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python/3.6.4_4/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/queues.py", line 94, in get
    res = self._recv_bytes()
  File "/usr/local/Cellar/python/3.6.4_4/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/connection.py", line 216, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "/usr/local/Cellar/python/3.6.4_4/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/Cellar/python/3.6.4_4/Frameworks/Python.framework/Versions/3.6/lib/python3.6/concurrent/futures/process.py", line 169, in _process_worker
    call_item = call_queue.get(block=True)
  File "/usr/local/Cellar/python/3.6.4_4/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/queues.py", line 93, in get
    with self._rlock:
  File "/usr/local/Cellar/python/3.6.4_4/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/usr/local/Cellar/python/3.6.4_4/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/synchronize.py", line 96, in __enter__
    return self._semlock.__enter__()
  File "/Users/waqasbhatti/mycode/astrobase/astrobase/cpserver/checkplotserver.py", line 40, in recv_sigint
    raise KeyboardInterrupt
  File "/usr/local/Cellar/python/3.6.4_4/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
  File "/Users/waqasbhatti/mycode/astrobase/astrobase/cpserver/checkplotserver.py", line 40, in recv_sigint
    raise KeyboardInterrupt
KeyboardInterrupt
KeyboardInterrupt
Now let's look at some of the UI elements of the checkplotserver.
Image('/Users/waqasbhatti/Downloads/screencapture-127-0-0-1-5225-2018-03-16-14_13_36.png')
We can see that all of our prep work has made it safely to the checkplotserver for use in reviewing objects.

First, the "Object Overview" tab. The object list is populated from the checkplot list JSON we generated, in the sorted and filtered order set by the checkplotlist arguments. Clicking on any of these entries will load that checkplot pickle. Your review information is saved back to the same JSON file produced by checkplotlist. It is updated and saved every time a new checkplot is loaded, so you can come back to checkplotserver (even if you stop the process for whatever reason) after some time and resume where you left off by clicking on the last reviewed object. The current reviewed objects list can be exported to either CSV or JSON by clicking on the appropriate buttons. The filter select box at the bottom can be used to toggle between objects that have been marked as probable variables, maybe variables, not variables, or all objects in the collection, to browse these subsets in isolation.

If the object's CMDs are available, you can click on the links in the object information table to bring up something like the screenshot below.
Image('/Users/waqasbhatti/Downloads/screencapture-127-0-0-1-5225-2018-03-16-14_17_02.png')
Here, you can see the object itself depicted as a gold star in the CMD.
Moving on, the "Phased LCs" tab (see screenshot below) shows all the phased light curves using the three best periods obtained by all the period-finder methods we applied. Clicking on any one of these tiles will set the Period and Epoch textboxes to reflect the period and epoch used for that plot.
Image('/Users/waqasbhatti/Downloads/screencapture-127-0-0-1-5225-2018-03-16-14_14_48.png')
The "Cross Matches" tab shows the result of the cross-matching to external catalogs we did earlier. Each external catalog will be listed along with the columns we extracted from it.
Image('/Users/waqasbhatti/Downloads/screencapture-127-0-0-1-5225-2018-03-16-14_15_10.png')
Finally, the "Period Search" tab can be used to rerun a period-search on the object. Going through the interface elements:
Image('/Users/waqasbhatti/Downloads/screencapture-127-0-0-1-5225-2018-03-16-14_15_57.png')