HSC Image Differencing

DiaSource counts per CCD, Visit and Night

Like https://github.com/lsst-dm/ap_pipe-notebooks/blob/tickets/DM-20187/false_positives/DiaSourceCensusCcdVisitNight.ipynb, this is an exploration of diaSources per visit/CCD/Visit but for the HSC RC2 dataset.

DiaSources can be counted a few different ways:

  • All DiaSource Rows
  • All DiaSource Rows with no pixel or measurement flags

We assume almost every diaSource is a false positive and therefore DiaSource count is a metric for image subtraction performace. Better perfomance corresponds to lower counts.

Image differencing in the stack

1) Data

We use /datasets/hsc/repo/rerun/private/yusra/RC2/ap_pipe/w_2019_26_bgSub HSC-I band, run for tract=9813, the COSMOS ultra deep field. PDR1 includes 33 "good" visits that we run regularly for our RC2 monthly continuous integration testing. They have the same pointing with small dithers.

Templates were the 11 best seeing images of the 33 visits: 19658, 19660, 19662, 19680, 19682, 19684, 19694, 19696, 30490, 30500, 30502

Visits differenced include: 1228, 1230, 1232, 1238, 1240, 1242, 1244, 1246, 1248, 19658, 19660, 19662, 19680, 19682, 19684, 19694, 19696, 19698, 19708, 19710, 19712, 30482, 30484, 30486, 30488, 30490, 30492, 30494, 30496, 30498, 30500, 30502, 30504

A complete description of the processing steps can be found at: https://github.com/lsst-dm/ap_pipe-notebooks/blob/master/false_positives/ap_pipe_HSC_RC2.md

Relevant configs:

  • default: config.differencer.doDecorrelation=True
  • default: config.differencer.doPreConvolve=False
  • change: config.differencer.subtract['al'].kernel.active.spatialBgOrder = 2
  • change: config.differencer.doAddCalexpBackground = False
In [1]:
templateVisits = [19658, 19660, 19662, 19680, 19682, 19684, 19694, 19696, 30490, 30500, 30502]
In [2]:
from IPython.display import Image
Image("https://lsst-web.ncsa.illinois.edu/~yusra/deblender-sprint/w_2019_18_4k/logs/9813.png", width=1000)
Out[2]:

HSC CCD Layout

Load and Transform Data

In [3]:
import os
import numpy as np
import scipy
import matplotlib.pyplot as plt
import pandas as pd
import sqlite3
import lsst.daf.persistence as dafPersist
import lsst.afw.display as afwDisplay
import lsst.geom
afwDisplay.setDefaultBackend('matplotlib')
import matplotlib.ticker as plticker
import lsst.afw.cameraGeom.testUtils as testUtils
import lsst.afw.cameraGeom as cameraGeom
from lsst.obs.hsc import HscMapper as mapper
In [4]:
# Most Recent Hits2015 rerun with default tasks and configs
REPO = '/datasets/hsc/repo/rerun/private/yusra/RC2/ap_pipe/w_2019_26_bgSub'

# Connect to the PPDB
connection = sqlite3.connect(f'{REPO}/association.db')
# Make a butler
butler = dafPersist.Butler(REPO)
In [5]:
# get columns
columns = pd.read_sql_query('select * from DiaSource limit 1', connection).columns
In [6]:
# get diaSources (ALL)
ds = pd.read_sql_query("""select ccdVisitId, ra, decl, psFlux, psFluxErr, midPointTai, 
x, y, ixxPSF, iyyPSF, ixyPSF, flags, snr, apFlux, apFluxErr, 
dipMeanFlux, dipMeanFluxErr, dipFluxDiff, 
dipFluxDiffErr, dipMeanFlux_dipFluxDiff_Cov, 
totFlux, totFluxErr, diffFlux, diffFluxErr 
from DiaSource where filterId=4""", connection)
In [7]:
columns
Out[7]:
Index(['diaSourceId', 'ccdVisitId', 'diaObjectId', 'ssObjectId',
       'parentDiaSourceId', 'prv_procOrder', 'ssObjectReassocTime',
       'midPointTai', 'ra', 'raErr',
       ...
       'ixxPSF', 'iyyPSF', 'ixyPSF', 'extendedness', 'spuriousness', 'flags',
       'pixelId', 'filterName', 'filterId', 'isDipole'],
      dtype='object', length=111)
In [8]:
# create some derived columns

ds['visit'] = ds.ccdVisitId.apply(lambda x: int(x/200.))
ds['ccd'] = ds.ccdVisitId.apply(lambda x: int(np.round((x/200. - int(x/200.))*200)))

SIGMA2FWHM = 2*np.sqrt(2*np.log(2))
PIXEL_SCALE = 0.168
traceRadius = np.sqrt(0.5) * np.sqrt(ds.ixxPSF + ds.iyyPSF)
ds['seeing'] = SIGMA2FWHM * traceRadius
In [9]:
# get Focal Plane Coordinates (is there a faster way to do this?)
camera = mapper(root=REPO).camera

def ccd2focalPlane(x, y, ccd):
    cc = int(ccd)
    det = camera[cc]
    point = det.transform(lsst.geom.Point2D(x, y), cameraGeom.PIXELS, cameraGeom.FOCAL_PLANE)
    return point[0], point[1]

vecCcd2FP =  np.vectorize(ccd2focalPlane)
focal_plane = vecCcd2FP(ds['x'].values.astype(float), ds['y'].values.astype(float), ds['ccd'].values.astype(int))
ds['xFP'] = focal_plane[0]
ds['yFP'] = focal_plane[1]
In [10]:
visits = np.unique(ds.visit)
goodCcds = np.unique(ds.ccd[(ds.xFP > -8000) & (ds.xFP < 8000) & (ds.yFP > -10000) & (ds.yFP < 10000)])
bbox = butler.get('calexp_bbox', visit = int(visits[0]), ccd = 49)
APPROX_CCD_AREA = PIXEL_SCALE*PIXEL_SCALE*bbox.getArea()/3600/3600
APPROX_N_CCD = 92
N_GOOD_CCD = len(goodCcds) # 54
APPROX_VISIT_AREA = APPROX_CCD_AREA * APPROX_N_CCD
GOOD_VISIT_AREA = APPROX_CCD_AREA * N_GOOD_CCD
In [11]:
# Get flags
from lsst.ap.association import UnpackPpdbFlags, MapDiaSourceConfig
config = MapDiaSourceConfig()
unpacker = UnpackPpdbFlags(config.flagMap, 'DiaSource')
flag_values = unpacker.unpack(ds['flags'], 'flags')
flag_names = list(flag_values.dtype.names)
flagDF = pd.DataFrame(flag_values)
flagDF.head()
/software/lsstsw/stack_20190330/stack/miniconda3-4.5.12-1172c30/Linux64/ap_association/17.0.1-6-ga2de75c+7/python/lsst/ap/association/mapApData.py:388: YAMLLoadWarning: calling yaml.load_all() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  table_list = list(yaml.load_all(yaml_stream))
Out[11]:
base_PixelFlags_flag base_PixelFlags_flag_offimage base_PixelFlags_flag_edge base_PixelFlags_flag_interpolated base_PixelFlags_flag_saturated base_PixelFlags_flag_cr base_PixelFlags_flag_bad base_PixelFlags_flag_suspect base_PixelFlags_flag_interpolatedCenter base_PixelFlags_flag_saturatedCenter ... slot_PsfFlux_flag_edge ip_diffim_forced_PsfFlux_flag ip_diffim_forced_PsfFlux_flag_noGoodPixels ip_diffim_forced_PsfFlux_flag_edge slot_Shape_flag slot_Shape_flag_unweightedBad slot_Shape_flag_unweighted slot_Shape_flag_shift slot_Shape_flag_maxIter slot_Shape_flag_psf
0 False False False True True True False True True True ... False False False False True True False False False False
1 False False True True True True False True True True ... False False False False True True False False False False
2 False False False True False True False False True False ... False False False False True True False False False False
3 False False False True False True False False True False ... False False False False True True False False False False
4 False False False True False True False False True False ... False False False False True True False False False False

5 rows × 29 columns

In [12]:
pixelFlagNames = ['base_PixelFlags_flag',
 'base_PixelFlags_flag_offimage',
 'base_PixelFlags_flag_edge',
 'base_PixelFlags_flag_interpolated',
 'base_PixelFlags_flag_saturated',
 'base_PixelFlags_flag_cr',
 'base_PixelFlags_flag_bad',
 'base_PixelFlags_flag_suspect',
 'base_PixelFlags_flag_interpolatedCenter',
 'base_PixelFlags_flag_saturatedCenter',
 'base_PixelFlags_flag_crCenter',
 'base_PixelFlags_flag_suspectCenter',]
centroidFlagNames = ['slot_Centroid_flag',
 'slot_Centroid_pos_flag',
 'slot_Centroid_neg_flag',]
fluxFlagNames = ['slot_ApFlux_flag',
 'slot_ApFlux_flag_apertureTruncated',
 'slot_PsfFlux_flag',
 'slot_PsfFlux_flag_noGoodPixels',
 'slot_PsfFlux_flag_edge',
 'ip_diffim_forced_PsfFlux_flag',
 'ip_diffim_forced_PsfFlux_flag_noGoodPixels',
 'ip_diffim_forced_PsfFlux_flag_edge',]
shapeFlagNames =  ['slot_Shape_flag',
 'slot_Shape_flag_unweightedBad',
 'slot_Shape_flag_unweighted',
 'slot_Shape_flag_shift',
 'slot_Shape_flag_maxIter',
 'slot_Shape_flag_psf']

badFlagNames = ['base_PixelFlags_flag_saturated', 
                'base_PixelFlags_flag_suspect',
                'base_PixelFlags_flag_bad',
                'slot_Shape_flag']
In [13]:
def plot2axes(x1, y1, x2, y2, xlabel, ylabel1, ylabel2):
    fig, ax1 = plt.subplots(figsize=(16,6))
    color = 'tab:red'
    ax1.set_xlabel(xlabel)
    ax1.set_ylabel(ylabel1, color=color)
    a = np.arange(len(x1))
    ax1.plot(a, y1, color=color)
    ax1.xaxis.set_ticks(a) #set the ticks to be a
    ax1.xaxis.set_ticklabels(x1)
    
    ax1.tick_params(axis='y', labelcolor=color)

    ax2 = ax1.twinx()  # instantiate a second axes that shares the same x-axis

    color = 'tab:blue'
    ax2.set_ylabel(ylabel2, color=color)  # we already handled the x-label with ax1
    ax2.plot(a, y2, color=color)
    ax2.tick_params(axis='y', labelcolor=color)
    for label in ax1.get_xticklabels():
        label.set_ha("right")
        label.set_rotation(90)

    fig.tight_layout()  # otherwise the right y-label is slightly clipped
    plt.show()

2) DiaSources per visit

A few different ways to look at false positives as a function of visits. Most of the variance in diaSources can be explained by the seeing. If the visit PSF is smaller than the template PSF, the template PSF is deconvolved, in which any inaccuracy in the PSF model is amplified. This causes the clear relationship below where false positives improve as the PSF size increases:

In [14]:
idx = flagDF[badFlagNames].sum(axis = 1) == 0 
idxGoodCcd = ds.ccd.isin(goodCcds)
idxHighSNR = idx & idx & (ds.dipMeanFlux/ds.dipMeanFluxErr > 5)
In [15]:
vgrp = ds.groupby('visit')
vgrpClean = ds[idx].groupby('visit')
vgrpCleanCenter = ds[idx & idxGoodCcd].groupby('visit')
vgrpHighSNR = ds[idxHighSNR].groupby('visit')

plot2axes(vgrp.visit.first().values, vgrp.ccd.count().values/APPROX_VISIT_AREA,
          vgrp.visit.first().values, vgrp.seeing.median().values,
          'visit','N diaSources All (per sq deg)',
          'median FWHM per ixx/iyy (pixels)')

plot2axes(vgrpClean.visit.first().values, vgrpClean.ccd.count().values/APPROX_VISIT_AREA,
          vgrpClean.visit.first().values, vgrpClean.seeing.median().values,
          'visit','N diaSources No SAT, BAD, SUSPECT, shape Flag (per sq deg)',
          'median FWHM per ixx/iyy (pixels)')

plot2axes(vgrpCleanCenter.visit.first().values, vgrpCleanCenter.ccd.count().values/GOOD_VISIT_AREA,
          vgrpCleanCenter.visit.first().values, vgrpCleanCenter.seeing.median().values/GOOD_VISIT_AREA,
          'visit','N diaSources No SAT, BAD, SUSPECT, shape Flag (per sq deg)',
          'median FWHM per ixx/iyy (pixels)')

plot2axes(vgrpHighSNR.visit.first().values, vgrpHighSNR.ccd.count().values/APPROX_VISIT_AREA,
          vgrpHighSNR.visit.first().values, vgrpHighSNR.seeing.median().values,
          'visit','N diaSources No SAT, BAD, SUSPECT, shape Flag AND SNR > 5(per sq deg)',
          'median FWHM per ixx/iyy (pixels)')
In [16]:
templateSeeingDist = ds.seeing[ds.visit.isin(templateVisits)]
plt.figure(figsize=(8,8))
plt.plot(vgrp.seeing.median().values, vgrp.ccd.count().values/APPROX_VISIT_AREA, '.', label="ALL")
plt.yscale('log')
plt.ylabel('diaSources ALL (per sq deg)')
plt.xlabel('Median FWHM (pixels)')
plt.tight_layout()
In [17]:
def n(v):
    return v*np.exp(-0.5*v*v)/(2**(5/2) * np.pi**(3/2))

def expected(low, high, seeing=3, nCcds=len(visits)*APPROX_N_CCD):
    """seeing must be supplied as sigma)
    """
    inBetween = n(low) - n(high)
    return inBetween * 2 * nCcds * bbox.getWidth()*bbox.getHeight() / (seeing**2)

seeingRange = np.linspace(np.min(vgrp.seeing.median().values/SIGMA2FWHM), 
                   np.max(vgrp.seeing.median().values/SIGMA2FWHM),
                   20)
expectedCounts = expected(5, 100, seeingRange)/ 33 / APPROX_VISIT_AREA
In [18]:
templateSeeingDist = ds.seeing[ds.visit.isin(templateVisits)]
plt.figure(figsize=(8,4))
ax = plt.subplot(111)
plt.yscale('log')
plt.ylabel('N diaSources (per sq deg)')
plt.xlabel('Median FWHM (pixels)')
plt.plot(vgrpClean.seeing.median().values, vgrpClean.ccd.count().values/APPROX_VISIT_AREA, '.', label="No SAT,BAD,SUSPECT Shape Flag")
plt.plot(vgrpHighSNR.seeing.median().values, vgrpHighSNR.ccd.count().values/APPROX_VISIT_AREA, '.', label="No SAT,BAD,SUSPECT Shape Flag SNR>5")
plt.vlines(templateSeeingDist.mean(), 20, 1000, linestyles='dashed', label="mean template", linewidth=1)
plt.vlines(templateSeeingDist.max(), 20, 1000, linestyles='solid', label="max template", linewidth=1)
box = ax.get_position()
ax.set_position([box.x0, box.y0, box.width * 0.9, box.height])
plt.plot(seeingRange*SIGMA2FWHM, expectedCounts, label="Expected Counts SNR > 5")
plt.legend(fontsize=8, loc='center left', bbox_to_anchor=(1, 0.5))
plt.tight_layout()

DiaSources per CCD

In [19]:
vgrp = ds.groupby('ccd')
vgrpClean = ds[idx].groupby('ccd')
vgrpHighSNR = ds[idxHighSNR].groupby('ccd')
NVisits = len(np.unique(ds['visit'].values))
In [20]:
# Get corners of the ccds. You can prob do this with camera geom too. 
cornerList = []
v = ds['visit'][0]
for ccd in range(0, 104):
    try:
        bbox = butler.get('calexp_bbox', visit=int(v), ccd=ccd)
    except:
        print(ccd)
        continue
    cornerList.append( [int(v), ccd] + [val for pair in bbox.getCorners() for val in ccd2focalPlane(pair[0], pair[1], ccd)])


corners = pd.DataFrame(cornerList)
cornerList
corners['width'] = corners[6] - corners[8]
corners['height'] = corners[7] - corners[5]
9
In [21]:
import matplotlib.patches as patches
import matplotlib as mpl
cmap = mpl.cm.magma_r

for grp, title in [(vgrp, "ALL"),
                   (vgrpClean, "No SAT, BAD, SUSPECT, shape Flags"),
                   (vgrpHighSNR, "No SAT, BAD, SUSPECT, shape Flags and SNR > 5")]:
    yy = grp.visit.count().values/NVisits/APPROX_CCD_AREA
    vmin, vmax = np.percentile(yy, (2, 98))
    norm = mpl.colors.Normalize(vmin=vmin, vmax=vmax)


    fig1 = plt.figure(figsize=(12,12))
    ax1 = fig1.add_subplot(111, aspect='equal')
    for index, row in corners.iterrows():
        try:
            avgFP = grp.get_group(int(row[1])).x.count()/NVisits/APPROX_CCD_AREA
        except KeyError:
            avgFP = 0
        #print(row[1], avgFP)
        ax1.add_patch(
        patches.Rectangle((row[7], row[6]), - row.height, -row.width, fill=True,
                          color=cmap(norm(avgFP))))
        ax1.text(row[7]-row.height/2, row[6]-row.width/2, '%d'%(row[1]), fontsize=14)
        plt.plot(row[7]-row.height/2, row[6]-row.width/2, ',')
    ax1.set_title('Mean DiaSource count (1/sq. deg.) \n %s' %(title))
    ax1.set_xlabel('X')
    ax1.set_ylabel('Y')
    ax1 = plt.gca()
    ax1.invert_yaxis()
   # ax1.invert_xaxis()
    sm = plt.cm.ScalarMappable(cmap=cmap, norm=norm)
    sm.set_array([])
    plt.colorbar(sm)
    plt.show()
In [22]:
import matplotlib.patches as patches

fig1 = plt.figure(figsize=(12,12))
ax1 = fig1.add_subplot(111, aspect='equal')
for index, row in corners.iterrows():
    #print(row[2], row[3], -row.width, -row.height, row[2]-row.width/2, row[3]-row.height/2, '%d'%(row[1]))
    ax1.add_patch(
    patches.Rectangle((row[7], row[6]), - row.height, -row.width, fill=False))
    ax1.text(row[7]-row.height/2, row[6]-row.width/2, '%d'%(row[1]), fontsize=14)
    plt.plot(row[7]-row.height/2, row[6]-row.width/2, ',')
ax1.hexbin(ds['yFP'], ds['xFP'], gridsize=(400,400), bins='log', cmap='magma_r')
ax1.set_title('DiaSource Density In Focal Plane Coordinates ALL')
ax1.set_xlabel('X')
ax1.set_ylabel('Y')
ax1 = plt.gca()
ax1.invert_yaxis()
plt.show()
    
for ind, title in [
                   (idx, "No SAT, BAD, SUSPECT, shape Flags"),
                   (idxHighSNR, "No SAT, BAD, SUSPECT, shape Flags and SNR > 5")]:

    fig1 = plt.figure(figsize=(12,12))
    ax1 = fig1.add_subplot(111, aspect='equal')
    for index, row in corners.iterrows():
        #print(row[2], row[3], -row.width, -row.height, row[2]-row.width/2, row[3]-row.height/2, '%d'%(row[1]))
        ax1.add_patch(
        patches.Rectangle((row[7], row[6]), - row.height, -row.width, fill=False))
        ax1.text(row[7]-row.height/2, row[6]-row.width/2, '%d'%(row[1]), fontsize=14)
        plt.plot(row[7]-row.height/2, row[6]-row.width/2, ',')
    ax1.hexbin(ds[idx]['yFP'], ds[idx]['xFP'], gridsize=(400,400), bins='log', cmap='magma_r')
    ax1.set_title('DiaSource Density In Focal Plane Coordinates \n %s' %(title))
    ax1.set_xlabel('X')
    ax1.set_ylabel('Y')
    ax1 = plt.gca()
    ax1.invert_yaxis()
    plt.show()