Astropy: Tables¶

Objectives

Create tables
Access data in tables
Combining tables
Aggregation
Masking
Reading/writing

Documentation¶

For more information about the features presented below, you can read the astropy.table docs.

Creating tables¶

In [ ]:

import numpy as np
from astropy.table import Table

In [ ]:

# Creating a table from scratch
t1 = Table()
t1['name'] = ['source 1', 'source 2', 'source 3']
t1['flux'] = [1.2, 2.2, 3.1]

In [ ]:

# Looking at the table
t1

In [ ]:

# Adding a column
t1['size'] = [1,5,4]
t1

In [ ]:

# Accessing a column
t1['size']

In [ ]:

# Converting to a Numpy array
np.array(t1['size'])

In [ ]:

# Accessing a cell
t1['size'][0]

In [ ]:

# Accessing a row
t1[0]

Units in tables¶

In [ ]:

# Set unit on column
t1['size'].unit = 'cm'
t1

Some unitful operations will then work:

In [ ]:

t1['size'].to('m')

However, you may run into unexpected behavior, so if you are planning on using table columns as Quantities, we recommend that you use the QTable class:

In [ ]:

type(t1['size'])

In [ ]:

from astropy.table import QTable
qt1 = QTable(t1)
type(qt1['size'])

Challenge

Make a table that contains three columns: spectral type, temperature, and radius, and incude 5 rows with fake data (or real data if you like, for example from here). Try including units on the columns that can have them.
Find the mean temperature and the maximum radius
Try and find out how to add and remove rows
Add a new column which gives the luminosity (using $L=4\pi R^2 \sigma T^4$)

In [ ]:

#1
from astropy import units as u
t = QTable()
t['spectral type'] = ['O5', 'B5', 'A5', 'F5', 'G5']
t['radius'] = [12, 3.9, 1.7, 1.3, 0.92] * u.R_sun
t['temperature'] = [45000, 15000, 8200, 6400, 5700] * u.K
t

In [ ]:

#2
print('Mean temperature:', np.mean(t['temperature']))
print('Maximum radius:', np.mean(t['radius']))

In [ ]:

#3
t.add_row({'spectral type': 'K5',
           'temperature': 4300 * u.K,
           'radius': 0.72 * u.R_sun})
t.remove_row(0)
t

In [ ]:

#4
from numpy import pi
from astropy.constants import sigma_sb
t['luminosity'] = (4 * pi * t['radius'] ** 2 * sigma_sb * t['temperature'] ** 4).to(u.L_sun)
t

Iterating over tables¶

It is possible to iterate over rows or over columns. To iterate over rows, simply iterate over the table itself:

In [ ]:

for row in t1:
    print(row)

Rows can act like dictionaries, so you can access specific columns from a row:

In [ ]:

for row in t1:
    print(row['name'])

Iterating over columns is also easy:

In [ ]:

for colname in t1.columns:
    column = t1[colname]
    print(column)

Accessing specific rows from a column object can also be done with the item notation:

In [ ]:

for colname in t1.columns:
    column = t1[colname]
    print(column[0])

Joining tables¶

In [ ]:

from astropy.table import join

In [ ]:

t2 = Table()
t2['name'] = ['source 1', 'source 3']
t2['flux2'] = [1,9]

In [ ]:

t3 = join(t1, t2, join_type='outer')
t3

In [ ]:

np.mean(t3['flux2'])

Masked tables¶

In [ ]:

t4 = Table(masked=True)
t4['name'] = ['source 1', 'source 2', 'source 3']
t4['flux'] = [1.2, 2.2, 3.1]

In [ ]:

t4['flux'].mask = [1,0,1]
t4

Slicing¶

Tables can be sliced like Numpy arrays:

In [ ]:

obs = Table.read("""name    obs_date    mag_b  mag_v
                    M31     2012-01-02  17.0   17.5
                    M31     2012-01-02  17.1   17.4
                    M101    2012-01-02  15.1   13.5
                    M82     2012-02-14  16.2   14.5
                    M31     2012-02-14  16.9   17.3
                    M82     2012-02-14  15.2   15.5
                    M101    2012-02-14  15.0   13.6
                    M82     2012-03-26  15.7   16.5
                    M101    2012-03-26  15.1   13.5
                    M101    2012-03-26  14.8   14.3
                    """, format='ascii')

In [ ]:

obs[1:4]

In [ ]:

obs[obs['mag_b'] > 16]

In [ ]:

obs['mag_b', 'mag_v']

Challenge

Starting from the obs table:

Make a new table that shows every other row, starting with the second row? (that is, the second, fourth, sixth, etc. rows).
Make a new table the only contains rows where name is M31

In [ ]:

#1
subset1 = obs[1::2]
subset1

In [ ]:

#2
subset2 = obs[obs['name'] == 'M31']
subset2

Grouping and Aggregation¶

In [ ]:

obs_by_name = obs.group_by('name')

In [ ]:

obs_by_name

In [ ]:

for group in obs_by_name.groups:
    print(group)
    print("")

In [ ]:

obs_by_name.groups.aggregate(np.mean)

Writing data¶

In [ ]:

obs.write('test.fits', overwrite=True)

In [ ]:

obs.write('test.vot', format='votable', overwrite=True)

Reading data¶

In [ ]:

t4 = Table.read('2mass.tbl', format='ascii.ipac')

In [ ]:

t4

Challenge

Using the t4 table above:

Make a plot that shows j_m-h_m on the x-axis, and h_m-k_m on the y-axis
Make a new table that contains the subset of rows where the j_snr, h_snr, and k_snr columns, which give the signal-to-noise-ratio in the J, H, and K band, are greater than 10, and try and show these points in red in the plot you just made.
Make a new table (based on the full table) that contains only the RA, Dec, and the j_m, h_m and k_m columns, then try and write out this catalog into a format that you can read into another software package. For example, try and write out the catalog into CSV format, then read it into a spreadsheet software package (e.g. Excel, Google Docs, Numbers, OpenOffice). You may run into an issue at this point - if so, take a look at https://github.com/astropy/astropy/issues/7357 to see how to fix it.

In [ ]:

#1
import matplotlib.pyplot as plt
plt.scatter(t4['j_m'] - t4['h_m'], t4['h_m'] - t4['k_m'], )

In [ ]:

#2
subset = t4[(t4['j_snr'] > 10) & (t4['h_snr'] > 10) & (t4['k_snr'] > 10)]
subset

In [ ]:

#2 (continued)
import matplotlib.pyplot as plt
plt.scatter(t4['j_m'] - t4['h_m'],
            t4['h_m'] - t4['k_m'],
            s=5, color='black')
plt.scatter(subset['j_m'] - subset['h_m'],
            subset['h_m'] - subset['k_m'],
            s=30, color='red', alpha=0.5)

In [ ]:

#3
simple = t4['ra', 'dec', 'j_m', 'h_m', 'k_m']
simple

In [ ]:

#3 (continued)
simple.write('2mass_subset.csv', format='ascii.csv', overwrite=True, comment='#')