Installing "standalone" Sherpa

For installation instructions, please follow the Sherpa standalone documentation page. This notebook was written using the 4.8.2 release of Sherpa, and relies on having some form of Anaconda installed (either the full installation or Miniconda).

To make sure I don't overwrite any existing Python setup, I created a separate environment - which I called notebook482 - for the installation (this step is not needed, but makes experimenting a bit easier). I also decided to install a few useful packages (versions shown below), and note that (new to this release) I can use Python 3.5:

% conda create -n=notebook482 -c sherpa sherpa python=3.5 ipython-notebook astropy matplotlib scipy
Fetching package metadata .........
Solving package specifications: ..........

...
    astropy:          1.2.1-np111py35_0       
    ipython-notebook: 4.0.4-py35_0            
    matplotlib:       1.5.1-np111py35_0
    scipy:            0.18.1-np111py35_0
    sherpa:           4.8.2-py35_2      sherpa
...
#
# To activate this environment, use:
# > source activate notebook482
#
# To deactivate this environment, use:
# > source deactivate notebook482
#

Since I do what I'm told, I follow this with

% source activate notebook482
% git clone https://github.com/DougBurke/sherpa-standalone-notebooks
% cd sherpa-standalone-notebooks
% jupyter notebook

and a page containing the following is now visible in my web browser:

A list of notebooks is given, one of which is "simple sherpa fit"

Selecting the "simple sherpa fit" option creates a new page containing this (user-editable) notebook, and the initial page changes to show that a notebook is running:

The "simple sherpa fit" notebook is now marked as "running"

The Sherpa UI

As Sherpa is a set of routines, there are several ways to use it. The main ones are

1. the sherpa.astro.ui module, which provides data-management capabilities on top of the fitting routines;

2. a low-level API, for those times when the UI layer is not needed.

We start with a brief exploration of the UI layer, using it to replicate the simple one-dimensional polynomial fits from the Python for Astronomers "Introduction to NumPy and SciPy" guide. Although it is more involved than the original version, which uses the NumPy polyfit and polyval routines to fit a polynomial, it is not that much more work, and the advantage is that it is easy to change the model to fit a different, perhaps more-complicated, model.

We then follow with a run-through of the same fit, this time using the lower-level API.
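To give a flavour of what that second approach looks like, here is a minimal sketch of a least-squares polynomial fit built from the low-level classes (Data1D, Polynom1D, LeastSq, LevMar, and Fit); treat it as illustrative rather than a definitive recipe, since the proper run-through comes later:

# A minimal low-level sketch; the rest of this notebook uses the UI layer.
import numpy as np
from sherpa.data import Data1D
from sherpa.models.basic import Polynom1D
from sherpa.stats import LeastSq
from sherpa.optmethods import LevMar
from sherpa.fit import Fit

xs = np.linspace(0, 10, 21)
ys = 2.5 + 0.5 * xs + np.random.normal(scale=0.1, size=xs.size)

d = Data1D('example', xs, ys)     # wrap the arrays in a data object
poly = Polynom1D('poly')
poly.c1.thaw()                    # fit the slope as well as the constant term
f = Fit(d, poly, stat=LeastSq(), method=LevMar())
res = f.fit()
print(res.format())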

To start with - and as a check that everything is okay - I load the Sherpa ui module:

In [1]:
from sherpa.astro import ui
WARNING: failed to import sherpa.astro.xspec; XSPEC models will not be available

Since Sherpa has several optional parts, warning messages will be displayed when some of them are not available. This can indicate a problem with the configuration, but in this case the Sherpa package installed by conda simply does not include support for XSPEC models, which is not a concern for this notebook.
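If you want to check exactly what your installation does provide, the list_models routine returns the names of the available models; a quick, optional, check (noting that the XSPEC model names all start with "xs"):

# Optional check of the models provided by this installation; the exact
# list depends on how the Sherpa package was built.
mdls = ui.list_models()
print(len(mdls))
print([m for m in mdls if m.startswith('xs')])   # empty here, as XSPEC support is missing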

Note that in earlier versions of this notebook I used the following commands to reduce the amount of repeated information displayed by some Sherpa commands; this is no longer needed with recent IPython notebooks (I am not sure exactly when this changed):

import logging
import sys
logging.getLogger('sherpa').propagate = 0
sys.tracebacklimit = None

For this notebook I need to download and unpack the data from the Python for Astronomers site; the following code is taken from that page (if you have problems following along, please review that page to see if the instructions have changed since I last updated this notebook).

In [2]:
from astropy.extern.six.moves.urllib import request
import tarfile
url = 'http://python4astronomers.github.io/_downloads/core_examples.tar'
tarfile.open(fileobj=request.urlopen(url), mode='r|').extractall()
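As an aside, astropy.extern.six has been removed from more-recent AstroPy releases; if the import above fails for you, the Python 3 standard library provides the same functionality (an alternative, not what the original guide uses):

# Alternative download using only the Python 3 standard library
from urllib import request
import tarfile
url = 'http://python4astronomers.github.io/_downloads/core_examples.tar'
tarfile.open(fileobj=request.urlopen(url), mode='r|').extractall()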

The data is in a subdirectory, so I take advantage of the IPython support for basic shell commands and move there:

In [3]:
cd py4ast/core
/home/djburke/sherpa/sherpa-standalone-notebooks/py4ast/core

Now I can read in the data using the AstroPy FITS reader:

In [4]:
from astropy.io import fits
hdus = fits.open('3c120_stis.fits.gz')

Please see the original guide for information on what the following is doing:

In [5]:
primary = hdus[0].data  # Primary (NULL) header data unit
img = hdus[1].data      # Intensity data
err = hdus[2].data      # Error per pixel
dq = hdus[3].data       # Data quality per pixel

The following statement makes the matplotlib plots appear inline (and sets up the plt namespace):

In [6]:
%pylab inline
Populating the interactive namespace from numpy and matplotlib
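The %pylab magic is convenient but pulls a lot of names into the namespace; a more-explicit alternative (not used here, and note that some later cells rely on the extra NumPy names, such as median and append, that %pylab provides) would be:

%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt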
In [7]:
plt.imshow(img, origin = 'lower', vmin = -10, vmax = 65)
plt.colorbar()
Out[7]:
<matplotlib.colorbar.Colorbar at 0x7f56c28fe1d0>

It is this image that we are going to clean of cosmic rays (using SciPy) and to subtract a simple estimate of the background (using Sherpa).

In [8]:
profile = img.sum(axis=1)
plt.figure()
plt.plot(profile)
Out[8]:
[<matplotlib.lines.Line2D at 0x7f5694618390>]
In [9]:
spectrum = img.sum(axis=0)
plt.figure()
plt.plot(spectrum)
Out[9]:
[<matplotlib.lines.Line2D at 0x7f5694605860>]

Cosmic rays are removed using a simple median-filter approach:

In [10]:
import scipy.signal
img_sm = scipy.signal.medfilt(img, 5)
sigma = median(err)
bad = np.abs(img - img_sm) / sigma > 8.0
img_cr = img.copy()
img_cr[bad] = img_sm[bad]
img_cr[230:280,:] = img[230:280,:]  # Filter only for background
In [11]:
x = append(np.arange(10, 200), np.arange(300, 480))  # Background rows
y = img_cr[x, 10]         # Background rows of column 10 of cleaned image

I now "load" the data for the column into Sherpa, as dataset 1 (the ui layer of Sherpa handles data management, using identifiers to refer to a particular dataset). The load_arrays command is the only place here where we use the dataset identifier explicitly, since the default for most commands is either to use dataset 1 or to use all loaded datasets (depending on the command).

Since we do not have errors here, we switch to the least-squares statistic and then plot the data. Sherpa really wants to plot error bars, so it creates some for us and tells us to ignore them; I've never been happy with this behavior!

In [12]:
ui.load_arrays(1, x, y)
ui.set_stat('leastsq')
ui.plot_data()
WARNING: The displayed errorbars have been supplied with the data or calculated using chi2xspecvar; the errors are not used in fits with leastsq
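As an aside, the identifier argument becomes useful when several datasets are loaded at the same time; a hypothetical sketch (the 'col20' name is just for illustration and is not used elsewhere in this notebook):

# Hypothetical use of dataset identifiers: most ui commands accept an
# identifier as their first argument, and identifiers can be strings.
ui.load_arrays('col20', x, img_cr[x, 20])
ui.plot_data('col20')     # plot just that dataset
ui.delete_data('col20')   # and tidy up afterwards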

The original guide used a simple polynomial model to fit the data, so I will do the same. The syntax ui.<modelname>.<instance name> - e.g. ui.polynom1d.mdl - creates a variable called <instance name> (so mdl here) which is an instance of the model class (in this case a one-dimensional polynomial). It lets us "name" models, which can be useful in more-complicated scenarios, where the data has to be modelled as a combination of models. The ui.set_source call associates a model expression with a dataset (if no identifier is given, the default dataset is used).

In [13]:
ui.set_source(ui.polynom1d.mdl)
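As an example of the "combination of models" case mentioned above, several named components can be combined into a single model expression (a hypothetical aside - the cont and line instances are not used in the rest of this notebook):

# Hypothetical composite expression: a polynomial continuum plus a
# gaussian line (gauss1d is another of the standard Sherpa models).
composite = ui.polynom1d.cont + ui.gauss1d.line
print(composite)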

The model instance can be printed out, which shows its parameters:

In [14]:
print(mdl)
polynom1d.mdl
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   mdl.c0       thawed            1 -3.40282e+38  3.40282e+38           
   mdl.c1       frozen            0 -3.40282e+38  3.40282e+38           
   mdl.c2       frozen            0 -3.40282e+38  3.40282e+38           
   mdl.c3       frozen            0 -3.40282e+38  3.40282e+38           
   mdl.c4       frozen            0 -3.40282e+38  3.40282e+38           
   mdl.c5       frozen            0 -3.40282e+38  3.40282e+38           
   mdl.c6       frozen            0 -3.40282e+38  3.40282e+38           
   mdl.c7       frozen            0 -3.40282e+38  3.40282e+38           
   mdl.c8       frozen            0 -3.40282e+38  3.40282e+38           
   mdl.offset   frozen            0 -3.40282e+38  3.40282e+38           

Model documentation is now available

One addition in the 4.8.2 release is that the model documentation is now included, so help(<instance name>) should now provide information on what the parameters mean (as well as a lot of other information!).

In [15]:
help(mdl)
Help on Polynom1D in module sherpa.models.basic object:

class Polynom1D(sherpa.models.model.ArithmeticModel)
 |  One-dimensional polynomial function of order 8.
 |  
 |  The maximum order of the polynomial is 8. The default setting has
 |  all parameters frozen except for ``c0``, which means that the
 |  model acts as a constant.
 |  
 |  Attributes
 |  ----------
 |  c0
 |      The constant term.
 |  c1
 |      The amplitude of the (x-offset) term.
 |  c2
 |      The amplitude of the (x-offset)^2 term.
 |  c3
 |      The amplitude of the (x-offset)^3 term.
 |  c4
 |      The amplitude of the (x-offset)^4 term.
 |  c5
 |      The amplitude of the (x-offset)^5 term.
 |  c6
 |      The amplitude of the (x-offset)^6 term.
 |  c7
 |      The amplitude of the (x-offset)^7 term.
 |  c8
 |      The amplitude of the (x-offset)^8 term.
 |  offset
 |      There is a degeneracy between ``c0`` and ``offset``, so it
 |      is recommended that at least one of these remains frozen.
 |  
 |  
 |  See Also
 |  --------
 |  PowLaw1D, Polynom2D
 |  
 |  Notes
 |  -----
 |  The functional form of the model for points is::
 |  
 |      f(x) = sum_(i=0)^(i=8) c_i * (x - offset)^i
 |  
 |  and for an integrated grid it is the integral of this over
 |  the bin.
 |  
 |  Method resolution order:
 |      Polynom1D
 |      sherpa.models.model.ArithmeticModel
 |      sherpa.models.model.Model
 |      sherpa.utils.NoNewAttributesAfterInit
 |      builtins.object
 |  
 |  Methods defined here:
 |  
 |  __init__(self, name='polynom1d')
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  calc(cls, pars, xlo, *args, **kwargs)
 |  
 |  guess(self, dep, *args, **kwargs)
 |  
 |  ----------------------------------------------------------------------
 |  Methods inherited from sherpa.models.model.ArithmeticModel:
 |  
 |  __abs__ = func(self)
 |  
 |  __add__ = func(self, rhs)
 |  
 |  __div__ = func(self, rhs)
 |  
 |  __floordiv__ = func(self, rhs)
 |  
 |  __getitem__(self, filter)
 |  
 |  __mod__ = func(self, rhs)
 |  
 |  __mul__ = func(self, rhs)
 |  
 |  __neg__ = func(self)
 |  
 |  __pow__ = func(self, rhs)
 |  
 |  __radd__ = rfunc(self, lhs)
 |  
 |  __rdiv__ = rfunc(self, lhs)
 |  
 |  __rfloordiv__ = rfunc(self, lhs)
 |  
 |  __rmod__ = rfunc(self, lhs)
 |  
 |  __rmul__ = rfunc(self, lhs)
 |  
 |  __rpow__ = rfunc(self, lhs)
 |  
 |  __rsub__ = rfunc(self, lhs)
 |  
 |  __rtruediv__ = rfunc(self, lhs)
 |  
 |  __setstate__(self, state)
 |  
 |  __sub__ = func(self, rhs)
 |  
 |  __truediv__ = func(self, rhs)
 |  
 |  apply(self, outer, *otherargs, **otherkwargs)
 |  
 |  startup(self)
 |  
 |  teardown(self)
 |  
 |  ----------------------------------------------------------------------
 |  Methods inherited from sherpa.models.model.Model:
 |  
 |  __call__(self, *args, **kwargs)
 |      Call self as a function.
 |  
 |  __getattr__(self, name)
 |      # Make parameter access case insensitive
 |  
 |  __iter__(self)
 |      # This allows all models to be used in iteration contexts, whether or
 |      # not they're composite
 |  
 |  __repr__(self)
 |      Return repr(self).
 |  
 |  __setattr__(self, name, val)
 |      Implement setattr(self, name, value).
 |  
 |  __str__(self)
 |      Return str(self).
 |  
 |  get_center(self)
 |  
 |  reset(self)
 |  
 |  set_center(self, *args, **kwargs)
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from sherpa.models.model.Model:
 |  
 |  thawedparhardmaxes
 |  
 |  thawedparhardmins
 |  
 |  thawedparmaxes
 |  
 |  thawedparmins
 |  
 |  thawedpars
 |  
 |  ----------------------------------------------------------------------
 |  Methods inherited from sherpa.utils.NoNewAttributesAfterInit:
 |  
 |  __delattr__(self, name)
 |      Implement delattr(self, name).
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from sherpa.utils.NoNewAttributesAfterInit:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)

Note that the Type field above is thawed for only one parameter - c0 - which is not sufficient here, as we want to fit the slope and not just the overall normalization. We therefore use the thaw command to tell Sherpa that the c1 parameter should also be fit.

The guess command tries to set the model parameters to something sensible: in most cases this is limited to a change in normalization, but you can see that the range of the thawed parameters has also been changed here.

In [16]:
ui.thaw(mdl.c1)
ui.guess(mdl)
print(mdl)
polynom1d.mdl
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   mdl.c0       thawed      17.9139     -23.0992      58.9271           
   mdl.c1       thawed            0     -17.4896      17.4896           
   mdl.c2       frozen            0 -3.40282e+38  3.40282e+38           
   mdl.c3       frozen            0 -3.40282e+38  3.40282e+38           
   mdl.c4       frozen            0 -3.40282e+38  3.40282e+38           
   mdl.c5       frozen            0 -3.40282e+38  3.40282e+38           
   mdl.c6       frozen            0 -3.40282e+38  3.40282e+38           
   mdl.c7       frozen            0 -3.40282e+38  3.40282e+38           
   mdl.c8       frozen            0 -3.40282e+38  3.40282e+38           
   mdl.offset   frozen            0 -3.40282e+38  3.40282e+38           

The plot_fit command displays the data and the current model (even if no fit has been made):

In [17]:
ui.plot_fit()
WARNING: The displayed errorbars have been supplied with the data or calculated using chi2xspecvar; the errors are not used in fits with leastsq

In this case the fit is very quick:

In [18]:
ui.fit()
Dataset               = 1
Method                = levmar
Statistic             = leastsq
Initial fit statistic = 41778.4
Final fit statistic   = 36134.3 at function evaluation 6
Data points           = 370
Degrees of freedom    = 368
Change in statistic   = 5644.1
   mdl.c0         22.3585     
   mdl.c1         -0.00227968 
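As a quick cross-check against the NumPy approach used in the original guide (an aside; np.polyfit returns the coefficients highest-order first), the fitted values can be compared directly:

# Cross-check with the original guide's approach; the result should be
# close to (mdl.c1, mdl.c0) from the Sherpa fit above.
print(np.polyfit(x, y, 1))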

The plot_fit_resid command displays the data and best-fit in the top plot, and "data - best-fit" in the bottom plot.

In [19]:
ui.plot_fit_resid()
WARNING: The displayed errorbars have been supplied with the data or calculated using chi2xspecvar; the errors are not used in fits with leastsq
WARNING: The displayed errorbars have been supplied with the data or calculated using chi2xspecvar; the errors are not used in fits with leastsq

Note that the model parameters have been updated by the fit command:

In [20]:
print(mdl)
polynom1d.mdl
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   mdl.c0       thawed      22.3585     -23.0992      58.9271           
   mdl.c1       thawed  -0.00227968     -17.4896      17.4896           
   mdl.c2       frozen            0 -3.40282e+38  3.40282e+38           
   mdl.c3       frozen            0 -3.40282e+38  3.40282e+38           
   mdl.c4       frozen            0 -3.40282e+38  3.40282e+38           
   mdl.c5       frozen            0 -3.40282e+38  3.40282e+38           
   mdl.c6       frozen            0 -3.40282e+38  3.40282e+38           
   mdl.c7       frozen            0 -3.40282e+38  3.40282e+38           
   mdl.c8       frozen            0 -3.40282e+38  3.40282e+38           
   mdl.offset   frozen            0 -3.40282e+38  3.40282e+38           

Since we are going to loop through each column, performing a fit for each one, the screen output from the fit command would be overwhelming and not very useful, so we hide it:

In [21]:
import logging
logger = logging.getLogger('sherpa')
print(logger.level)
print(logger.level == logging.INFO)
logger.level = logging.WARN
20
True

The following is a copy of the code from the original guide, but using Sherpa for the fit and model evaluation. Note that you can evaluate a model component - such as mdl - on an arbitrary grid (in this case xrows).

The out0 and out1 lists are used to store the best-fit parameters for each column; a better approach would be to allocate a NumPy array and fill it in rather than use a Python list (see the sketch after the loop), but it doesn't make much difference in this case!

In [22]:
xrows = np.arange(img_cr.shape[0])          # Array from 0 .. N_rows-1
bkg = np.zeros_like(img_cr)                 # Empty image for background fits
out0 = []                                   # Store the best-fit values
out1 = []
for col in np.arange(img_cr.shape[1]):      # Iterate over columns
    ui.load_arrays(1, x, img_cr[x, col])
    # note: in this case use the previous best-fit value as
    # a starting point for the next fit; really should be
    # resetting the values to be safe, although in this case
    # we may expect the fit values to be similar
    ui.fit()
    out0.append(mdl.c0.val)                 # Use .val to get at the numeric value
    out1.append(mdl.c1.val)
    bkg[:, col] = mdl(xrows)                # Evaluate the model for all rows
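For completeness, here is the pre-allocated version mentioned above; it is a sketch of the same loop, just storing the results directly in NumPy arrays:

# Sketch of the pre-allocation approach: store the best-fit values in
# NumPy arrays rather than appending to Python lists.
ncols = img_cr.shape[1]
out0 = np.zeros(ncols)
out1 = np.zeros(ncols)
for col in np.arange(ncols):
    ui.load_arrays(1, x, img_cr[x, col])
    ui.fit()
    out0[col] = mdl.c0.val
    out1[col] = mdl.c1.val
    bkg[:, col] = mdl(xrows)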

Before doing anything else, let's restore the default Sherpa logging level, in case we use anything later on in this notebook:

In [23]:
logger.level = logging.INFO

Out of interest, let's see how the model parameter values vary by column and against each other:

In [24]:
plt.plot(out0)
plt.xlabel('Column')
plt.ylabel('c0')
Out[24]:
<matplotlib.text.Text at 0x7f5683b52550>
In [25]:
plt.plot(out1)
plt.xlabel('Column')
plt.ylabel('c1')
Out[25]:
<matplotlib.text.Text at 0x7f5683ad3278>
In [26]:
plt.scatter(out0, out1)
plt.xlabel('c0')
plt.ylabel('c1')
Out[26]:
<matplotlib.text.Text at 0x7f5683a35978>

As expected, the normalization values are correlated, whereas it's not so obvious for the slope.
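To put a number on the c0-c1 relation shown in the scatter plot, rather than just eyeballing it (a quick optional aside):

# Correlation coefficient between the fitted c0 and c1 values
print(np.corrcoef(out0, out1)[0, 1])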

How does the background-subtracted version compare to the original guide? Well, the following plots show they are similar, but I have not done a close comparison to see how much difference there is.

In [27]:
plt.imshow(bkg, origin = 'lower', vmin=0, vmax=20)
plt.colorbar()
Out[27]:
<matplotlib.colorbar.Colorbar at 0x7f5683920b38>
In [28]:
img_bkg = img_cr - bkg
plt.clf()
plt.imshow(img_bkg, origin = 'lower', vmin=0, vmax=60)
plt.colorbar()
Out[28]:
<matplotlib.colorbar.Colorbar at 0x7f56945fe2e8>