from IPython.core.display import HTML
css_file = '../../styles/styles.css'
HTML(open(css_file, "r").read())
- Use SciPy for:
- Integration
- Interpolation
- Fourier analysis and period finding
- Special Functions
- Where to find help on scipy
So far we've seen how to read and write data to text files, how to plot data and how to do some basic analysis with NumPy. The third party library SciPy
provides a wide array of tools for the scientist.
Some of the topics that SciPy covers are:
As well as the links above, a good place to get started with SciPy topics is the tutorial, which has sections on each topic.
We will come back to many of these topics later in the course. Specifically there will be lessons that focus on fitting models to data and solving differential equations. In this practical, I want to briefly look at some of the functionality within SciPy to tackle some common tasks we might encounter as astronomers. The idea is to get an idea of the kinds of things that SciPy can do, rather than exhaustively explore a single area.
I hope that this lecture will not be two overwhelming - it's purpose is to impress on you that a small amount of Python code, powered by SciPy, can acheive a lot of complex tasks. It should also be useful as a reference to return to later in your course - try not to worry if there are one or two things you do not understand...
Suppose we want to evaluate
$$ \int_a^b f(x)\, dx. $$If $f(x)$ can be integrated analytically all is fine. What if there is no analytical solution? Or if we don't know the function $f(x)$ at all, but just have a series of measures of it's value? Numerical evaluation of an integral is called numerical quadrature, and scipy provides a series of functions for this - the quad
, dblquad
and tplquad
for single, double and triple integrals respectively.
from scipy.integrate import quad, dblquad, tplquad
Thes functions have a large number of optional arguments which alter the behaviour of the function (try quad?
for example).
Basic usage works like this:
# define a simple function for the integrand - in this case y = x
def f(x):
return x
x_lower = 2 # lower limit of integral
x_upper = 4 # upper limit
value, abserr = quad(f, x_lower, x_upper) # integrate numerically!
print ("The integral value is {}, with absolute error {}".format(value,abserr))
Thus, numerical integration of simple functions is pretty easy.
One optional argument is worth considering. The args
keyword argument allows us to handle integrand functions that take more than one argument. For example, the function below calculates $x$ raised to some, user-defined, power.
# integrand function with two arguments
def f(x, power):
return x**power
Let's say we want to use this to integrate $x^2$. We want the second argument to the function to be 2. We would integrate it using quad
like so:
x_lower = 3
x_upper = 6
#args is a list of the extra arguments to integrand function
value, abserr = quad(f, x_lower, x_upper, args=(2,) )
print ("The integral value is {}, with absolute error {}".format(value, abserr))
What if we don't know the underlying functional form, but just have a series of samples of the data? As an example, below we have some measurements of the radiation dose rate a spacecraft is receiving at different times. Suppose want to work out the total dose?
import numpy as np
# refer to session 2 if you don't understand the next 3 lines!
%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('bmh')
# t = time of measurements, in hours
# G = measured radiation at spacecraft, in Grays/hour
data = np.loadtxt('../../data/Session3/radiation_dose.txt')
t = data[:,0] # first column
G = data[:,1] # second column
fig,ax = plt.subplots()
ax.plot(t, G, 'ro') # plot with red circles
ax.plot(t, G, 'k-') # and a black line
ax.grid(True)
ax.set_xlabel('Time (hours)')
ax.set_ylabel('Radiation dose rate (Grays/hour)')
plt.show()
The total radiation received is the area under this curve. In the past you've probably found this by hand, by using the trapezium rule. Actually, it would not be too hard to write a function to integrate this, but why bother, when SciPy has this coded up for you, along with the (usually) more accurate Simpsons rule.
Make a crude estimate by-eye of the area under the curve. Now, look at the online documentation for SciPy's version of Simpson's rule. Use Simpson's rule to estimate the area under the curve, and thus the total radiation dose...
from scipy.integrate import simps
total_dose = 0.0
# YOUR CODE HERE
print ('Total dose = {:.1f} Grays'.format(total_dose))
$$ E_n(x) = \int_1^\infty \frac{e^{-xt}}{t^n}\, dt $$
quad
can accept infinite limits to the integral, by passingnp.inf
as the limit.Calculate the integral
For $n=1$ and $x=0.5$
Hint: I have provided the integrand function already - you'll need to use the
args
keyword to make sure $n$ and $x$ have the correct values
def integrand(t, n, x):
return np.exp(-x*t) / t**n
# YOUR CODE HERE
Interpolation is very easy in scipy. The interp1d
function takes arrays of x and y data and returns a function that can be called for an arbitrary value of x, and returns the interpolated y value. Consider our spacecraft above:
from scipy.interpolate import interp1d
# get a function that does linear interpolation
# arguments are 'x' and 'y' for the curve we wish to
# interpolate
linear_interpolation = interp1d(t,G)
# call it to find dose at t=120
val = linear_interpolation(120.0)
print("Dose on spacecraft at 120 hours = {} Grays/hr".format(val))
# the value to calculate at can be an array, or list of values
# the function will return the interpolated value at each position
#in the list
print(linear_interpolation([120,130]))
The interp1d
function takes a kind
argument that allows you to set the type of interpolation:
# make a fine grid of 1000 values
finely_spaced_t_values = np.linspace(t.min(), t.max(), 1000)
# get a function that does linear interpolation
linear_interpolation = interp1d(t,G)
# call it
G_lin_interp = linear_interpolation(finely_spaced_t_values)
cubic_interpolation = interp1d(t,G,kind='cubic')
G_cub_interp = cubic_interpolation(finely_spaced_t_values)
# plot
fig,ax = plt.subplots(figsize=(10,5))
ax.plot(t, G, 'ro') # plot with red circles
ax.plot(finely_spaced_t_values, G_lin_interp, 'k--', label='linear') # dashed line
ax.plot(finely_spaced_t_values, G_cub_interp, 'k:', label='cubic') # dotted line
# format plot
ax.grid(True)
ax.axis('tight')
ax.legend()
ax.set_xlabel('Time (hours)')
ax.set_ylabel('Radiation dose rate (Grays/hour)')
plt.show()
from scipy import fftpack
In astronomy, one important use of Fourier transforms is to find periodic signals in data, such as the period of a pulsating or rotating star. This works because the Fourier transform of an infinitely long sine wave is a delta function at the period of the wave. Let's look at an example. First, we'll load and plot the data:
time, flux = np.loadtxt('../../data/Session3/lightcurve.txt',unpack=True)
fig,ax = plt.subplots()
ax.plot(time,flux,lw=1) # thinner linewidth
ax.set_xlabel('Time (hours)')
ax.set_ylabel('Flux')
plt.show()
We can see that it's a periodically variable star, and the period looks to be around 5 hours. Let's calculate a fourier transform to find the period:
# calculate the fast fourier transform
# gives an array of fourier transform values at each frequency
F = fftpack.rfft(flux)
# these values can be positive or negative, we only care about the size
F = np.abs(F)
#calculate the corresponding frequencies
N = len(time) # number of times
dt = time[1]-time[0] # time step between times
freq = fftpack.rfftfreq(N,dt)
#plot the fourier transform
fig,ax = plt.subplots()
ax.plot(freq,F)
ax.set_xlabel('Frequency (1/hour)')
ax.set_ylabel('Power')
ax.set_xlim(0,5)
plt.show()
As expected, the Fourier transform has a sharp peak around a frequency $f = 0.2$ hour$^{-1}$, which corresponds to a period $P = 1/f = 5$ hours. However, note the small second peak - this is telling us there is a second period present in the data at $\sim3$ hours, which we would have missed looking at the data by-eye!
The Fourier transform only works on regularly spaced data. Normally in astronomy we don't have regularly sampled data - for example if we were measuring the brightness of a star every hour for a month there would be lots of gaps in the data when the Sun was up!
The Lomb-Scargle periodogram is a classic method for finding periodicity in irregularly-sampled data. It is used a lot in astronomy. The Astropy library has an implementation of the Lomb-Scargle periodogram. You can find a full tutorial on it's use here
We have only skimmed the surface of scipy in this lecture. Later we'll see examples of using scipy to solve differential equations, and fit models to data. Before we finish, I want to mention the special function module within scipy. The reason for doing this will become apparent when we reach the homework!
Special functions are mathematical functions which have established names because they are important to mathematics or physics. Many special functions are the solutions to differential equations or integrals.
To demonstrate the special functions I will use the Bessel functions. Bessel functions are the functions $y(x)$ which solve Bessel's equation
$$ x^2 \frac{d^2y}{dx^2} + x \frac{dy}{dx} + (x^2 - n^2) y = 0. $$The solutions to this equation are called Bessel functions of the first kind. The solution for a given $n$ is denoted $J_n(x)$. No simple expressions for them exist, but they crop up frequently in astronomy - from studies of planetary dynamics, to calculating the rotation curves of galaxies. Therefore it is useful to have computer algorithms for calculating their value at a given value of $x$. Naturally, scipy contains these algorithms!
We use them in scipy as follows:
from scipy.special import jn, jn_zeros
#
# The scipy.special module includes a large number of Bessel-functions
# We only use jn, which are the Bessel functions of the first kind
# We also include the function jn_zeros that gives the locations of
# the zeroes of the function jn
#
value = jn(0, 1.5)
print ("J_0(1.5) = {:.2f}".format( value ))
Let's use SciPy to plot the Bessel functions from $n=0...3$:
x = np.linspace(0,10,100)
fig,ax = plt.subplots()
for n in range(4):
ax.plot(x, jn(n, x), label=r'$J_{}(x)$'.format(n))
ax.legend()
ax.set_xlabel('x')
ax.set_ylabel('y')
plt.show()
And we can use the jn_zeros
function to find the values of $x$ for which the Bessel functions are 0:
# zeros of Bessel functions
n = 1 # 1st order Bessel Function
m = 3 # number of zeros to compute
print( jn_zeros(n,m) )
One of the places Bessel functions crop up in astronomy is diffraction from a circular aperture. As you recall, we have seen that the finite aperture causes diffraction. In turn this means that stars cannot be point sources in astronomical images. Instead, in the absence of blurring from seeing, the star produces a diffraction pattern, known as an Airy disc.
It is possible to show that, for an aperture of diameter $D$, the diffraction pattern produced at an angle $\theta$ has the form
$$ I(\theta) = I_0 \left( \frac{2J_1(x)}{x} \right)^2,$$where
$$ x = \frac{\pi D}{\lambda} \sin \theta.$$In the equations above, $\lambda$ is the wavelength of light, $I_0$ is the intensity when $\theta=0$ and $J_1$ is the Bessel function of the Bessel function of the first kind and order 1.
Complete the code below to write a function that calculates the diffraction pattern from a telescope. Make sure your function passes the tests below.
Hint1: read the documentation of the function with care, and think hard about units.
Hint2: use
numpy
functions forsin
etc, so your code will accept arrays for thetheta
argument.
def diffraction(D, w, theta):
"""Calculate the intensity of the diffraction pattern from a telescope
Arguments
---------
D : float
the diameter of the telescope, in metres
w : float
the wavelength of light, in metres
theta : float or np.ndarray
the angle from the centre of the star, IN ARCSECONDS
Returns
-------
float or np.ndarray
the intensity of the diffraction pattern, NORMALISED so the result=1 at theta=0
"""
# YOUR CODE HERE
from nose.tools import assert_equals, assert_almost_equal
assert_almost_equal(diffraction(0.2,550e-9,0.5), 0.09191596586)
Plot the diffraction pattern for a 50 cm diameter telescope observing at a wavelength of 550 nm. You should calculate plot the diffraction pattern between -2 and 2 arcseconds.
Hint 1: remember that NumPy can perform operations on whole arrays at once, so if you pass an array into the function above as the $\theta$ argument, you will get an array back, i.e you shouldn't need to use a
for
loop here.
Hint 2: the function you have written will probably return NaN (not a number) for $\theta=0$. If you find this is a problem, change the number of steps in your array of
theta
values so you do not have a $\theta=0$ entry.
Hint 3: if you plot the square root of the intensity, rather than the intensity itself, the fringes are more visible
# YOUR CODE HERE
Below you have two cells, a code cell and a markdown cell. In the code cell, use the
$$\theta = 1.22 \frac{\lambda}{D}.$$jn_zeros
function to find the value of $x$ where the diffraction pattern has it's first zero. In the markdown cell, show that this means that the first zero lies at an angleIn the markdown cell, briefly explain why this leads to the Rayleigh criterion for resolving two objects.
Hint: you might want to look over the bootcamp section on writing equations in markdown
# YOUR CODE HERE
YOUR ANSWER HERE
Of course, we cannot neglect seeing! One way to think about the combined effects of seeing and diffraction is using convolution. Before it reaches the atmosphere, the star's profile is a delta function $\delta(x)$, as the star is infinitely far away.
The stellar profile is blurred, first by the atmosphere and then by diffraction. This blurring can be represented as a convolution of the intrinsic delta function with the Gaussian seeing profile $S$, and then by the diffraction pattern $D$. In other words, the observed profile is given by:
$$ O = (\delta * S) * D = S * D = D * S$$
(the last step is true since convolution is commutative).
In other words, if we convolve our diffraction pattern with a Gaussian of the correct width, we can reproduce the stellar profile we expect under a combination of diffraction and seeing.
Your challenge here is to convolve the diffraction pattern that you made above with Gaussians that represent seeing of 0.15 arcseconds and 0.6 arcseconds. Plot the resulting profiles together on one plot, with a legend.
You will need to know that the standard deviation and FWHM of a Gaussian are related by ${\rm FWHM} = 2.35 \sigma$. You will also need to know how to perform convolutions in Python. This is pretty easy using the
astropy
library. This is installed on Sage Math Cloud. The documentation here will help you perform the convolution with a Gaussian.
Hint: the width of the Gaussian used for convolution is array elements. Ask SL or a classmate if you can't work out what the correct value to use for the
stddev
argument in the docs above.
# YOUR CODE HERE