This jupyter notebook is part of a collection of notebooks on various topics of Digital Signal Processing. Please direct questions and suggestions to Sascha.Spors@uni-rostock.de.
The (auto-) power spectral density (PSD) is defined as the Fourier transformation of the auto-correlation function (ACF).
For a continuous-amplitude real-valued wide-sense stationary (WSS) random signal $x[k]$ the PSD is given as
\begin{equation} \Phi_{xx}(\mathrm{e}^{\,\mathrm{j}\,\Omega}) = \mathcal{F}_* \{ \varphi_{xx}[\kappa] \} \end{equation}where $\mathcal{F}_* \{ \cdot \}$ denotes the discrete-time Fourier transformation (DTFT) and $\varphi_{xx}[\kappa]$ the ACF of $x[k]$. Note, the DTFT is performed with respect to $\kappa$. The ACF of a random signal of finite length $N$ can be expressed by way of a linear convolution
\begin{equation} \varphi_{xx}[\kappa] = \frac{1}{N} \cdot x_N[k] * x_N[-k] \end{equation}Taking the DTFT of the left- and right-hand side results in
\begin{equation} \Phi_{xx}(\mathrm{e}^{\,\mathrm{j}\,\Omega}) = \frac{1}{N} \, X_N(\mathrm{e}^{\,\mathrm{j}\,\Omega})\, X_N(\mathrm{e}^{-\,\mathrm{j}\,\Omega}) = \frac{1}{N} \, | X_N(\mathrm{e}^{\,\mathrm{j}\,\Omega}) |^2 \end{equation}The last equality results from the definition of the magnitude and the symmetry of the DTFT for real-valued signals. The spectrum $X_N(\mathrm{e}^{\,\mathrm{j}\,\Omega})$ quantifies the amplitude density of the signal $x_N[k]$. It can be concluded from above result that the PSD quantifies the squared amplitude or power density of a random signal. This explains the term power spectral density.
The properties of the PSD can be deduced from the properties of the ACF and the DTFT as
From the link between the PSD $\Phi_{xx}(\mathrm{e}^{\,\mathrm{j}\,\Omega})$ and the spectrum $X_N(\mathrm{e}^{\,\mathrm{j}\,\Omega})$ derived above it can be concluded that the PSD real valued
$$\Phi_{xx}(\mathrm{e}^{\,\mathrm{j}\,\Omega}) \in \mathbb{R}$$
From the even symmetry $\varphi_{xx}[\kappa] = \varphi_{xx}[-\kappa]$ of the ACF it follows that
$$ \Phi_{xx}(\mathrm{e}^{\,\mathrm{j} \, \Omega}) = \Phi_{xx}(\mathrm{e}^{\,-\mathrm{j}\, \Omega}) $$
The PSD of an uncorrelated random signal is given as
$$ \Phi_{xx}(\mathrm{e}^{\,\mathrm{j} \, \Omega}) = \sigma_x^2 + \mu_x^2 \cdot {\bot \!\! \bot \!\! \bot}\left( \frac{\Omega}{2 \pi} \right) $$
which can be deduced from the ACF of an uncorrelated signal.
The quadratic mean of a random signal is given as
$$ E\{ x[k]^2 \} = \varphi_{xx}[0] = \frac{1}{2\pi} \int\limits_{-\pi}^{\pi} \Phi_{xx}(\mathrm{e}^{\,\mathrm{j}\, \Omega}) \,\mathrm{d} \Omega $$
The last relation can be found by expressing the ACF by the inverse DTFT.
In this example the PSD $\Phi_{xx}(\mathrm{e}^{\,\mathrm{j} \,\Omega})$ of a speech signal $x[k]$ is estimated by applying a discrete Fourier transformation (DFT) to its ACF. For a better interpretation of the PSD, the frequency axis $f = \frac{\Omega}{2 \pi} \cdot f_s$ has been chosen for illustration, where $f_s$ denotes the sampling frequency of the signal.
In Python the ACF is stored in a vector with indexes $0, 1, ..., 2N -1$ where the indexes correspond to the lags $\kappa = -N+1,-N+2,....,N-1$. When computing the discrete Fourier transform (DFT) of the ACF numerically by the fast Fourier transform (FFT) one has to take this shift into account. For instance, by multiplying the DFT $\Phi_{xx}[\mu]$ by $e^{j \mu \frac{2 \pi}{2N - 1} (N-1)}$ where $N$ denotes the length of the signal $N$.
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile
# read audio file
fs, x = wavfile.read('../data/speech_8k.wav')
x = np.asarray(x, dtype=float)
N = len(x)
# compute ACF
acf = 1/len(x) * np.correlate(x, x, mode='full')
# compute PSD
psd = np.fft.fft(acf)
psd = psd * np.exp(1j*np.arange(2*N-1)*2*np.pi*(N-1)/(2*N-1))
f = np.fft.fftfreq(2*N-1, d=1/fs)
# plot PSD
plt.figure(figsize = (10, 8))
plt.plot(f, np.real(psd))
plt.title('Estimated power spectral density')
plt.ylabel(r'$\hat{\Phi}_{xx}(e^{j \Omega})$')
plt.xlabel(r'$f$')
plt.axis([0, 2000, 0, 1.1*max(np.abs(psd))]);
plt.grid()
Exercise
Solution: It can be concluded from the shown PSD that the main power of a speech signal is contained in the frequency range below 500 Hz. The speech signal exhibits furthermore a harmonic structure with a dominant fundamental frequency and a number of harmonics.
The cross-power spectral density is defined as the Fourier transformation of the cross-correlation function (CCF).
For two continuous-amplitude real-valued wide-sense stationary (WSS) random signals $x[k]$ and $y[k]$ the cross-power spectral density is given as
\begin{equation} \Phi_{xy}(\mathrm{e}^{\,\mathrm{j} \, \Omega}) = \mathcal{F}_* \{ \varphi_{xy}[\kappa] \} \end{equation}where $\varphi_{xy}[\kappa]$ denotes the CCF of $x[k]$ and $y[k]$. Note, the DTFT is performed with respect to $\kappa$. The CCF of two random signals of finite lengths $N$ and $M$ can be expressed by way of a linear convolution
\begin{equation} \varphi_{xy}[\kappa] = \frac{1}{N} \cdot x_N[k] * y_M[-k] \end{equation}Taking the DTFT of the left- and right-hand side results in
\begin{equation} \Phi_{xy}(\mathrm{e}^{\,\mathrm{j}\,\Omega}) = \frac{1}{N} \, X_N(\mathrm{e}^{\,\mathrm{j}\,\Omega})\, Y_M(\mathrm{e}^{-\,\mathrm{j}\,\Omega}) \end{equation}The symmetries of $\Phi_{xy}(\mathrm{e}^{\,\mathrm{j}\, \Omega})$ can be derived from the symmetries of the CCF and the DTFT as
$$ \underbrace {\Phi_{xy}(\mathrm{e}^{,\mathrm{j}, \Omega}) = \Phi_{xy}^*(\mathrm{e}^{-,\mathrm{j}, \Omega})}{\varphi{xy}[\kappa] \in \mathbb{R}} =
\underbrace {\Phi_{yx}(\mathrm{e}^{,- \mathrm{j}, \Omega}) = \Phi_{yx}^*(\mathrm{e}^{,\mathrm{j}, \Omega})}{\varphi{yx}[-\kappa] \in \mathbb{R}}$$
from which can be concluded that $|\Phi_{xy}(\mathrm{e}^{\,\mathrm{j}\, \Omega})| = |\Phi_{yx}(\mathrm{e}^{\,\mathrm{j}\, \Omega})|$
The cross PSD of two uncorrelated random signals is given as
$$ \Phi_{xy}(\mathrm{e}^{\,\mathrm{j} \, \Omega}) = \mu_x^2 \mu_y^2 \cdot {\bot \!\! \bot \!\! \bot}\left( \frac{\Omega}{2 \pi} \right) $$
which can be deduced from the CCF of an uncorrelated signal.
The following example estimates and plots the cross PSD $\Phi_{xy}(\mathrm{e}^{\,\mathrm{j}\, \Omega})$ of two random signals $x_N[k]$ and $y_M[k]$ of finite lengths $N$ and $M = 2 N$. The estimated cross PSD is calculated from the valid part of the CCF $\varphi_{xy}[\kappa]$ by an DFT in order to exclude boundary effects.
N = 1024 # length of random signal x
# generate two random signals
np.random.seed(2)
x = 2 + np.random.normal(size=N)
y = 1 + np.random.normal(size=2*N)
# compute cross PSD via CCF
acf = 1/N * np.correlate(x, y, mode='valid')
psd = np.fft.fft(acf)
psd = psd * np.exp(1j*np.arange(N+1)*2*np.pi*(N-1)/(2*N-1))
# plot results
f = np.fft.fftfreq(len(psd), d=1/2)
plt.figure(figsize = (10, 4))
plt.stem(f, np.real(psd))
plt.title('Estimated cross power spectral density')
plt.ylabel(r'$\hat{\Phi}_{xy}(e^{j \Omega})$')
plt.xlabel(r'$\Omega/ \pi$')
plt.grid()
Exercise
Solution: The cross PSD $\Phi_{xy}(\mathrm{e}^{\,\mathrm{j} \, \Omega})$ is essential only non-zero for $\Omega=0$. It hence can be concluded that the two random signals are not mean-free and uncorrelated to each other.
Copyright
This notebook is provided as Open Educational Resource. Feel free to use the notebook for your own purposes. The text is licensed under Creative Commons Attribution 4.0, the code of the IPython examples under the MIT license. Please attribute the work as follows: Sascha Spors, Digital Signal Processing - Lecture notes featuring computational examples, 2016-2018.