ThinkDSP¶

This notebook contains code examples from Chapter 6: Discrete Cosine Transform

License: Creative Commons Attribution 4.0 International

In [1]:

# Get thinkdsp.py

import os

if not os.path.exists('thinkdsp.py'):
    !wget https://github.com/AllenDowney/ThinkDSP/raw/master/code/thinkdsp.py

In [2]:

import numpy as np
PI2 = np.pi * 2

Synthesis¶

The simplest way to synthesize a mixture of sinusoids is to add up sinusoid signals and evaluate the sum.

In [3]:

from thinkdsp import CosSignal, SumSignal

def synthesize1(amps, fs, ts):
    components = [CosSignal(freq, amp)
                  for amp, freq in zip(amps, fs)]
    signal = SumSignal(*components)

    ys = signal.evaluate(ts)
    return ys

Here's an example that's a mixture of 4 components.

In [4]:

from thinkdsp import Wave

amps = np.array([0.6, 0.25, 0.1, 0.05])
fs = [100, 200, 300, 400]
framerate = 11025

ts = np.linspace(0, 1, framerate, endpoint=False)
ys = synthesize1(amps, fs, ts)
wave = Wave(ys, ts, framerate)
wave.apodize()
wave.make_audio()

Out[4]:

We can express the same process using matrix multiplication.

In [5]:

def synthesize2(amps, fs, ts):
    args = np.outer(ts, fs)
    M = np.cos(PI2 * args)
    ys = np.dot(M, amps)
    return ys

And it should sound the same.

In [6]:

ys = synthesize2(amps, fs, ts)
wave = Wave(ys, framerate)
wave.apodize()
wave.make_audio()

Out[6]:

And we can confirm that the differences are small.

In [7]:

ys1 = synthesize1(amps, fs, ts)
ys2 = synthesize2(amps, fs, ts)
np.max(np.abs(ys1 - ys2))

Out[7]:

1.2789769243681803e-13

Analysis¶

The simplest way to analyze a signal---that is, find the amplitude for each component---is to create the same matrix we used for synthesis and then solve the system of linear equations.

In [8]:

def analyze1(ys, fs, ts):
    args = np.outer(ts, fs)
    M = np.cos(PI2 * args)
    amps = np.linalg.solve(M, ys)
    return amps

Using the first 4 values from the wave array, we can recover the amplitudes.

In [9]:

n = len(fs)
amps2 = analyze1(ys[:n], fs, ts[:n])
amps2

Out[9]:

array([0.6 , 0.25, 0.1 , 0.05])

What we have so far is a simple version of a discrete cosine tranform (DCT), but it is not an efficient implementation because the matrix we get is not orthogonal.

In [10]:

# suppress scientific notation for small numbers
np.set_printoptions(precision=3, suppress=True)

In [12]:

def test1():
    N = 4.0
    time_unit = 0.001
    ts = np.arange(N) / N * time_unit
    max_freq = N / time_unit / 2
    fs = np.arange(N) / N * max_freq
    args = np.outer(ts, fs)
    M = np.cos(PI2 * args)
    return M

M = test1()
M

Out[12]:

array([[ 1.   ,  1.   ,  1.   ,  1.   ],
       [ 1.   ,  0.707,  0.   , -0.707],
       [ 1.   ,  0.   , -1.   , -0.   ],
       [ 1.   , -0.707, -0.   ,  0.707]])

To check whether a matrix is orthogonal, we can compute $M^T M$, which should be the identity matrix:

In [13]:

M.transpose().dot(M)

Out[13]:

array([[ 4.,  1., -0.,  1.],
       [ 1.,  2.,  1.,  0.],
       [-0.,  1.,  2.,  1.],
       [ 1.,  0.,  1.,  2.]])

But it's not, which means that this choice of M is not orthogonal.

Solving a linear system with a general matrix (that is, one that does not have nice properties like orthogonality) takes time proportional to $N^3$. With an orthogonal matrix, we can get that down to $N^2$. Here's how:

In [14]:

def test2():
    N = 4.0
    ts = (0.5 + np.arange(N)) / N
    fs = (0.5 + np.arange(N)) / 2
    args = np.outer(ts, fs)
    M = np.cos(PI2 * args)
    return M
    
M = test2()
M

Out[14]:

array([[ 0.981,  0.831,  0.556,  0.195],
       [ 0.831, -0.195, -0.981, -0.556],
       [ 0.556, -0.981,  0.195,  0.831],
       [ 0.195, -0.556,  0.831, -0.981]])

Now $M^T M$ is $2I$ (approximately), so M is orthogonal except for a factor of two.

In [15]:

M.transpose().dot(M)

Out[15]:

array([[ 2., -0.,  0.,  0.],
       [-0.,  2., -0., -0.],
       [ 0., -0.,  2., -0.],
       [ 0., -0., -0.,  2.]])

And that means we can solve the analysis problem using matrix multiplication.

In [16]:

def analyze2(ys, fs, ts):
    args = np.outer(ts, fs)
    M = np.cos(PI2 * args)
    amps = M.dot(ys) / 2
    return amps

It works:

In [17]:

n = len(fs)
amps2 = analyze1(ys[:n], fs, ts[:n])
amps2

Out[17]:

array([0.6 , 0.25, 0.1 , 0.05])

DCT¶

What we've implemented is DCT-IV, which is one of several versions of DCT using orthogonal matrices.

In [18]:

def dct_iv(ys):
    N = len(ys)
    ts = (0.5 + np.arange(N)) / N
    fs = (0.5 + np.arange(N)) / 2
    args = np.outer(ts, fs)
    M = np.cos(PI2 * args)
    amps = np.dot(M, ys) / 2
    return amps

We can check that it works:

In [19]:

amps = np.array([0.6, 0.25, 0.1, 0.05])
N = 4.0
ts = (0.5 + np.arange(N)) / N
fs = (0.5 + np.arange(N)) / 2
ys = synthesize2(amps, fs, ts)

amps2 = dct_iv(ys)
np.max(np.abs(amps - amps2))

Out[19]:

8.326672684688674e-17

DCT and inverse DCT are the same thing except for a factor of 2.

In [20]:

def inverse_dct_iv(amps):
    return dct_iv(amps) * 2

And it works:

In [21]:

amps = [0.6, 0.25, 0.1, 0.05]
ys = inverse_dct_iv(amps)
amps2 = dct_iv(ys)
np.max(np.abs(amps - amps2))

Out[21]:

8.326672684688674e-17

Dct¶

thinkdsp provides a Dct class that encapsulates the DCT in the same way the Spectrum class encapsulates the FFT.

In [22]:

from thinkdsp import TriangleSignal

signal = TriangleSignal(freq=400)
wave = signal.make_wave(duration=1.0, framerate=10000)
wave.make_audio()

Out[22]:

To make a Dct object, you can invoke make_dct on a Wave.

In [23]:

from thinkdsp import decorate

dct = wave.make_dct()
dct.plot()
decorate(xlabel='Frequency (Hz)', ylabel='DCT')

Dct provides make_wave, which performs the inverse DCT.

In [24]:

wave2 = dct.make_wave()

The result is very close to the wave we started with.

In [25]:

np.max(np.abs(wave.ys-wave2.ys))

Out[25]:

7.771561172376096e-16

Negating the signal changes the sign of the DCT.

In [26]:

signal = TriangleSignal(freq=400, offset=0)
wave = signal.make_wave(duration=1.0, framerate=10000)
wave.ys *= -1
wave.make_dct().plot()

Adding phase offset $\phi=\pi$ has the same effect.

In [27]:

signal = TriangleSignal(freq=400, offset=np.pi)
wave = signal.make_wave(duration=1.0, framerate=10000)
wave.make_dct().plot()

In [ ]: