This jupyter notebook is part of a collection of notebooks on various topics of Digital Signal Processing. Please direct questions and suggestions to Sascha.Spors@uni-rostock.de.
Speech signals have a non-uniform amplitude distribution which is often modeled by the Laplace distribution. Linear uniform quantization is not optimal for speech signals, since small signal amplitudes are more likely than large ones. This motivates a non-linear quantization scheme, where the signal is compressed by a non-linear characteristic before linear quantization and expanded afterwards. The combination of compression and expansion is known as companding.
The following example illustrates the A-law quantization scheme used in European telephone networks. In this scheme the signal is first companded by the non-linear A-law characteristic before a linear uniform quantizer is used. This results overall in a non-linear quantization characteristic. First some functions for A-law companding/expanding, quantization and evaluation are defined.
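For reference, the A-law characteristic implemented by the functions below can be written as follows, assuming a normalized input $|x| \leq 1$ and the parameter $A = 87.6$ used in the code. The compander maps $x$ to

$$
y = \operatorname{sgn}(x) \cdot
\begin{cases}
\dfrac{A \, |x|}{1 + \ln A} & \text{for } |x| < \frac{1}{A} \\[2ex]
\dfrac{1 + \ln(A \, |x|)}{1 + \ln A} & \text{for } \frac{1}{A} \leq |x| \leq 1
\end{cases}
$$

and the expander inverts this mapping

$$
x = \operatorname{sgn}(y) \cdot
\begin{cases}
\dfrac{|y| \, (1 + \ln A)}{A} & \text{for } |y| < \frac{1}{1 + \ln A} \\[2ex]
\dfrac{e^{\, |y| (1 + \ln A) - 1}}{A} & \text{for } \frac{1}{1 + \ln A} \leq |y| \leq 1
\end{cases}
$$

Both branches meet at $|x| = \frac{1}{A}$, where $|y| = \frac{1}{1 + \ln A}$, so the characteristic is continuous.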
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import soundfile as sf
def A_law_compander(x):
    A = 87.6
    y = np.zeros_like(x)
    idx = np.where(np.abs(x) < 1/A)
    y[idx] = A*np.abs(x[idx]) / (1 + np.log(A))
    idx = np.where(np.abs(x) >= 1/A)
    y[idx] = (1 + np.log(A*np.abs(x[idx]))) / (1 + np.log(A))
    return np.sign(x)*y
def A_law_expander(y):
    A = 87.6
    x = np.zeros_like(y)
    idx = np.where(np.abs(y) < 1/(1+np.log(A)))
    x[idx] = np.abs(y[idx])*(1+np.log(A)) / A
    idx = np.where(np.abs(y) >= 1/(1+np.log(A)))
    x[idx] = np.exp(np.abs(y[idx])*(1+np.log(A))-1)/A
    return np.sign(y)*x
def uniform_midtread_quantizer(x, w):
    # quantization step
    Q = 1/(2**(w-1))
    # limiter
    x = np.copy(x)
    idx = np.where(x <= -1)
    x[idx] = -1
    idx = np.where(x > 1 - Q)
    x[idx] = 1 - Q
    # linear uniform quantization
    xQ = Q * np.floor(x/Q + 1/2)
    return xQ
def evaluate_requantization(x, xQ):
    e = xQ - x
    # SNR
    SNR = 10*np.log10(np.var(x)/np.var(e))
    print('SNR: {:2.1f} dB'.format(SNR))
    # normalize error
    e = .2 * e / np.max(np.abs(e))
    return e
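As a sanity check of the definitions above, the following minimal sketch verifies that expansion undoes companding when no quantizer sits in between. It uses a compact `np.where` formulation rather than the index-based functions above. The names `a_law_compress` and `a_law_expand` are hypothetical and chosen only for this illustration.

```python
import numpy as np

A = 87.6  # A-law parameter, same value as in the functions above


def a_law_compress(x, A=A):
    """Compact A-law compressor (hypothetical helper for this sketch)."""
    ax = np.abs(x)
    c = 1 + np.log(A)
    small = A * ax / c
    # np.where evaluates both branches, so the log argument is kept >= 1
    large = (1 + np.log(A * np.maximum(ax, 1 / A))) / c
    return np.sign(x) * np.where(ax < 1 / A, small, large)


def a_law_expand(y, A=A):
    """Compact A-law expander (hypothetical helper for this sketch)."""
    ay = np.abs(y)
    c = 1 + np.log(A)
    small = ay * c / A
    large = np.exp(np.maximum(ay, 1 / c) * c - 1) / A
    return np.sign(y) * np.where(ay < 1 / c, small, large)


# round trip without quantization: the error should be at rounding level
x = np.linspace(-1, 1, 1001)
round_trip_error = np.max(np.abs(a_law_expand(a_law_compress(x)) - x))
print('max. round-trip error: {:.2e}'.format(round_trip_error))
```

Any deviation beyond floating-point rounding would indicate that the two characteristics are not exact inverses of each other.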
Let's first take a look at the non-linear characteristic of the A-law requantizer. The left plot shows the characteristic of A-law companding followed by linear quantization. The right plot shows the overall characteristic of companding, linear quantization and expansion.
x = np.linspace(-1, 1, 2**16)
y = A_law_compander(x)
yQ4 = uniform_midtread_quantizer(y, 4)
yQ8 = uniform_midtread_quantizer(y, 8)
xQ4 = A_law_expander(yQ4)
xQ8 = A_law_expander(yQ8)
plt.figure(figsize=(10, 4))
plt.subplot(121)
plt.plot(x, yQ4, label=r'$w=4$ bit')
plt.plot(x, yQ8, label=r'$w=8$ bit')
plt.title('Companding and linear quantization')
plt.xlabel(r'$x$')
plt.ylabel(r'$y_Q$')
plt.legend(loc=2)
plt.axis([-1.1, 1.1, -1.1, 1.1])
plt.grid()
plt.subplot(122)
plt.plot(x, xQ4, label=r'$w=4$ bit')
plt.plot(x, xQ8, label=r'$w=8$ bit')
plt.title('Overall')
plt.xlabel(r'$x$')
plt.ylabel(r'$x_Q$')
plt.legend(loc=2)
plt.axis([-1.1, 1.1, -1.1, 1.1])
plt.grid()
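The overall characteristic can also be read as a quantizer with non-uniform step sizes. The following sketch, under the same assumptions as above ($A = 87.6$, $w = 8$), maps the uniform quantization levels through the expander and inspects the spacing of the resulting levels. The helper `expand` is a hypothetical compact version of the expander used only here.

```python
import numpy as np

A = 87.6           # A-law parameter
w = 8              # wordlength of the uniform quantizer
Q = 1 / 2**(w - 1)  # quantization step in the companded domain
c = 1 + np.log(A)


def expand(y):
    """Compact A-law expander (hypothetical helper for this sketch)."""
    ay = np.abs(y)
    small = ay * c / A
    large = np.exp(np.maximum(ay, 1 / c) * c - 1) / A
    return np.sign(y) * np.where(ay < 1 / c, small, large)


# uniform levels -1, ..., 1-Q in the companded domain and their images
levels_y = Q * np.arange(-2**(w - 1), 2**(w - 1))
levels_x = expand(levels_y)
steps = np.diff(levels_x)

print('smallest step: Q / {:.2f}'.format(Q / steps.min()))
print('largest step:  {:.2f} Q'.format(steps.max() / Q))
```

Around zero the effective step is smaller than $Q$ by the slope $\frac{A}{1 + \ln A} \approx 16$ of the compander at the origin, which corresponds to roughly four additional bits of resolution for small amplitudes. Towards full scale the steps become larger than $Q$.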
Now the signal-to-noise ratio (SNR) is computed for a Laplace distributed signal for various RMS levels $\sigma_x / x_\mathrm{min}$. The results show that the non-linear quantization scheme provides an approximately constant SNR over a wide range of levels. For low levels, the SNR is additionally higher than for linear uniform quantization of a Laplace distributed signal with the same wordlength.
w = 8 # wordlength of the quantizer
A = np.logspace(-50/20, -10/20, num=500) # relative RMS levels
N = int(1e6) # number of samples
np.random.seed(1)
def compute_SNR(a):
    # compute input signal
    x = np.random.laplace(size=N, scale=a/np.sqrt(2))
    # quantize signal (use the wordlength w defined above)
    y = A_law_compander(x)
    yQ = uniform_midtread_quantizer(y, w)
    xQ = A_law_expander(yQ)
    e = xQ - x
    # compute SNR
    SNR = 10*np.log10((np.var(x)/np.var(e)))
    return SNR
# quantization step
Q = 1/(2**(w-1))
# compute SNR for given RMS levels
SNR = [compute_SNR(a) for a in A]
# plot results
plt.figure(figsize=(8,4))
plt.plot(20*np.log10(A), SNR)
plt.xlabel(r'RMS level $\sigma_x / x_\mathrm{min}$ in dB')
plt.ylabel('SNR in dB')
plt.grid()
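To put the A-law result into perspective, the following sketch compares it with linear uniform quantization at a single low level of $-40$ dB. It is a minimal, self-contained illustration: the helpers `compress`, `expand`, `quantize` and `snr_db` are hypothetical compact versions of the functions above, and the reference value $6.02 \, w + 4.77 + 20 \log_{10}(\sigma_x)$ dB is only an approximation that assumes a uniformly distributed quantization error and no clipping.

```python
import numpy as np

A = 87.6
c = 1 + np.log(A)


def compress(x):
    ax = np.abs(x)
    large = (1 + np.log(A * np.maximum(ax, 1 / A))) / c
    return np.sign(x) * np.where(ax < 1 / A, A * ax / c, large)


def expand(y):
    ay = np.abs(y)
    large = np.exp(np.maximum(ay, 1 / c) * c - 1) / A
    return np.sign(y) * np.where(ay < 1 / c, ay * c / A, large)


def quantize(x, w):
    # uniform mid-tread quantizer with limiter to [-1, 1-Q]
    Q = 1 / 2**(w - 1)
    return Q * np.floor(np.clip(x, -1, 1 - Q) / Q + 0.5)


def snr_db(x, xQ):
    return 10 * np.log10(np.var(x) / np.var(xQ - x))


rng = np.random.default_rng(1)
w = 8
sigma = 10**(-40 / 20)  # RMS level of -40 dB relative to full scale
x = rng.laplace(scale=sigma / np.sqrt(2), size=100000)

snr_linear = snr_db(x, quantize(x, w))
snr_alaw = snr_db(x, expand(quantize(compress(x), w)))
# approximation for linear quantization (uniform error, no clipping)
snr_model = 6.02 * w + 4.77 + 20 * np.log10(sigma)

print('linear: {:.1f} dB (model: {:.1f} dB)'.format(snr_linear, snr_model))
print('A-law:  {:.1f} dB'.format(snr_alaw))
```

At this level the linear quantizer uses only a few of its quantization levels, while the A-law scheme operates mostly on the fine steps around zero.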
Finally we requantize a speech sample with a linear and with the A-law quantization scheme. The speech signal was originally recorded with a wordlength of $w=16$ bits using linear uniform quantization. For the A-law scheme, companding is applied first, followed by a linear uniform quantizer with a wordlength of $w=8$ bits. This scheme is used in the backbone of many telephone networks. At a sampling rate of 8 kHz, 8 bits per sample result in a total bit-rate of 64 kbit/s. Listen to the samples! Note that the quantization error has been normalized for playback.
# load speech sample
x, fs = sf.read('../data/speech_8k.wav')
x = x/np.max(np.abs(x))
# linear quantization
xQ = uniform_midtread_quantizer(x, 8)
e = evaluate_requantization(x, xQ)
sf.write('speech_8k_8bit.wav', xQ, fs)
sf.write('speech_8k_8bit_error.wav', e, fs)
# A-law quantization
y = A_law_compander(x)
yQ = uniform_midtread_quantizer(y, 8)
xQ = A_law_expander(yQ)
e = evaluate_requantization(x, xQ)
sf.write('speech_Alaw_8k_8bit.wav', xQ, fs)
sf.write('speech_Alaw_8k_8bit_error.wav', e, fs)
SNR: 35.7 dB
SNR: 38.2 dB
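Original signal (audio: ../data/speech_8k.wav)
A-law requantization to $w=8$ bit
Signal (audio: speech_Alaw_8k_8bit.wav)
Error (audio: speech_Alaw_8k_8bit_error.wav)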
Copyright
This notebook is provided as Open Educational Resource. Feel free to use the notebook for your own purposes. The text is licensed under Creative Commons Attribution 4.0, the code of the IPython examples under the MIT license. Please attribute the work as follows: Sascha Spors, Digital Signal Processing - Lecture notes featuring computational examples, 2016-2018.