import numpy as np
import soundfile as sf
import sounddevice as sd
from scipy.signal import fftconvolve
import tools
This exercise examines the degree of coherence $k$ between the right and the left ear using noise signals. You can find more information in the (non-public) lecture notes on slides 4-29 to 4-31.
fs = 44100
dur = 1 # seconds
ks = np.linspace(0, 1, 11)
stddev = 0.2
length = int(dur * fs)
np.random.seed(7)
for k in ks:
    R = [[1, k],
         [k, 1]]
    L, V = np.linalg.eig(R)
    L.shape = 1, -1  # turn L into a row vector
    W = V * np.sqrt(L)
    # create a new random signal in each iteration:
    stereo_noise = np.random.normal(scale=stddev, size=(length, 2))
    outsig = np.dot(stereo_noise, W.T)  # matrix product
    sf.write('outdata/noise_k{:.1f}.wav'.format(k), outsig, fs)
This creates several WAV files containing noise. The degree of coherence $k$ between both channels is part of the respective filename.
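As a quick sanity check (not part of the original exercise), the empirical correlation coefficient between the two generated channels should come out close to the chosen $k$. A minimal self-contained sketch for $k = 0.5$:

```python
import numpy as np

np.random.seed(7)
length = 44100
k = 0.5
R = np.array([[1, k],
              [k, 1]])
L, V = np.linalg.eig(R)
W = V * np.sqrt(L)  # scale each eigenvector by the root of its eigenvalue
noise = np.random.normal(scale=0.2, size=(length, 2))
outsig = noise @ W.T
# empirical correlation coefficient between the two channels:
k_emp = np.corrcoef(outsig.T)[0, 1]
```

Since the covariance of `outsig` is `stddev**2 * W @ W.T = stddev**2 * R`, `k_emp` should deviate from $k$ only by the statistical error of the finite signal length.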
Exercise: Listen to the different files and note how you perceive different degrees of coherence.
$k=1.0$ outdata/noise_k1.0.wav
$k=0.9$ outdata/noise_k0.9.wav
$k=0.8$ outdata/noise_k0.8.wav
$k=0.7$ outdata/noise_k0.7.wav
$k=0.6$ outdata/noise_k0.6.wav
$k=0.5$ outdata/noise_k0.5.wav
$k=0.4$ outdata/noise_k0.4.wav
$k=0.3$ outdata/noise_k0.3.wav
$k=0.2$ outdata/noise_k0.2.wav
$k=0.1$ outdata/noise_k0.1.wav
$k=0.0$ outdata/noise_k0.0.wav
Create audio examples based on slide 4-33.
To do that, convolve the given HRIRs (stored in the files data/hrir00.wav, data/hrir45.wav and data/hrir90.wav) on the one hand with an arbitrary speech signal (e.g. from the data/ directory) and on the other hand with a noise signal (use the function numpy.random.normal()).
Add speech and noise in different combinations of angles of incidence. Amplify or attenuate speech and noise so that the speech is just barely intelligible.
Listen to the different combinations. In which combination can you understand the speech better or worse?
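The mixing step could be sketched as follows. Note that the HRIRs and the "speech" signal below are hypothetical random stand-ins so that the cell runs on its own; in the actual exercise, load the HRIRs with sf.read('data/hrir00.wav') etc. and use a real speech signal from the data/ directory.

```python
import numpy as np
from scipy.signal import fftconvolve

fs = 44100

def binauralize(sig, hrir):
    """Convolve a mono signal with a two-channel HRIR (shape (N, 2))."""
    return np.column_stack([fftconvolve(sig, hrir[:, ch]) for ch in (0, 1)])

# Hypothetical stand-ins; replace with sf.read('data/hrir00.wav') etc.
hrir_speech = np.random.normal(scale=0.1, size=(128, 2))
hrir_noise = np.random.normal(scale=0.1, size=(128, 2))
speech = np.random.normal(size=fs)  # placeholder for a real speech signal
noise = np.random.normal(size=fs)

speech_gain_db = -6  # adjust until the speech is just barely intelligible
mix = 10**(speech_gain_db / 20) * binauralize(speech, hrir_speech) \
    + binauralize(noise, hrir_noise)
```

Varying which HRIR is used for the speech and which for the noise yields the different combinations of angles of incidence.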
Two important cues for spatial hearing are ITD and ILD.
Exercise: Click and learn:
Watch (and listen to!) the videos on this page: http://auditoryneuroscience.com/spatial-hearing/binaural-cues
Here is an interactive example: http://auditoryneuroscience.com/spatial-hearing/time-intensity-trading
Exercise (only if you happen to have access to Matlab®): Do the listening test described on this page: http://auditoryneuroscience.com/spatial-hearing/ITD-ILD-practical
Exercise: Try to calculate the ITD as a function of the angle of incidence using a very much simplified head model. Assume a spherical head with ear holes at exactly opposite points on the sphere. Make the head radius an adjustable parameter in your calculations; use 8.75 cm as the default value. Assume further that the sound source is sufficiently far away that the angle of incidence is the same over the whole sphere.
What is the maximum difference between the path lengths from the sound source to the two ear holes?
What time difference does that correspond to, assuming the speed of sound $c = 343 \text{ m/s}$?
Plot the ITD as a function of angle of incidence for a given head radius.
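One possible simplification (an assumption here; the lecture slides may use a different one, e.g. the straight-line difference $2r\sin\theta$) is the Woodworth-style model: the wave reaches the near ear directly and travels around the sphere to the far ear, giving a path difference of $r(\sin\theta + \theta)$. A sketch:

```python
import numpy as np
import matplotlib.pyplot as plt

c = 343.0  # speed of sound in m/s

def itd(angle_deg, r=0.0875):
    """ITD for a spherical head with diametrically opposite ear holes.

    Far-field Woodworth-style path difference r * (sin(theta) + theta):
    direct path to the near ear, path around the sphere to the far ear.
    """
    theta = np.radians(angle_deg)
    return r * (np.sin(theta) + theta) / c

angles = np.linspace(0, 90, 91)
plt.plot(angles, 1000 * itd(angles))
plt.xlabel('angle of incidence / degrees')
plt.ylabel('ITD / milliseconds')
plt.title('ITD, spherical head model (r = 8.75 cm)')
```

With this model, the maximum path difference (at 90°) is $r(1 + \pi/2) \approx 22.5$ cm for $r = 8.75$ cm, corresponding to an ITD of roughly 0.66 ms.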
Simulate the ear signals of the listening experiments on slides 4-26 and 4-27.
Exercise: Create a function that takes a mono signal and returns two signals. The left signal is the same as the input, but the right signal is delayed and scaled.
def delay_and_attenuate(x, delay, gain, fs=44100):
    """Delay and attenuate the second channel of a mono signal.

    Parameters
    ----------
    x : array_like
        Mono signal
    delay : float
        Delay of the second channel (in milliseconds)
    gain : float
        Gain of the second channel (in dB)
    fs : int, optional
        Sampling frequency in Hz

    Returns
    -------
    ndarray
        Two-channel signal, normalized to a maximum of 0.6
    """
    d = int(np.round(delay * fs * 0.001))  # delay in samples
    # append zeros so the rolled channel does not wrap back into the signal:
    x = np.tile(np.concatenate((x, np.zeros(np.abs(d))), axis=-1), (2, 1)).T
    x[:, 1] = 10**(gain / 20) * np.roll(x[:, 1], d, axis=0)
    return tools.normalize(x, maximum=0.6)
Exercise: Use two HRIR pairs to create the ear signals.
def virtual_stereo(x, **kwargs):
    """Simulate the ear signals of a two-loudspeaker stereo setup.

    The HRIR pair of the left loudspeaker is obtained from that of the
    right loudspeaker by swapping the two channels (symmetric setup).
    """
    # load HRIRs for the loudspeaker at +30 degrees:
    hright, _ = sf.read('data/hrir30.wav')
    hleft = np.roll(hright, 1, axis=1)  # swap channels for -30 degrees
    return np.column_stack([fftconvolve(x[:, 0], ir, **kwargs) for ir in hleft.T]) \
        + np.column_stack([fftconvolve(x[:, 1], ir, **kwargs) for ir in hright.T])
Create the ear signals of the 'Virtual Stereo System'. Try it with different types of signals in 'data/' ('castanets.wav', 'xmax.wav', 'singing.wav').
delay = 0
gain = 0
# mono_signal, _ = sf.read('...')
stereo_signal = delay_and_attenuate(mono_signal, delay, gain)
y = tools.normalize(virtual_stereo(stereo_signal), maximum=0.6)
sd.play(y)
sd.wait()
Vary the relative gain between the left and right channels. Try to localize the sound source. Do you hear only one source? Compare your observations with the results on Slide 4-26.
Apply a delay to the second (right) signal, in the range of 0 -- 50 ms. Compare your observation with the results shown in Slide 4-27.
If you had problems solving some of the exercises, don't despair! Have a look at the example solutions.
To the extent possible under law,
the person who associated CC0
with this work has waived all copyright and related or neighboring
rights to this work.