# Source Separation with Sparsity¶


This numerical tour explore local Fourier analysis of sounds, and its application to source separation from stereo measurements.

In [2]:
using PyPlot
using NtToolBox
using WAV


## Sound Mixing¶

We load 3 sounds and simulate a stero recording by performing a linear blending of the sounds.

In [3]:
n = 1024*16
s = 3 #number of sounds
p = 2 #number of micros

x = zeros(n,3)


Normalize the energy of the signals.

In [4]:
x = x./repeat(std(x,1), outer=(n,1));


We mix the sound using a $2\mathrm{x}3$ transformation matrix. Here the direction are well-spaced, but you can try with more complicated mixing matrices.

Compute the mixing matrix

In [5]:
theta = Array(linspace(0, pi, s + 1)); theta = theta[1:3]
theta[1] = 0.2
M = vcat(cos(theta)', sin(theta)');


Compute the mixed sources.

In [6]:
y = x*M';


Display of the sounds and their mix.

In [7]:
figure(figsize = (10,10))

for i in 1:s
subplot(s, 1, i)
plot(x[:, i])
xlim(0,n)
title("Source #$i") end  Display of the micro output. In [8]: figure(figsize = (10,7)) for i in 1:p subplot(p, 1, i) plot(y[:, i]) xlim(0,n) title("Micro #$i")
end


## Local Fourier analysis of sound.¶

In order to perform the separation, one performs a local Fourier analysis of the sound. The hope is that the sources will be well-separated over the Fourier domain because the sources are sparse after a STFT.

First set up parameters for the STFT.

In [9]:
w = 128   #size of the window
q = Base.div(w,4);  #overlap of the window


Compute the STFT of the sources.

In [10]:
X = complex(zeros(w,4*w+1,s))
Y = complex(zeros(w,4*w+1,p))

for i in 1:s
X[:,:,i] = perform_stft(x[:,i],w,q,n)
figure(figsize = (15,10))
plot_spectrogram(X[:,:,i],"Source #\$i")
end


Exercise 1

Compute the STFT of the micros, and store them into a matrix |Y|.

In [11]:
#run -i nt_solutions/audio_2_separation/exo1
include("NtSolutions/audio_2_separation/exo1.jl")

In [12]:
## Insert your code here.


## Estimation of Mixing Direction by Clustering¶

Since the sources are quite sparse over the Fourier plane, the directions are well estimated by looking as the direction emerging from a point clouds of the transformed coefficients.

First we compute the position of the point cloud.

In [13]:
mf = size(Y)[1]
mt = size(Y)[2]
P = reshape(Y, (mt*mf,p))
P = vcat(real(P), imag(P));


Then we keep only the 5% points with largest energy.

Display some points in the original (spacial) domain.

Number of displayed points.

In [14]:
npts = 6000;


Display the original points.

In [15]:
sel = randperm(n)

sel = sel[1:npts]

figure(figsize = (7,5))
plot(y[sel,1], y[sel,2], ".", ms = 3)
xlim(-5,5)
ylim(-5,5)
title("Time domain");