Audio signal analysis with Python


PhD. Jose Ricardo Zapata

PyCON 2018 - The Art of Coding
Medellin - Colombia

Note: This is a Jupyter notebook slideshow. For a better view of the presentation, I highly recommend installing RISE (Reveal.js - Jupyter/IPython Slideshow Extension), which turns Jupyter notebooks into a live presentation, and then using the full-screen view of the web browser (press F11).

https://github.com/damianavila/RISE

About me (https://joserzapata.github.io/)

Professor and researcher at Universidad Pontificia Bolivariana (UPB)

  • PhD in Sound and Music Computing, Universitat Pompeu Fabra, Barcelona - Spain
  • MEng. Telecommunications and BSc in Electronic Engineering, UPB, Medellin - Colombia

Interests

  • Music Information Retrieval
  • Audio signal processing
  • Data science

What is Audio Analysis

Audio analysis refers to the extraction of information, description and meaning from audio signals for:

  • Signal Analysis -> Ex: Voice disease detection
  • Musical Analysis -> Ex: Extract Rhythm, Melody and Harmony
  • Classification -> Ex: Genre classification, mood recognition

  • Storage -> Ex: Music compression (mp3, mp4)
  • Retrieval -> Ex: Query by humming
  • Music Recommendation
  • Synthesis, etc.

Outline

  1. Python libraries for audio
  2. Reading, writing and visualizing audio signals in Python
  3. Frequency representation of audio signals
  4. Applications -> Music Information Retrieval

Audio signals in python

  • Reading and Playing audio files
  • Visualizing audio
  • Writing audio files

Reading Audio Files

In [1]:
# Example: reading an audio file with scipy.io
from scipy.io import wavfile
import IPython.display as ipd  # IPython helpers for Jupyter

AudioFileName = "Data/Whistle.wav"

# wavfile.read returns fs (sampling frequency in Hz) and the audio samples (int16 for this file)
fs, Audiodata = wavfile.read(AudioFileName)
print('AudioFile = {}, Sample Rate = {} [=] Samples/Sec, Wav format = {}'.format(AudioFileName, fs, Audiodata.dtype))
# Play the audio directly in the Jupyter notebook
ipd.Audio(AudioFileName)
AudioFile = Data/Whistle.wav, Sample Rate = 44100 [=] Samples/Sec, Wav format = int16
Out[1]:
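
The cell above depends on `Data/Whistle.wav` being available. As a minimal sketch for trying the same read step without that file, the block below writes a synthetic tone to a temporary WAV file and reads it back. The names `fs_demo`, `tone` and `demo_path`, the 440 Hz frequency and the one-second duration are assumptions made for illustration; they are not part of the original notebook.

```python
import os
import tempfile

import numpy as np
from scipy.io import wavfile

# Hypothetical test signal: 1 second of a 440 Hz sine sampled at 44100 Hz
fs_demo = 44100
t = np.arange(fs_demo) / fs_demo
tone = 0.5 * np.sin(2 * np.pi * 440 * t)

# Convert to int16 PCM, the same sample format reported for Whistle.wav
tone_int16 = (tone * 32767).astype(np.int16)

# Write the tone to a temporary file and read it back
demo_path = os.path.join(tempfile.mkdtemp(), "tone.wav")
wavfile.write(demo_path, fs_demo, tone_int16)
fs_read, data_read = wavfile.read(demo_path)

print("Sample rate = {} Hz, dtype = {}, samples = {}, duration = {:.2f} s".format(
    fs_read, data_read.dtype, len(data_read), len(data_read) / fs_read))
```

For a file with more than one channel, `wavfile.read` returns a 2-D array of shape (samples, channels), so it is worth checking `data_read.shape` before plotting.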

Visualizing Audio Signal

In [3]:
from __future__ import print_function, division  # __future__ imports go first in the cell
import matplotlib.pyplot as plt  # plotting library
%matplotlib inline
plt.style.use('ggplot')  # plot style
plt.rcParams['figure.figsize'] = (15, 5)  # set plot size
plt.plot(Audiodata)
plt.title('Audio Waveform with no proper axis values', size=16);
In [4]:
# Set proper values for the plot axes
import numpy as np
# Scale the amplitude values to [-1, 1); Audiodata.dtype is int16
AudiodataScaled = Audiodata / (2.**15)
# Set the x-axis values to milliseconds
timeValues = np.arange(0, len(AudiodataScaled), 1) / fs  # sample index -> seconds
timeValues = timeValues * 1000  # seconds -> milliseconds
plt.plot(timeValues, AudiodataScaled); plt.title('Audio Waveform', size=16)
plt.ylabel('Amplitude'); plt.xlabel('Time (ms)');
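
The scaling above hard-codes `2.**15`, which is only right for int16 data. The sketch below shows one possible way to derive the scale factor from the array's dtype and to build the millisecond time axis in a reusable form. The helper names `to_float` and `time_axis_ms` are hypothetical, and unsigned 8-bit WAV data is deliberately left unsupported here.

```python
import numpy as np

def to_float(samples):
    """Scale signed-integer PCM samples to floats in [-1, 1); pass floats through."""
    samples = np.asarray(samples)
    if np.issubdtype(samples.dtype, np.floating):
        return samples
    if np.issubdtype(samples.dtype, np.signedinteger):
        # int16 -> divide by 2**15, int32 -> divide by 2**31
        full_scale = float(np.iinfo(samples.dtype).max) + 1.0
        return samples / full_scale
    raise TypeError("unsupported sample type: {}".format(samples.dtype))

def time_axis_ms(num_samples, fs):
    """Return the time of each sample in milliseconds."""
    return np.arange(num_samples) / float(fs) * 1000.0

# Small worked example with four int16 samples
pcm = np.array([-32768, 0, 16384, 32767], dtype=np.int16)
scaled = to_float(pcm)
times = time_axis_ms(len(pcm), 1000)  # at 1000 Hz, samples are 1 ms apart
print(scaled, times)
```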

Writing Audio Files

In [5]:
# Making some modifications to the audio signal
LessGaindata = AudiodataScaled / 2.0  # halve the amplitude of the signal
# Convert back to int16 to save the audio file with 16 bits @ 44100 Hz sampling frequency
LessGaindata = LessGaindata * (2.**15)
LessGaindata = LessGaindata.astype(np.int16)
print(' Directory files before writing = ');
!ls ./Data # Jupyter Magic :)
# Write the audio signal to a file
wavfile.write('Data/LessGain.wav', fs, LessGaindata)
 Directory files before writing = 
Beastie.wav   History_background.wav  Journey_beats.wav  percussive.wav
disco.wav     History_foreground.wav  Journey.wav	 SMC_9.wav
harmonic.wav  History.wav	      LessGain.wav	 Whistle.wav
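
Multiplying by `2.**15` and casting to int16 is safe for the halved signal, but a sample equal to +1.0 would map to 32768, which does not fit in int16. The sketch below is one way to guard against that by clipping before the cast. The helper name `float_to_int16` is hypothetical and not part of the original notebook.

```python
import numpy as np

def float_to_int16(samples):
    """Convert floats in [-1, 1] to int16 PCM, clipping out-of-range values."""
    samples = np.asarray(samples, dtype=np.float64)
    scaled = np.round(samples * 32768.0)
    # Keep every value inside the int16 range before casting
    clipped = np.clip(scaled, -32768, 32767)
    return clipped.astype(np.int16)

# The last two values would overflow without clipping
demo = np.array([-1.0, -0.5, 0.0, 0.5, 1.0, 1.5])
converted = float_to_int16(demo)
print(converted)
```

Clipping distorts any sample that exceeds full scale, so it is a safety net rather than a substitute for keeping the signal within [-1, 1].
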
In [ ]:
## Note: you can also write the audio file as floating point, like this

# Making some modifications to the audio signal
LessGaindata = AudiodataScaled / 2.0  # halve the amplitude of the signal
# The int16 conversion is commented out to keep the data in floating point
# (AudiodataScaled is float64; use .astype(np.float32) if a 32-bit float WAV is preferred)
#LessGaindata = LessGaindata*(2.**15)
#LessGaindata = LessGaindata.astype(np.int16)
print(' Directory files before writing = ');
!ls ./Data # Jupyter Magic :)
# Write the audio signal to a file
wavfile.write('Data/LessGain.wav', fs, LessGaindata)

# The advantage: the amplitude values stored in the file are already in [-1, 1],
# so reading the file back gives values in [-1, 1] directly.
In [6]:
# Read the new audio file
fs, LessGain = wavfile.read('Data/LessGain.wav')
# Scale the amplitude values to [-1, 1); LessGain.dtype is int16
LessGainScaled = LessGain / (2.**15)
plt.plot(timeValues, LessGainScaled); plt.title('Audio Waveform Transformed', size=16)
plt.ylabel('Amplitude'); plt.xlabel('Time (ms)');

Frequency representation

  • Fast Fourier Transform (FFT)
  • Magnitude Spectrum plot (2D)
  • Spectrogram (3D)
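
To make the last item concrete: a spectrogram is "3D" in the sense that it gives power as a function of both frequency and time. The block below is a minimal sketch that uses `scipy.signal.spectrogram` on a synthetic signal. The signal (440 Hz followed by 880 Hz), the 8000 Hz sampling frequency and the 256-sample segment length are assumptions chosen for illustration.

```python
import numpy as np
from scipy import signal

fs_demo = 8000  # hypothetical sampling frequency in Hz
t = np.arange(fs_demo) / fs_demo  # one second of samples
# 440 Hz during the first half second, 880 Hz during the second half
tone = np.where(t < 0.5, np.sin(2 * np.pi * 440 * t), np.sin(2 * np.pi * 880 * t))

# f: frequency bins (Hz), seg_times: segment times (s), Sxx: power per bin and segment
f, seg_times, Sxx = signal.spectrogram(tone, fs=fs_demo, nperseg=256)

# Strongest frequency bin in the first and in the last segment
first_peak = f[np.argmax(Sxx[:, 0])]
last_peak = f[np.argmax(Sxx[:, -1])]
print("Sxx shape = {}, first peak = {} Hz, last peak = {} Hz".format(
    Sxx.shape, first_peak, last_peak))
```

With 256-sample segments the frequency resolution is 8000 / 256 = 31.25 Hz, so the detected peaks land on the bins nearest to 440 Hz and 880 Hz rather than exactly on them.
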
In [7]:
fs, Audiodata = wavfile.read(AudioFileName)  # read the audio file
Audiodata = Audiodata / (2.**15)  # scale the amplitude values to [-1, 1)

from scipy.fftpack import fft
n = len(Audiodata)
AudioFreq = fft(Audiodata)  # compute the Fourier transform
AudioFreq = AudioFreq[0:int(np.ceil((n+1)/2.0))]  # keep only the non-negative frequencies
# The FFT output is an array of complex numbers
MagFreq = np.abs(AudioFreq)  # absolute value gives the magnitude

# Scale by the number of points so that the magnitude values
# do not depend on the length of the signal or its sampling frequency
MagFreq = MagFreq / float(n)
# Square the magnitude to obtain the power
MagFreq = MagFreq**2
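
The power values computed above still need a frequency axis before they can be plotted. The sketch below repeats the same steps on a synthetic 440 Hz tone and checks that the peak lands where expected. Note that it swaps in `np.fft.rfft` and `np.fft.rfftfreq` from NumPy instead of `scipy.fftpack.fft`; `rfft` returns n//2 + 1 bins, the same number that the slice above keeps. The names `fs_demo`, `n_demo` and `tone` are assumptions made for illustration.

```python
import numpy as np

fs_demo = 8000   # hypothetical sampling frequency in Hz
n_demo = 8000    # one second of samples, so bins are 1 Hz apart
t = np.arange(n_demo) / fs_demo
tone = np.sin(2 * np.pi * 440 * t)

# Real-input FFT: returns only the non-negative frequencies
spectrum = np.fft.rfft(tone)
# Same normalization and squaring as in the cell above
power = (np.abs(spectrum) / n_demo) ** 2
# Frequency in Hz of each bin
freqs = np.fft.rfftfreq(n_demo, d=1.0 / fs_demo)

peak_hz = freqs[np.argmax(power)]
print("Peak at {:.1f} Hz".format(peak_hz))
```

For a real file, the same `freqs` array (built with the file's own length and sampling frequency) can serve as the x axis of the magnitude spectrum plot.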

Magnitude Spectrum Plot (2D)

In [8]:
from plotly import offline as py #library for interactive plots
import plotly.tools as tls
py.init_notebook_mode()