This is the quickest way to load the contents of an audio file into a NumPy array:
import wavefile
fs, sig = wavefile.load('data/test_wav_pcm16.wav')
Let's check if this actually worked:
fs
44100
import matplotlib.pyplot as plt
plt.plot(sig);
Hmm, that doesn't look quite right ... let's transpose the signal:
plt.plot(sig.T);
Yes, that's it! The channels seem to be stored as rows of the array. Let's see how they are stored in memory:
sig.flags
  C_CONTIGUOUS : False
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False
OK, the array data is stored in Fortran order, which isn't typical for NumPy, but it's necessary if the channels are supposed to be along the rows (at least without re-ordering the data obtained from the underlying C library).
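To make the memory layout a bit more tangible, here is a minimal sketch that uses plain NumPy only (no audio file involved). The names `interleaved`, `channels` and `frames` are made up for illustration; the buffer merely mimics the interleaved samples a C library would typically hand over.

```python
import numpy as np

channels, frames = 2, 4

# Fake interleaved samples: frame 0 (ch 0, ch 1), frame 1 (ch 0, ch 1), ...
interleaved = np.arange(channels * frames, dtype='float32')

# View the same memory as a (channels, frames) array in Fortran order
sig = interleaved.reshape((channels, frames), order='F')

print(sig)  # each row holds the samples of one channel
print(sig.flags['C_CONTIGUOUS'], sig.flags['F_CONTIGUOUS'])

# No data was copied, both arrays refer to the same buffer
print(np.shares_memory(sig, interleaved))

# The transposed view has shape (frames, channels) and is C-contiguous
print(sig.T.shape, sig.T.flags['C_CONTIGUOUS'])
```

So transposing, as done for plotting above, only changes the view on the data; the samples themselves stay where they are.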
sig.dtype
dtype('float32')
By default, audio samples are stored with dtype='float32', which is common in audio applications.
It would be nice if we could actually choose which dtype we want to use, but that seems to be possible only when using WaveReader, which provides a few more advanced options.
from wavefile import WaveReader
r = WaveReader('data/test_wav_pcm16.wav')
r.channels, r.frames, r.samplerate
(7, 15, 44100)
r.format
65538
It's a bit hard to tell what that actually means ...
hex(r.format)
'0x10002'
The hardcore libsndfile users among you will know what that is:
from wavefile import Format
hex(Format.WAV), hex(Format.PCM_16), hex(Format.WAV | Format.PCM_16)
('0x10000', '0x2', '0x10002')
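Going the other way, a format code can be split into its parts with the bit masks that libsndfile defines in sndfile.h. The following is only a sketch in plain Python: the mask values are copied from that header, while the helper `split_format()` is a made-up name and not part of wavefile.

```python
# Mask values as defined in libsndfile's sndfile.h
SF_FORMAT_SUBMASK = 0x0000FFFF
SF_FORMAT_TYPEMASK = 0x0FFF0000
SF_FORMAT_ENDMASK = 0x30000000


def split_format(fmt):
    """Split a format code into (major type, subtype, endianness)."""
    return (fmt & SF_FORMAT_TYPEMASK,
            fmt & SF_FORMAT_SUBMASK,
            fmt & SF_FORMAT_ENDMASK)


major, subtype, endian = split_format(65538)
print(hex(major), hex(subtype), hex(endian))  # 0x10000 0x2 0x0
```

The first two values are exactly the Format.WAV and Format.PCM_16 codes shown above; an endianness of 0 means that the file's default byte order is used.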
Ironically, reading from a WaveReader isn't that straightforward because we first have to provide an appropriate array to read into.
But luckily there is a function buffer() available which prepares an array for us:
data = r.buffer(1) # let's read one frame only
r.read(data)
data
array([[0.9999695 ],
       [0.8571167 ],
       [0.7142639 ],
       [0.57141113],
       [0.42855835],
       [0.28570557],
       [0.14285278]], dtype=float32)
Again, the default dtype is 'float32'. Let's check if we can change that:
data = r.buffer(1, dtype='int16')
r.read(data)
data
array([[29522],
       [17511],
       [ 5208],
       [-4166],
       [-8756],
       [-8435],
       [-4681]], dtype=int16)
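The two outputs above don't show the same frame, presumably because each read() call continues where the previous one stopped. To relate the two representations anyway, here is a small sketch in plain Python. It assumes that 16-bit samples are mapped to floating point by dividing by 2**15, which is what libsndfile does by default as far as I know; the helper names are made up for illustration.

```python
INT16_SCALE = 2 ** 15  # assumed scale factor between int16 and float samples


def int16_to_float(sample):
    """Map an integer sample from [-32768, 32767] to [-1.0, 1.0)."""
    return sample / INT16_SCALE


def float_to_int16(value):
    """Map a float sample back to the int16 range, clipping if necessary."""
    scaled = round(value * INT16_SCALE)
    return max(-INT16_SCALE, min(INT16_SCALE - 1, scaled))


print(int16_to_float(32767))      # 0.999969482421875
print(float_to_int16(0.9999695))  # 32767
```

Under that assumption, the 0.9999695 in the first channel of the float32 output corresponds to the largest possible 16-bit value, 32767.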
Now let's be nice and close the file.
Of course, it would have been better if we had used the WaveReader in a with statement (which is in fact the recommended usage).
r.close()
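For completeness, this is roughly what the with statement variant could look like. It's only a sketch: the helper `read_frames()` is a made-up name, and it assumes that read() returns the number of frames actually read and that buffer() accepts the dtype argument as shown above.

```python
def read_frames(path, nframes, dtype='float32'):
    """Read up to nframes frames from the beginning of an audio file.

    Returns the sampling rate and an array with channels along the rows.
    """
    # Imported here so the function can be defined without wavefile installed
    from wavefile import WaveReader

    with WaveReader(path) as r:
        data = r.buffer(nframes, dtype=dtype)
        nread = r.read(data)  # assumed to return the number of frames read
        return r.samplerate, data[:, :nread]

# Example call (file path as used above):
# fs, first_frame = read_frames('data/test_wav_pcm16.wav', 1)
```

The file is closed automatically when the with block is left, even if an exception is raised while reading.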
More exotic file type settings should also work, but let's check to be sure.
fs, sig = wavefile.load('data/test_wavex_pcm16.wav')
plt.plot(sig.T);
fs, sig = wavefile.load('data/test_wav_pcm24.wav')
plt.plot(sig.T);
fs, sig = wavefile.load('data/test_wavex_pcm24.wav')
plt.plot(sig.T);
fs, sig = wavefile.load('data/test_wav_float32.wav')
plt.plot(sig.T);
fs, sig = wavefile.load('data/test_wavex_float32.wav')
plt.plot(sig.T);
As expected, everything works!
# TODO!