FMA: A Dataset For Music Analysis

Michaël Defferrard, Kirell Benzi, Pierre Vandergheynst, Xavier Bresson, EPFL LTS2.

Usage

  1. Go through the paper to understand what the data is about.
  2. Download some datasets from https://github.com/mdeff/fma.
  3. Uncompress the archives, e.g. with unzip fma_small.zip.
  4. Load and play with the data in this notebook.
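If no `unzip` command is at hand, step 3 can also be done from Python. The following is a minimal sketch using only the standard library; the helper name `extract_archive` and the destination folder are illustrative assumptions, not part of the FMA code.

```python
import zipfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract every member of a zip archive into dest_dir and return their names.

    Hypothetical helper for illustration; the FMA repository does not ship it.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(dest)
        return archive.namelist()
```

For example, `extract_archive('fma_metadata.zip', 'data')` would be expected to produce the `data/fma_metadata/` folder used in the cells of this notebook, assuming the archive contains a top-level `fma_metadata` directory.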
In [1]:
%matplotlib inline

import os

import IPython.display as ipd
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import sklearn as skl
import sklearn.utils, sklearn.preprocessing, sklearn.decomposition, sklearn.svm
import librosa
import librosa.display

import utils

plt.rcParams['figure.figsize'] = (17, 5)
In [2]:
# Directory where the MP3 files are stored, read from the AUDIO_DIR
# environment variable (None if the variable is not set).
AUDIO_DIR = os.environ.get('AUDIO_DIR')

# Load metadata and features.
tracks = utils.load('data/fma_metadata/tracks.csv')
genres = utils.load('data/fma_metadata/genres.csv')
features = utils.load('data/fma_metadata/features.csv')
echonest = utils.load('data/fma_metadata/echonest.csv')

np.testing.assert_array_equal(features.index, tracks.index)
assert echonest.index.isin(tracks.index).all()

tracks.shape, genres.shape, features.shape, echonest.shape
Out[2]:
((106574, 52), (163, 4), (106574, 518), (13129, 249))

1 Metadata

The metadata table, a CSV file in the fma_metadata.zip archive, is composed of many columns:

  1. The index is the track ID, taken from the website, which is also used as the name of the audio file.
  2. Per-track, per-album and per-artist metadata from the Free Music Archive website.
  3. Two columns to indicate the subset (small, medium, large) and the split (training, validation, test).
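The table uses a two-level column header, which is why cells such as `tracks['track']` or `tracks['set', 'subset']` work. The sketch below illustrates that layout on a toy table built from two rows shown in this notebook. It round-trips the table through plain `pandas` CSV functions and is not the implementation of `utils.load`, which may do additional parsing.

```python
import io

import pandas as pd

# Toy table with the same two-level column layout as tracks.csv.
columns = pd.MultiIndex.from_tuples([
    ('track', 'title'), ('track', 'genre_top'),
    ('set', 'split'), ('set', 'subset'),
])
toy = pd.DataFrame(
    [['Food', 'Hip-Hop', 'training', 'small'],
     ['Electric Ave', 'Hip-Hop', 'training', 'medium']],
    index=pd.Index([2, 3], name='track_id'),
    columns=columns,
)

# Write to CSV and read back, telling pandas that the header spans two rows.
buffer = io.StringIO()
toy.to_csv(buffer)
buffer.seek(0)
loaded = pd.read_csv(buffer, index_col=0, header=[0, 1])

# Selecting a top-level name returns the whole group of columns;
# selecting a (top, sub) pair returns a single column.
track_columns = loaded['track']
subset_column = loaded['set', 'subset']
```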
In [3]:
ipd.display(tracks['track'].head())
ipd.display(tracks['album'].head())
ipd.display(tracks['artist'].head())
ipd.display(tracks['set'].head())
bit_rate comments composer date_created date_recorded duration favorites genre_top genres genres_all information interest language_code license listens lyricist number publisher tags title
track_id
2 256000 0 NaN 2008-11-26 01:48:12 2008-11-26 168 2 Hip-Hop [21] [21] NaN 4656 en Attribution-NonCommercial-ShareAlike 3.0 Inter... 1293 NaN 3 NaN [] Food
3 256000 0 NaN 2008-11-26 01:48:14 2008-11-26 237 1 Hip-Hop [21] [21] NaN 1470 en Attribution-NonCommercial-ShareAlike 3.0 Inter... 514 NaN 4 NaN [] Electric Ave
5 256000 0 NaN 2008-11-26 01:48:20 2008-11-26 206 6 Hip-Hop [21] [21] NaN 1933 en Attribution-NonCommercial-ShareAlike 3.0 Inter... 1151 NaN 6 NaN [] This World
10 192000 0 Kurt Vile 2008-11-25 17:49:06 2008-11-26 161 178 Pop [10] [10] NaN 54881 en Attribution-NonCommercial-NoDerivatives (aka M... 50135 NaN 1 NaN [] Freeway
20 256000 0 NaN 2008-11-26 01:48:56 2008-01-01 311 0 NaN [76, 103] [17, 10, 76, 103] NaN 978 en Attribution-NonCommercial-NoDerivatives (aka M... 361 NaN 3 NaN [] Spiritual Level
comments date_created date_released engineer favorites id information listens producer tags title tracks type
track_id
2 0 2008-11-26 01:44:45 2009-01-05 NaN 4 1 <p></p> 6073 NaN [] AWOL - A Way Of Life 7 Album
3 0 2008-11-26 01:44:45 2009-01-05 NaN 4 1 <p></p> 6073 NaN [] AWOL - A Way Of Life 7 Album
5 0 2008-11-26 01:44:45 2009-01-05 NaN 4 1 <p></p> 6073 NaN [] AWOL - A Way Of Life 7 Album
10 0 2008-11-26 01:45:08 2008-02-06 NaN 4 6 NaN 47632 NaN [] Constant Hitmaker 2 Album
20 0 2008-11-26 01:45:05 2009-01-06 NaN 2 4 <p> "spiritual songs" from Nicky Cook</p> 2710 NaN [] Niris 13 Album
active_year_begin active_year_end associated_labels bio comments date_created favorites id latitude location longitude members name related_projects tags website wikipedia_page
track_id
2 2006-01-01 NaT NaN <p>A Way Of Life, A Collective of Hip-Hop from... 0 2008-11-26 01:42:32 9 1 40.058324 New Jersey -74.405661 Sajje Morocco,Brownbum,ZawidaGod,Custodian of ... AWOL The list of past projects is 2 long but every1... [awol] http://www.AzillionRecords.blogspot.com NaN
3 2006-01-01 NaT NaN <p>A Way Of Life, A Collective of Hip-Hop from... 0 2008-11-26 01:42:32 9 1 40.058324 New Jersey -74.405661 Sajje Morocco,Brownbum,ZawidaGod,Custodian of ... AWOL The list of past projects is 2 long but every1... [awol] http://www.AzillionRecords.blogspot.com NaN
5 2006-01-01 NaT NaN <p>A Way Of Life, A Collective of Hip-Hop from... 0 2008-11-26 01:42:32 9 1 40.058324 New Jersey -74.405661 Sajje Morocco,Brownbum,ZawidaGod,Custodian of ... AWOL The list of past projects is 2 long but every1... [awol] http://www.AzillionRecords.blogspot.com NaN
10 NaT NaT Mexican Summer, Richie Records, Woodsist, Skul... <p><span style="font-family:Verdana, Geneva, A... 3 2008-11-26 01:42:55 74 6 NaN NaN NaN Kurt Vile, the Violators Kurt Vile NaN [philly, kurt vile] http://kurtvile.com NaN
20 1990-01-01 2011-01-01 NaN <p>Songs written by: Nicky Cook</p>\n<p>VOCALS... 2 2008-11-26 01:42:52 10 4 51.895927 Colchester England 0.891874 Nicky Cook\n Nicky Cook NaN [instrumentals, experimental pop, post punk, e... NaN NaN
          split     subset
track_id
2         training  small
3         training  medium
5         training  small
10        training  small
20        training  large

1.1 Subsets

The small and medium subsets can be selected with the code below.

In [4]:
small = tracks[tracks['set', 'subset'] <= 'small']
small.shape
Out[4]:
(8000, 52)
In [5]:
medium = tracks[tracks['set', 'subset'] <= 'medium']
medium.shape
Out[5]:
(25000, 52)
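The comparison `<= 'medium'` only makes sense if the subset column is an ordered categorical (small < medium < large), so that each subset includes the smaller ones. That the loaded column has this type is an assumption inferred from the cell outputs above. The sketch below shows the mechanism on a toy column.

```python
import pandas as pd

# Toy subset column; the category order is what gives "<=" its meaning.
subset = pd.Series(
    pd.Categorical(
        ['small', 'medium', 'small', 'large', 'medium'],
        categories=['small', 'medium', 'large'],
        ordered=True,
    ),
    index=pd.Index([2, 3, 5, 20, 134], name='track_id'),
    name='subset',
)

is_small = subset <= 'small'    # only tracks tagged "small"
is_medium = subset <= 'medium'  # tracks tagged "small" or "medium"
```

With a plain string column the same comparison would fall back to alphabetical order, which would not produce nested subsets.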

2 Genres

The genre hierarchy is stored in genres.csv and distributed in fma_metadata.zip.

In [6]:
print('{} top-level genres'.format(len(genres['top_level'].unique())))
genres.loc[genres['top_level'].unique()].sort_values('#tracks', ascending=False)
16 top-level genres
Out[6]:
          #tracks  parent  title                top_level
genre_id
38          38154       0  Experimental                38
15          34413       0  Electronic                  15
12          32923       0  Rock                        12
1235        14938       0  Instrumental              1235
10          13845       0  Pop                         10
17          12706       0  Folk                        17
21           8389       0  Hip-Hop                     21
2            5271       0  International               2
4            4126       0  Jazz                         4
5            4106       0  Classical                    5
9            1987       0  Country                      9
20           1876       0  Spoken                      20
3            1752       0  Blues                        3
14           1499       0  Soul-RnB                    14
8             868       0  Old-Time / Historic          8
13            730       0  Easy Listening              13
In [7]:
genres.sort_values('#tracks').head(10)
Out[7]:
          #tracks  parent  title                     top_level
genre_id
175             0      86  Bollywood                         2
178             0       4  Be-Bop                            4
377             1      19  Deep Funk                        14
173             4      86  N. Indian Traditional             2
493             4     651  Western Swing                     9
374             9      20  Banter                           20
808            12      46  Salsa                             2
174            17      86  South Indian Traditional          2
465            18      20  Musical Theater                  20
176            23       2  Pacific                           2
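The `parent` column encodes the hierarchy: top-level genres have parent 0, and every other genre points to its parent's ID. As a minimal sketch, the `top_level` value of a genre can be recovered by following parents until a root is reached. The toy table below reuses five rows from the outputs above, and `find_top_level` is a hypothetical helper, not a function from the FMA code.

```python
import pandas as pd

# Five rows copied from the genre tables shown above.
genres_toy = pd.DataFrame(
    {
        'parent': [0, 0, 2, 20, 20],
        'title': ['International', 'Spoken', 'Pacific', 'Banter', 'Musical Theater'],
    },
    index=pd.Index([2, 20, 176, 374, 465], name='genre_id'),
)


def find_top_level(genre_id, table):
    """Follow the parent links until a genre whose parent is 0 (a root)."""
    current = genre_id
    while table.loc[current, 'parent'] != 0:
        current = table.loc[current, 'parent']
    return int(current)
```

Here `find_top_level(176, genres_toy)` returns 2 (International), matching the `top_level` value listed for Pacific.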

3 Features

  1. Features extracted from the audio for all tracks.
  2. For some tracks, data collected from the Echonest API.
In [8]:
print('{1} features for {0} tracks'.format(*features.shape))
columns = ['mfcc', 'chroma_cens', 'tonnetz', 'spectral_contrast']
columns.append(['spectral_centroid', 'spectral_bandwidth', 'spectral_rolloff'])
columns.append(['rmse', 'zcr'])
for column in columns:
    ipd.display(features[column].head().style.format('{:.2f}'))
518 features for 106574 tracks
statistics kurtosis max mean median min skew std
number 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20
track_id
2 3.86 1.54 0.00 0.33 0.12 -0.34 -0.26 0.15 0.41 -0.16 -0.03 0.43 -0.23 -0.30 -0.19 -0.05 -0.15 -0.00 0.08 0.00 28.66 215.54 52.42 103.29 54.60 85.16 37.84 58.17 30.03 39.13 27.74 37.24 34.15 33.54 30.84 28.61 32.68 22.62 27.04 21.43 -163.77 116.70 -41.75 29.14 -15.05 18.88 -8.92 12.00 -4.25 1.36 -2.68 -0.79 -6.92 -3.66 1.47 0.20 4.00 -2.11 0.12 -5.79 -143.59 124.86 -43.52 28.89 -13.50 19.18 -7.83 11.58 -3.64 1.09 -2.27 -0.95 -6.49 -3.21 1.11 0.03 3.86 -1.70 0.00 -5.59 -504.89 -0.00 -115.32 -51.57 -97.88 -41.52 -61.75 -39.68 -54.22 -42.56 -49.20 -35.53 -38.51 -33.41 -31.21 -32.89 -28.14 -32.31 -27.95 -34.33 -1.75 -1.19 0.32 0.04 -0.26 0.03 -0.27 0.07 -0.17 -0.10 -0.16 0.26 -0.10 0.03 0.02 -0.10 0.02 -0.17 0.07 -0.15 97.81 38.57 22.58 20.77 19.87 20.30 14.63 12.18 9.40 10.74 10.06 8.60 9.28 9.25 8.52 8.56 7.65 7.25 7.08 7.39
3 4.30 1.40 0.11 -0.21 0.03 -0.02 0.15 0.05 0.03 -0.06 0.51 0.37 0.21 0.10 0.48 0.27 0.11 0.09 0.31 0.06 29.38 207.70 76.74 137.25 53.94 105.26 55.66 59.43 36.57 40.32 26.74 49.43 25.32 20.49 37.79 27.83 32.36 29.40 30.73 31.93 -159.00 120.16 -33.23 47.34 -6.25 31.41 -5.26 11.62 -1.60 5.13 -3.42 6.95 -4.18 -3.53 0.27 -2.27 1.09 -2.34 0.47 -1.55 -140.04 128.24 -33.95 46.59 -6.29 33.55 -5.44 12.08 -1.31 5.82 -2.97 5.82 -4.06 -2.81 -0.63 -1.77 1.09 -2.19 0.63 -1.28 -546.27 -18.52 -85.01 -12.55 -87.05 -26.99 -61.85 -33.42 -47.01 -31.35 -46.10 -24.69 -35.11 -34.73 -34.03 -35.29 -27.16 -28.57 -29.91 -39.94 -1.78 -1.17 0.30 -0.03 0.05 -0.01 0.18 -0.11 0.08 -0.29 -0.27 0.49 -0.08 -0.32 0.58 -0.13 -0.22 0.09 -0.30 0.04 111.69 41.19 19.41 22.03 19.33 19.18 12.42 10.26 9.39 10.17 8.77 10.03 6.98 7.65 9.60 7.22 8.40 7.29 7.42 8.78
5 2.62 2.42 0.44 -0.78 -0.77 -0.72 0.09 0.15 0.26 -0.61 0.10 -0.25 0.16 0.64 0.19 0.29 -0.07 0.36 0.61 0.13 -40.50 218.97 50.37 112.31 51.51 66.55 29.27 57.99 48.33 43.88 27.20 32.23 37.52 35.24 45.28 33.54 31.72 30.78 36.51 24.97 -205.44 132.22 -16.09 41.51 -7.64 16.94 -5.65 9.57 0.50 8.67 -8.27 0.59 -0.34 2.38 7.90 1.95 7.44 -1.74 0.28 -5.49 -181.02 138.25 -14.51 43.08 -8.00 17.44 -3.93 9.79 0.27 8.91 -7.98 0.26 -0.11 2.25 7.51 1.87 7.34 -1.92 0.00 -5.60 -528.70 -62.28 -87.21 -24.32 -74.06 -30.45 -59.32 -33.44 -38.41 -26.08 -46.70 -35.25 -33.57 -34.80 -22.94 -33.81 -20.74 -29.76 -37.07 -30.42 -1.49 -1.26 -0.36 -0.11 0.02 0.04 -0.51 -0.02 -0.05 -0.09 -0.17 -0.03 0.11 -0.08 0.25 -0.04 0.06 0.30 0.21 0.12 95.05 39.37 18.87 24.32 23.16 17.16 13.05 10.91 9.78 11.81 8.86 9.58 8.90 8.14 8.20 7.78 7.13 7.54 8.45 7.33
10 5.08 1.16 2.10 1.37 -0.20 -0.35 -0.53 0.56 0.28 -0.15 -0.05 -0.19 0.02 0.11 0.18 -0.12 -0.03 -0.15 0.11 0.04 20.20 235.20 60.41 78.47 52.21 58.24 21.60 42.96 27.95 36.47 24.78 36.50 17.19 27.24 22.88 28.58 26.52 22.58 21.73 24.47 -135.86 157.04 -53.45 17.20 6.87 13.93 -11.75 8.36 -5.13 0.23 -5.42 1.68 -6.22 1.84 -4.10 0.78 -0.56 -1.02 -3.81 -0.68 -113.09 162.58 -59.37 18.40 5.17 15.09 -12.18 9.03 -4.42 0.40 -5.51 1.44 -5.89 1.79 -3.71 0.43 -0.35 -0.94 -3.77 -0.47 -537.59 0.00 -130.03 -50.58 -45.11 -45.51 -44.85 -43.02 -44.33 -40.17 -33.50 -27.47 -35.83 -27.61 -31.05 -29.61 -24.42 -27.49 -31.34 -25.62 -2.15 -0.95 1.33 -0.56 0.20 -0.33 0.03 -0.52 -0.32 -0.14 0.12 0.03 -0.17 -0.05 -0.26 0.08 0.00 0.05 -0.05 -0.08 102.74 44.41 29.07 14.01 15.49 16.69 10.71 11.68 9.72 11.37 8.29 7.99 7.08 6.97 7.07 7.27 7.05 6.93 6.43 6.19
20 11.88 4.09 0.00 1.52 0.18 0.34 0.37 0.07 -0.02 0.03 0.25 0.13 0.08 0.04 -0.05 0.06 0.32 0.35 0.69 0.43 -6.42 177.52 77.09 88.49 63.75 58.96 42.60 43.25 26.63 39.85 31.51 32.32 31.28 32.22 24.53 27.89 25.16 32.53 37.03 38.72 -135.14 114.81 12.35 19.76 18.67 19.64 3.57 12.12 -2.29 8.84 -0.81 4.08 0.21 3.88 -0.24 0.39 -0.57 2.78 2.43 3.03 -129.96 118.39 12.15 17.80 19.30 19.55 3.73 12.41 -1.77 8.90 -1.03 3.96 0.17 3.82 -0.12 0.65 -0.37 2.52 1.84 2.86 -484.60 -6.12 -53.16 -18.94 -37.71 -19.12 -33.69 -18.64 -35.90 -23.49 -31.45 -21.81 -25.53 -24.74 -27.98 -31.90 -29.10 -25.42 -22.94 -29.73 -2.40 -1.43 -0.04 0.92 -0.31 0.10 -0.09 -0.12 -0.24 -0.07 0.22 0.10 0.02 -0.01 -0.01 -0.15 -0.13 0.15 0.49 0.16 54.85 22.19 15.86 13.82 12.82 9.33 8.71 8.33 8.36 7.88 7.71 6.50 6.85 6.93 7.02 6.98 6.81 7.52 7.10 7.03
statistics kurtosis max mean median min skew std
number 01 02 03 04 05 06 07 08 09 10 11 12 01 02 03 04 05 06 07 08 09 10 11 12 01 02 03 04 05 06 07 08 09 10 11 12 01 02 03 04 05 06 07 08 09 10 11 12 01 02 03 04 05 06 07 08 09 10 11 12 01 02 03 04 05 06 07 08 09 10 11 12 01 02 03 04 05 06 07 08 09 10 11 12
track_id
2 7.18 5.23 0.25 1.35 1.48 0.53 1.48 2.69 0.87 1.34 1.35 1.24 0.69 0.57 0.60 0.63 0.57 0.44 0.49 0.50 0.57 0.58 0.62 0.59 0.47 0.37 0.24 0.23 0.22 0.22 0.23 0.25 0.20 0.18 0.20 0.32 0.48 0.39 0.25 0.24 0.23 0.23 0.23 0.25 0.20 0.17 0.20 0.31 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -2.00 -1.81 -0.35 -0.44 -0.57 -0.44 -0.15 -0.66 0.10 0.57 0.36 -0.44 0.11 0.09 0.09 0.08 0.07 0.08 0.08 0.07 0.08 0.09 0.09 0.10
3 1.89 0.76 0.35 2.30 1.65 0.07 1.37 1.05 0.11 0.62 1.04 1.29 0.68 0.58 0.58 0.58 0.45 0.46 0.54 0.66 0.51 0.53 0.60 0.55 0.23 0.23 0.23 0.22 0.22 0.24 0.37 0.42 0.31 0.24 0.26 0.23 0.23 0.23 0.21 0.20 0.23 0.26 0.39 0.44 0.31 0.24 0.26 0.23 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.64 0.49 0.79 1.27 -0.94 -0.89 -1.09 -0.86 -0.46 0.39 0.44 0.37 0.08 0.08 0.10 0.09 0.08 0.10 0.08 0.10 0.09 0.08 0.08 0.07
5 0.53 -0.08 -0.28 0.69 1.94 0.88 -0.92 -0.93 0.67 1.04 0.27 1.13 0.61 0.65 0.49 0.45 0.47 0.45 0.50 0.56 0.67 0.61 0.55 0.60 0.26 0.30 0.25 0.22 0.25 0.24 0.28 0.29 0.35 0.29 0.25 0.22 0.26 0.29 0.25 0.22 0.25 0.24 0.28 0.29 0.35 0.28 0.24 0.21 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -0.26 0.05 -0.04 -0.17 -0.72 -0.58 -0.27 -0.07 -0.25 0.42 0.11 0.55 0.09 0.13 0.10 0.07 0.07 0.08 0.13 0.13 0.11 0.10 0.09 0.09
10 3.70 -0.29 2.20 -0.23 1.37 1.00 1.77 1.60 0.52 1.98 4.33 1.30 0.46 0.54 0.45 0.65 0.59 0.51 0.65 0.52 0.51 0.48 0.64 0.64 0.23 0.29 0.24 0.23 0.19 0.29 0.41 0.35 0.27 0.24 0.27 0.24 0.23 0.28 0.23 0.23 0.21 0.28 0.44 0.36 0.27 0.24 0.25 0.23 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 -0.29 0.00 0.36 0.21 -0.09 -0.06 -0.79 -0.89 -0.52 -0.49 1.08 0.55 0.05 0.10 0.06 0.12 0.08 0.07 0.08 0.07 0.10 0.08 0.07 0.09
20 -0.19 -0.20 0.20 0.26 0.78 0.08 -0.29 -0.82 0.04 -0.80 -0.99 -0.43 0.65 0.68 0.67 0.60 0.65 0.70 0.66 0.69 0.64 0.67 0.69 0.68 0.20 0.25 0.26 0.19 0.18 0.24 0.28 0.29 0.25 0.29 0.30 0.24 0.20 0.22 0.23 0.19 0.16 0.24 0.26 0.27 0.24 0.27 0.29 0.24 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.46 0.62 0.78 0.56 0.95 0.36 0.28 0.23 0.30 0.17 0.14 0.08 0.13 0.16 0.15 0.11 0.13 0.13 0.15 0.17 0.11 0.16 0.18 0.14
statistics kurtosis max mean median min skew std
number 01 02 03 04 05 06 01 02 03 04 05 06 01 02 03 04 05 06 01 02 03 04 05 06 01 02 03 04 05 06 01 02 03 04 05 06 01 02 03 04 05 06
track_id
2 2.30 0.98 1.03 1.67 0.83 8.46 0.10 0.16 0.21 0.32 0.06 0.07 -0.00 0.02 0.01 0.07 0.01 0.02 -0.00 0.02 0.01 0.07 0.01 0.02 -0.06 -0.09 -0.19 -0.14 -0.05 -0.09 0.75 0.26 0.20 0.59 -0.18 -1.42 0.02 0.03 0.04 0.05 0.01 0.01
3 2.00 1.38 0.87 2.02 0.43 0.48 0.18 0.11 0.27 0.21 0.07 0.07 0.00 0.01 0.05 -0.03 0.02 -0.00 -0.00 0.01 0.05 -0.02 0.02 -0.00 -0.10 -0.08 -0.16 -0.30 -0.02 -0.06 0.27 -0.13 0.17 -0.99 0.57 0.56 0.03 0.02 0.05 0.06 0.01 0.02
5 10.77 0.92 -0.19 0.53 0.15 7.70 0.25 0.09 0.19 0.18 0.07 0.08 -0.01 -0.02 -0.03 0.02 0.00 -0.00 -0.01 -0.02 -0.02 0.02 0.00 -0.00 -0.13 -0.13 -0.36 -0.17 -0.04 -0.15 1.21 0.22 -0.42 -0.01 -0.20 -0.93 0.03 0.02 0.08 0.04 0.01 0.01
10 0.50 2.95 0.09 3.00 4.28 0.35 0.06 0.10 0.32 0.19 0.12 0.06 -0.02 -0.02 -0.00 -0.07 0.01 0.01 -0.02 -0.02 -0.01 -0.07 0.01 0.01 -0.11 -0.19 -0.27 -0.34 -0.05 -0.03 -0.14 -0.28 0.02 -1.09 1.16 0.25 0.02 0.03 0.09 0.07 0.02 0.01
20 1.11 4.17 0.25 0.16 0.52 0.46 0.17 0.19 0.35 0.29 0.10 0.07 0.01 0.01 -0.02 -0.08 0.01 -0.01 0.01 0.01 -0.03 -0.07 0.01 -0.01 -0.15 -0.21 -0.34 -0.39 -0.08 -0.09 0.19 -0.22 0.08 0.04 0.23 -0.21 0.03 0.04 0.11 0.10 0.02 0.02
statistics kurtosis max mean median min skew std
number 01 02 03 04 05 06 07 01 02 03 04 05 06 07 01 02 03 04 05 06 07 01 02 03 04 05 06 07 01 02 03 04 05 06 07 01 02 03 04 05 06 07 01 02 03 04 05 06 07
track_id
2 2.27 0.45 0.76 0.08 0.01 7.25 3.26 50.83 41.39 39.33 31.51 33.35 47.27 54.69 18.01 15.36 17.13 17.16 18.09 17.62 38.27 17.58 15.03 16.84 16.99 17.82 17.26 39.83 2.67 2.30 3.39 3.75 5.48 9.66 10.45 0.89 0.49 0.61 0.37 0.43 1.69 -1.57 4.54 4.32 3.94 3.14 3.31 3.10 7.62
3 3.21 0.24 -0.01 -0.28 0.25 1.28 3.73 59.70 40.60 42.14 31.47 38.78 37.24 58.20 15.73 15.05 17.37 17.21 18.18 18.90 39.20 15.31 14.71 17.17 17.20 17.86 18.70 40.40 0.39 0.65 1.84 1.84 1.84 1.84 1.84 1.00 0.47 0.33 0.10 0.47 0.39 -1.46 4.43 4.52 4.63 3.75 4.09 3.35 7.61
5 1.48 0.36 0.46 0.08 -0.11 0.77 0.93 50.93 38.54 39.02 33.71 32.75 38.79 51.90 17.10 15.97 18.65 16.97 17.29 19.26 36.41 16.49 15.86 18.53 16.88 16.94 19.11 39.10 3.42 2.31 3.42 7.14 9.32 7.23 8.57 0.82 0.20 0.21 0.25 0.48 0.37 -1.27 4.94 4.38 4.26 3.19 3.10 3.09 8.49
10 2.24 0.74 1.61 0.46 -0.14 1.70 -0.40 58.62 37.43 45.41 28.71 33.74 46.30 52.13 19.18 14.28 15.51 16.54 20.32 18.97 34.89 18.23 13.76 15.09 16.40 20.16 17.75 36.59 4.17 4.42 6.24 3.98 10.01 9.76 11.21 1.11 0.73 0.85 0.37 0.19 1.26 -0.59 5.56 4.01 3.69 2.65 3.46 4.69 8.40
20 2.26 0.32 0.63 0.10 0.37 0.70 11.45 46.74 42.15 39.38 33.53 36.35 28.78 48.10 16.04 16.16 17.70 18.38 18.42 16.48 19.00 15.57 15.87 17.36 18.23 18.13 16.39 17.97 4.46 2.98 4.15 5.92 8.20 10.09 11.15 0.96 0.43 0.55 0.28 0.49 0.38 2.97 4.24 4.32 3.92 3.17 3.11 1.76 4.10
feature spectral_centroid spectral_bandwidth spectral_rolloff
statistics kurtosis max mean median min skew std kurtosis max mean median min skew std kurtosis max mean median min skew std
number 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01
track_id
2 2.41 5514.05 1639.58 1503.50 0.00 1.08 719.77 3.87 3451.11 1607.47 1618.85 0.00 -0.88 436.81 0.84 9410.01 3267.80 3143.85 0.00 0.35 1300.73
3 3.52 6288.43 1763.01 1517.99 0.00 1.65 972.76 2.38 3469.18 1736.96 1686.77 0.00 0.46 486.66 2.38 10002.17 3514.62 3413.01 0.00 1.12 1650.36
5 1.32 5648.61 1292.96 1186.51 0.00 0.94 665.32 0.90 3492.74 1512.92 1591.52 0.00 -0.66 474.41 -0.24 9442.31 2773.93 2863.92 0.00 0.27 1323.47
10 9.73 5739.39 1360.03 1180.97 0.00 2.52 668.70 0.44 3962.70 1420.26 1301.81 0.00 0.88 604.89 3.62 10056.01 2603.49 2002.59 0.00 1.80 1524.40
20 2.18 5540.21 1732.97 1640.78 123.61 0.96 481.93 1.69 3556.88 2489.02 2467.10 677.70 -0.14 339.70 -0.74 9496.14 4201.35 4166.67 75.37 0.16 1495.30
feature rmse zcr
statistics kurtosis max mean median min skew std kurtosis max mean median min skew std
number 01 01 01 01 01 01 01 01 01 01 01 01 01 01
track_id
2 2.50 14.75 3.19 2.65 0.00 1.57 2.54 5.76 0.46 0.09 0.07 0.00 2.09 0.06
3 -0.64 9.10 3.61 3.71 0.00 0.02 1.95 2.82 0.47 0.08 0.06 0.00 1.72 0.07
5 0.00 11.03 3.25 2.41 0.00 1.03 2.59 6.81 0.38 0.05 0.04 0.00 2.19 0.04
10 1.77 12.32 3.89 3.76 0.00 0.83 2.00 21.43 0.45 0.08 0.07 0.00 3.54 0.04
20 1.24 16.18 4.60 4.37 0.00 0.80 2.18 16.67 0.47 0.05 0.04 0.00 3.19 0.03
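The feature table has a three-level column header (feature, statistics, number), as the output headers above show. The sketch below builds a toy table with a few MFCC values copied from tracks 2 and 3 and shows some ways to slice it; the variable names are illustrative only.

```python
import pandas as pd

columns = pd.MultiIndex.from_product(
    [['mfcc'], ['mean', 'std'], ['01', '02']],
    names=['feature', 'statistics', 'number'],
)
# Values copied from the MFCC output above (mean and std of coefficients 01, 02).
toy_features = pd.DataFrame(
    [[-163.77, 116.70, 97.81, 38.57],
     [-159.00, 120.16, 111.69, 41.19]],
    index=pd.Index([2, 3], name='track_id'),
    columns=columns,
)

# All MFCC means: one column per coefficient number.
mfcc_mean = toy_features['mfcc', 'mean']

# A single column, addressed by its full three-part key.
first_mean = toy_features.loc[:, ('mfcc', 'mean', '01')]

# Every "mean" column regardless of feature (here only mfcc is present).
all_means = toy_features.xs('mean', axis=1, level='statistics')
```

On the full table, the same pattern (for instance `features['mfcc', 'mean']`) should reduce the 140 MFCC columns to the 20 mean values per track.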

3.1 Echonest features

In [9]:
print('{1} features for {0} tracks'.format(*echonest.shape))
ipd.display(echonest['echonest', 'metadata'].head())
ipd.display(echonest['echonest', 'audio_features'].head())
ipd.display(echonest['echonest', 'social_features'].head())
ipd.display(echonest['echonest', 'ranks'].head())
249 features for 13129 tracks
          album_date  album_name         artist_latitude  artist_location       artist_longitude  artist_name  release
track_id
2         NaN         NaN                32.6783          Georgia, US           -83.2230          AWOL         AWOL - A Way Of Life
3         NaN         NaN                32.6783          Georgia, US           -83.2230          AWOL         AWOL - A Way Of Life
5         NaN         NaN                32.6783          Georgia, US           -83.2230          AWOL         AWOL - A Way Of Life
10        2008-03-11  Constant Hitmaker  39.9523          Philadelphia, PA, US  -75.1624          Kurt Vile    Constant Hitmaker
134       NaN         NaN                32.6783          Georgia, US           -83.2230          AWOL         AWOL - A Way Of Life
acousticness danceability energy instrumentalness liveness speechiness tempo valence
track_id
2 0.416675 0.675894 0.634476 0.010628 0.177647 0.159310 165.922 0.576661
3 0.374408 0.528643 0.817461 0.001851 0.105880 0.461818 126.957 0.269240
5 0.043567 0.745566 0.701470 0.000697 0.373143 0.124595 100.260 0.621661
10 0.951670 0.658179 0.924525 0.965427 0.115474 0.032985 111.562 0.963590
134 0.452217 0.513238 0.560410 0.019443 0.096567 0.525519 114.290 0.894072
artist_discovery artist_familiarity artist_hotttnesss song_currency song_hotttnesss
track_id
2 0.388990 0.386740 0.406370 0.000000 0.000000
3 0.388990 0.386740 0.406370 0.000000 0.000000
5 0.388990 0.386740 0.406370 0.000000 0.000000
10 0.557339 0.614272 0.798387 0.005158 0.354516
134 0.388990 0.386740 0.406370 0.000000 0.000000
artist_discovery_rank artist_familiarity_rank artist_hotttnesss_rank song_currency_rank song_hotttnesss_rank
track_id
2 NaN NaN NaN NaN NaN
3 NaN NaN NaN NaN NaN
5 NaN NaN NaN NaN NaN
10 2635.0 2544.0 397.0 115691.0 67609.0
134 NaN NaN NaN NaN NaN
In [10]:
ipd.display(echonest['echonest', 'temporal_features'].head())
x = echonest.loc[2, ('echonest', 'temporal_features')]
plt.plot(x);
000 001 002 003 004 005 006 007 008 009 ... 214 215 216 217 218 219 220 221 222 223
track_id
2 0.877233 0.588911 0.354243 0.295090 0.298413 0.309430 0.304496 0.334579 0.249495 0.259656 ... -1.992303 6.805694 0.233070 0.192880 0.027455 0.06408 3.67696 3.61288 13.316690 262.929749
3 0.534429 0.537414 0.443299 0.390879 0.344573 0.366448 0.419455 0.747766 0.460901 0.392379 ... -1.582331 8.889308 0.258464 0.220905 0.081368 0.06413 6.08277 6.01864 16.673548 325.581085
5 0.548093 0.720192 0.389257 0.344934 0.361300 0.402543 0.434044 0.388137 0.512487 0.525755 ... -2.288358 11.527109 0.256821 0.237820 0.060122 0.06014 5.92649 5.86635 16.013849 356.755737
10 0.311404 0.711402 0.321914 0.500601 0.250963 0.321316 0.734250 0.325188 0.373012 0.235840 ... -3.662988 21.508228 0.283352 0.267070 0.125704 0.08082 8.41401 8.33319 21.317064 483.403809
134 0.610849 0.569169 0.428494 0.345796 0.376920 0.460590 0.401371 0.449900 0.428946 0.446736 ... -1.452696 2.356398 0.234686 0.199550 0.149332 0.06440 11.26707 11.20267 26.454180 751.147705

5 rows × 224 columns

3.2 Features like MFCCs are discriminative

In [11]:
small = tracks['set', 'subset'] <= 'small'
genre1 = tracks['track', 'genre_top'] == 'Instrumental'
genre2 = tracks['track', 'genre_top'] == 'Hip-Hop'

X = features.loc[small & (genre1 | genre2), 'mfcc']
X = skl.decomposition.PCA(n_components=2).fit_transform(X)

y = tracks.loc[small & (genre1 | genre2), ('track', 'genre_top')]
y = skl.preprocessing.LabelEncoder().fit_transform(y)

plt.scatter(X[:,0], X[:,1], c=y, cmap='RdBu', alpha=0.5)
X.shape, y.shape
Out[11]:
((2000, 2), (2000,))
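To make the projection step less of a black box, here is a minimal NumPy-only sketch of what a two-component PCA does: center the data, take the singular value decomposition, and project onto the leading right singular vectors. The data are synthetic stand-ins for two genres (an assumption for illustration, not FMA features), and the result may differ from `sklearn.decomposition.PCA` by the sign of each component.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the feature matrix: two groups of 100 "tracks"
# with 5 features each, drawn around different means.
group_a = rng.normal(loc=0.0, scale=1.0, size=(100, 5))
group_b = rng.normal(loc=3.0, scale=1.0, size=(100, 5))
X = np.vstack([group_a, group_b])


def pca_project(X, n_components=2):
    """Project X onto its first principal components via SVD."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T


X_2d = pca_project(X)
```

When the groups differ along a dominant direction, as in this toy data, they end up on opposite sides of the first component, which is the effect the scatter plot above is meant to show for MFCCs.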

4 Audio

You can load the waveform and listen to audio in the notebook itself.

In [12]:
filename = utils.get_audio_path(AUDIO_DIR, 2)
print('File: {}'.format(filename))

x, sr = librosa.load(filename, sr=None, mono=True)
print('Duration: {:.2f}s, {} samples'.format(x.shape[-1] / sr, x.size))

start, end = 7, 17
ipd.Audio(data=x[start*sr:end*sr], rate=sr)
File: ./data/fma_small/000/000002.mp3
Duration: 29.98s, 1321967 samples
Out[12]:
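Two details of the cell above can be spelled out. The printed path suggests that files are stored under a folder named after the first three digits of the zero-padded six-digit track ID, and the excerpt is selected by converting seconds to sample indices. The sketch below reproduces both; `audio_path` is a hypothetical re-creation based only on the printed path, not the code of `utils.get_audio_path`, and the 44100 Hz sampling rate is an assumption inferred from the printed duration and sample count.

```python
import os

import numpy as np


def audio_path(audio_dir, track_id):
    """Build <audio_dir>/<first 3 digits>/<6-digit id>.mp3 (assumed layout)."""
    name = '{:06d}'.format(track_id)
    return os.path.join(audio_dir, name[:3], name + '.mp3')


def clip(samples, sr, start_s, end_s):
    """Return the samples between start_s and end_s (in seconds)."""
    return samples[int(start_s * sr):int(end_s * sr)]


sr = 44100            # assumed sampling rate, not read from a file
n_samples = 1321967   # sample count printed above
x = np.zeros(n_samples, dtype=np.float32)  # silent placeholder waveform

duration = n_samples / sr        # length of the track in seconds
excerpt = clip(x, sr, 7, 17)     # the same 10-second window as above
```

Under these assumptions the excerpt holds 10 × 44100 = 441000 samples, and the duration rounds to the 29.98 s printed by the notebook.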