It is common in the history of music that some genres and bands gain very high popularity while others go unnoticed. It is not easy to predict which piece will gain universal acclaim. Or is it? Maybe the popularity of the best-known music is an intrinsic feature of that music itself?
In this project, we want to answer the question: to what extent can music popularity be identified solely from musical features, and which features make music popular? Is it possible to identify musical features that play a significant role in songs' popularity? It is interesting to see whether there are parameters and patterns that lead to increased popularity. Needless to say, this kind of information is priceless for the whole music industry, and for musicians themselves, as music that is more popular will, by definition, reach a broader audience and have a higher chance of selling well.
To conduct our study, we use the FMA: A Dataset for Music Analysis dataset. It is publicly available on Github and the files are stored on UNIL/SWITCH server.
For this study, we are making two main assumptions:
We claim that popular music has a lot of listens. Let's start from the definition of "popular music". According to Gaynor Jones and Jay Rahn, "an obvious criterion for a music's popularity is the number of people who experience it: the more people involved, the more popular music" (Gaynor Jones and Jay Rahn (1977). "Definitions of Popular Music: Recycled". The Journal of Aesthetic Education, Vol. 11, No. 4 (Oct. 1977), pp. 79-92. JSTOR). This means that a piece of music can be qualified as popular if it reaches a wide audience. However, we should not be too quick with this definition: popularity is more complex than that, as "various groups of people cultivate certain genres within a popular idiom" (ibid.). Such groups can be defined by social class, geographical location, race, education, or other factors. We do not account for any of these in our study. We do, however, try to make our analysis more fine-grained by studying popularity within genres, so that a piece is popular if, within its genre, it has a lot of listens.
We assume that each piece of music has intrinsic features that contribute to making it more or less popular than others. By 'intrinsic features' we mean features that are computed directly from the audio waveform and its spectrogram. This approach places our work in the still-recent field of Hit Song Science (HSS), whose goal is to "understand better the relation between intrinsic characteristics of songs and their popularity, regardless of the complex and poorly understood mechanisms of human appreciation and social pressure" (Pachet, 2012). As a result of this project, we expect to identify several technical characteristics of music that contribute to its success.
To answer the research question, we first check our two assumptions. For assumption (1), we conduct a survey to get a better intuition of how people perceive popular music, and how they like a subset of our dataset. We hope to find that the music people would listen to a lot is also the music that is popular, justifying assumption (1). To check assumption (2), we implement a Deep Neural Network to see whether it manages to predict popularity: if it does, that is a good indication that there exist musical features that play a role in popularity. After that, we come up with a popularity metric based on the available data. It takes into account the number of listens of the song, but also of the album, as well as the numbers of likes and comments. Then we identify discriminating features in music and apply regression models within each genre to detect the features that play a role in popularity. Finally, the biggest challenge is interpreting these features and making sense of them; we try to create listenable interpretations of our model in order to answer our research question accurately.
Throughout this study, we take particular care to pinpoint the limitations of each step. Regarding the survey, we acknowledge that it has some crucial limitations, and we try to motivate the choice of our questions as precisely as possible, describing how they help us answer the research question. Regarding model interpretation, one of the biggest drawbacks is that the computed features are not easily interpretable. That is why we try an approach to make them listenable, which we develop further later in the notebook.
from zipfile import ZipFile
from tqdm.notebook import tqdm
from pqdm.processes import pqdm
from IPython import display
import librosa
import soundfile as sf
import torchaudio
import seaborn as sns
import pandas as pd
import numpy as np
import os
import ast
import statsmodels.api as sm
import statsmodels.formula.api as smf
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.decomposition import PCA
from scipy.stats import chisquare
from scipy.stats import shapiro
from nltk import agreement
import matplotlib.pyplot as plt
from scipy.cluster import hierarchy as hc
from scipy import stats
from sklearn import preprocessing
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import Wav2Vec2PreTrainedModel, Wav2Vec2Model, Wav2Vec2FeatureExtractor
from transformers.modeling_outputs import SequenceClassifierOutput
from torch.utils.data import DataLoader
from audioLIME.data_provider import DataProvider
from audioLIME.factorization import DataBasedFactorization
from audioLIME import lime_audio
from spleeter.separator import Separator
plt.rcParams['figure.figsize'] = (17, 5)
# You should have fma_large and fma_metadata present in this directory for the code to run properly
datasource = "data"
# Load metadata and features.
# Function based on: https://github.com/mdeff/fma/blob/master/utils.py
tracks = pd.read_csv(f'{datasource}/fma_metadata/tracks.csv', index_col=0, header=[0, 1])
COLUMNS = [('track', 'tags'), ('album', 'tags'), ('artist', 'tags'),
('track', 'genres'), ('track', 'genres_all')]
for column in COLUMNS:
tracks[column] = tracks[column].map(ast.literal_eval)
COLUMNS = [('track', 'date_created'), ('track', 'date_recorded'),
('album', 'date_created'), ('album', 'date_released'),
('artist', 'date_created'), ('artist', 'active_year_begin'),
('artist', 'active_year_end')]
for column in COLUMNS:
tracks[column] = pd.to_datetime(tracks[column])
SUBSETS = ('small', 'medium', 'large')
try:
tracks['set', 'subset'] = tracks['set', 'subset'].astype(
'category', categories=SUBSETS, ordered=True)
except (ValueError, TypeError):
# the categories and ordered arguments were removed in pandas 0.25
tracks['set', 'subset'] = tracks['set', 'subset'].astype(
pd.CategoricalDtype(categories=SUBSETS, ordered=True))
COLUMNS = [('track', 'genre_top'), ('track', 'license'),
('album', 'type'), ('album', 'information'),
('artist', 'bio')]
for column in COLUMNS:
tracks[column] = tracks[column].astype('category')
tracks.shape
(106574, 52)
To define popularity, we use a continuous score rather than binary 'popular'/'unpopular' labels, as it is more accurate to talk about degrees of popularity than a binary classification (Russel B. Nye, "Notes for an Introduction to a Discussion of Popular Culture", Journal of Popular Culture, Vol. 4 (Spring 1971): 1031-38, on the need to recognize degrees of popularity).
The dataset contains different measures that can be used to define popularity: numbers of listens, likes and comments. These features are present at two levels (song and album). Both levels come from the Free Music Archive's API, and both carry important information about the reach of the songs. We therefore first build a DataFrame containing these scores for all of the songs available in the dataset.
POP_FEATURES = ["listens", "favorites", "comments"]
pop_df = tracks["track"][["date_created", "genre_top"]+POP_FEATURES]
# add information collected at the album level
for pop_feat in POP_FEATURES:
pop_df["album_" + pop_feat] = tracks["album"][pop_feat].values
pop_df.sample(3)
date_created | genre_top | listens | favorites | comments | album_listens | album_favorites | album_comments | |
---|---|---|---|---|---|---|---|---|
track_id | ||||||||
124070 | 2015-08-24 16:32:25 | NaN | 353 | 1 | 0 | 11139 | 0 | 0 |
48838 | 2011-06-11 15:47:29 | NaN | 444 | 0 | 0 | 6043 | 1 | 3 |
89888 | 2013-08-30 16:15:29 | NaN | 32669 | 18 | 0 | 65976 | 1 | 0 |
The number of listens per song and per album reflect different realities, in that they are not directly based on one another (the number of listens of an album is not the sum of the number of listens of its songs and the number of listens of the songs don't take into account the number of listens of the album). In other words, if $l_s$ is the number of listens of a song, $a$ refers to albums and $\{s\in a\}$ the set of songs in one album, $$ \sum_{s\in a}l_s \neq l_a $$
This fact is illustrated below, and two extreme examples are highlighted.
# build a DataFrame of albums
albums = tracks["album"].groupby(["id"]).mean()[["listens"]]
# remove songs without album
albums = albums[albums.listens > 0]
albums["sum_songs"] = 0
albums["album"] = 0
albums["songs_minus_album"] = 0
albums["songs_ratio_album"] = 0
for alb_id, _ in tqdm(albums.iterrows(), total=albums.shape[0]):
# retrieve track ids for the album
curr_track_ids = tracks["album"][tracks["album"].id == alb_id].index.values
# compute sum of the listens of the album's songs
curr_sum_songs_listens = pop_df.loc[curr_track_ids].listens.sum()
# retrieve album's number of listens
curr_album_listens = pop_df.loc[curr_track_ids].album_listens.iloc[0]
# add these information to the `albums` DataFrame
albums.loc[alb_id, "sum_songs"] = curr_sum_songs_listens
albums.loc[alb_id, "album"] = curr_album_listens
# compute difference and ratio
albums.loc[alb_id, "songs_minus_album"] = curr_sum_songs_listens-curr_album_listens
albums.loc[alb_id, "songs_ratio_album"] = curr_sum_songs_listens/curr_album_listens
albums = albums.sort_values(by="songs_minus_album")
# histogram of the sum of album's songs' listens - album's listens
albums["songs_minus_album"].hist(bins=100)
plt.xlabel(r"$(\sum_{s\in a}l_s) - l_a$", fontsize=15)
plt.semilogy()
plt.show()
id_album = albums["songs_minus_album"].idxmin() #10953
id_tracks = tracks["album"][tracks["album"].id == id_album].index.values
print("The album '{album_title}' containing {n_songs} song{plural} was listened {alb_listens} times as an album, \
but individually the song{plural} account{sing} for {more_or_less} listens: {songs_sum_listens}."\
.format(album_title=tracks["album"].loc[id_tracks].iloc[0].title
, n_songs=len(id_tracks)
, plural='s' if len(id_tracks)>1 else ''
, sing='' if len(id_tracks)>1 else 's'
, alb_listens=pop_df.loc[id_tracks].iloc[0].album_listens
, songs_sum_listens=np.sum(pop_df.loc[id_tracks].listens)
, more_or_less='more'
if (np.sum(pop_df.loc[id_tracks].listens)/pop_df.loc[id_tracks].iloc[0].album_listens)>1
else 'fewer'
)
)
The album 'Music For Media Vol. 3' containing 8 songs was listened 247274 times as an album, but individually the songs account for fewer listens: 72202.
id_album = albums["songs_minus_album"].idxmax() #7690
id_tracks = tracks["album"][tracks["album"].id == id_album].index.values
print("The album '{album_title}' containing {n_songs} song{plural} was listened {alb_listens} times as an album, \
but individually the song{plural} account{sing} for {more_or_less} listens: {songs_sum_listens}."\
.format(album_title=tracks["album"].loc[id_tracks].iloc[0].title
, n_songs=len(id_tracks)
, plural='s' if len(id_tracks)>1 else ''
, sing='' if len(id_tracks)>1 else 's'
, alb_listens=pop_df.loc[id_tracks].iloc[0].album_listens
, songs_sum_listens=np.sum(pop_df.loc[id_tracks].listens)
, more_or_less='more'
if (np.sum(pop_df.loc[id_tracks].listens)/pop_df.loc[id_tracks].iloc[0].album_listens)>1
else 'fewer'
)
)
The album '...Plays Guitar' containing 7 songs was listened 386403 times as an album, but individually the songs account for more listens: 690479.
The numbers of listens, likes and comments of albums bring supplementary, non-negligible information about the audience a song could have reached. However, some songs do not appear in albums (reflected by $-1$ for album listens, comments, and favorites counts in the DataFrame). These songs represent approximately 3.3% of the dataset, so they are left aside in order to compute a more coherent popularity score based on both available levels.
# songs without albums have -1 as count for listens, favorites, comments
no_album = pop_df.album_listens < 0
print("{0:.2f}% songs have no album.".format(100*np.sum(no_album)/len(pop_df)))
3.31% songs have no album.
# keeping only songs with album information
pop_df = pop_df[~no_album].copy()
pop_df.shape
(103045, 8)
The distributions of the log-transformed features are plotted below, showing that song-level and album-level counts reflect different realities and justifying that the album level should not be neglected, as it appears to be widely used by FMA's users.
def log_transform(series):
    """Log transform for a series of non-negative values."""
    return np.log(series + 1)
f, axes = plt.subplots(nrows=1,ncols=3, sharex=False)
for ax, feat in zip(axes, POP_FEATURES):
ax.hist(log_transform(pop_df["album_"+feat]), bins=100, alpha=0.6, label="album")
ax.hist(log_transform(pop_df[feat]), bins=100, alpha=0.6, label="song")
ax.set_xlabel(r"$\log(1+$"+feat+"$)$")
ax.legend()
ax.semilogy()
plt.legend()
plt.show()
The popularity score can then be computed as the one-dimensional PCA projection of the log-transformed features, so as to capture as much information as possible from these different dimensions in a single value per song.
This popularity score can then be thought of as a proxy for the ability of a song to reach a large audience. The numbers of likes and comments add information about the public's engagement with the songs. The score could also capture social dynamics: a hypothesis could be that users who have liked or commented on a song are more prone to recommend or play it to their entourage, thus spreading the song.
# without `interest` but with album scores
X = pop_df[POP_FEATURES+["album_"+pop_feat for pop_feat in POP_FEATURES]].copy()
for pop_feat in X.columns:
# using log transform for the pca
X[pop_feat] = log_transform(X[pop_feat])
pca = PCA(n_components=1, svd_solver="full").fit(X)
#computing popularity score as a 1D projection of the popularity features
pop_df["pop_score"] = pca.transform(X)
# display the weight of each component in the projection
plt.bar(X.columns
, (pca.components_)[0]
)
plt.show()
The above plot represents the weights of the different features (their associated coefficients) in the projection of the 6 dimensions onto a 1D axis representing the popularity score.
# plot the distribution of the 1D-popularity projection, and showcase its Shapiro-Wilk score
W, p_val = shapiro(pop_df.pop_score)
plt.hist(pop_df.pop_score, bins=100, label=r"$W_{Shapiro-Wilk}"+"={0:.3f}$".format(W))
plt.legend(fontsize=12)
plt.show()
/home/mbien/.local/lib/python3.8/site-packages/scipy/stats/morestats.py:1676: UserWarning: p-value may not be accurate for N > 5000. warnings.warn("p-value may not be accurate for N > 5000.")
The popularity scores obtained after the 1D PCA projection approximately follow a Gaussian distribution (as suggested by the Shapiro-Wilk statistic being close to 1). The score is then $z$-score normalized so that it follows a standard normal distribution $\mathcal{N}(0,1)$.
# as the distribution is ~ normal (shapiro close to 1), standardize the score (z-score normalization)
pop_df["pop_score"] = (pop_df["pop_score"]-np.mean(pop_df["pop_score"]))/(np.std(pop_df["pop_score"]))
pop_df.sample(3)
date_created | genre_top | listens | favorites | comments | album_listens | album_favorites | album_comments | pop_score | |
---|---|---|---|---|---|---|---|---|---|
track_id | |||||||||
125532 | 2015-09-25 16:18:52 | Electronic | 1352 | 0 | 0 | 4473 | 0 | 0 | -0.275386 |
85157 | 2013-05-30 15:50:26 | Experimental | 230 | 2 | 0 | 30149 | 4 | 11 | 0.213893 |
44980 | 2011-03-04 13:55:33 | Experimental | 102 | 1 | 0 | 1292 | 0 | 0 | -1.407466 |
from sklearn.model_selection import train_test_split
train, test = train_test_split(pop_df.pop_score, test_size=0.05)
train.to_csv("train.csv")
test.to_csv("test.csv")
Now we proceed to the regression analysis. Our dataset contains 518 features.
As it is computationally too expensive to run a regression with all of them, we need to reduce the dimensionality of each data point. To do that, we observe that some features are potentially strongly correlated, and including both features of a correlated pair in the analysis would be unnecessary. For example, the mean ZCR and the median ZCR are likely to be linked. Our assumption is that we can find a smaller set of features that describes our dataset well enough.
To find these features, we compute the pairwise Pearson correlation between features. It gives us information about the joint variability of two random variables, capturing linear relationships between them. For two random variables $X$ and $Y$, the correlation is computed as follows:

$$corr(X,Y) = \frac{cov(X, Y)}{std(X) \cdot std(Y)}$$

where the covariance is defined as:

$$cov(X,Y) = \frac{\sum_i (x_i - \overline{x})(y_i - \overline{y})}{N}$$
A first limitation of this approach is that we only account for linear relationships, not more complex ones.
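To make the correlation formula above concrete, the manual computation can be checked against NumPy's built-in implementation (a minimal sketch on synthetic data; the variables here are illustrative and not part of our dataset):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = 0.5 * x + rng.normal(size=1000)  # y is linearly related to x, plus noise

# Pearson correlation from the definition: cov(X, Y) / (std(X) * std(Y))
cov_xy = np.mean((x - x.mean()) * (y - y.mean()))
corr_manual = cov_xy / (x.std() * y.std())

# built-in equivalent, as used by pandas' DataFrame.corr() by default
corr_np = np.corrcoef(x, y)[0, 1]

assert np.isclose(corr_manual, corr_np)
```

Note that the choice of denominator ($N$ vs $N-1$) cancels out in the ratio, which is why the definition with $N$ matches `np.corrcoef`.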
features = pd.read_csv(f'{datasource}/fma_metadata/features.csv', index_col=0, header=[0, 1, 2])
features.sample(3)
feature | chroma_cens | ... | tonnetz | zcr | |||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
statistics | kurtosis | ... | std | kurtosis | max | mean | median | min | skew | std | |||||||||||
number | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 10 | ... | 04 | 05 | 06 | 01 | 01 | 01 | 01 | 01 | 01 | 01 |
track_id | |||||||||||||||||||||
106701 | -0.657624 | -0.739333 | 0.846104 | -0.294864 | 0.495813 | 0.117134 | -1.324070 | -1.159672 | 0.909491 | -0.578040 | ... | 0.042412 | 0.019442 | 0.018582 | 7.820217 | 0.467285 | 0.070751 | 0.055664 | 0.0 | 2.517603 | 0.065010 |
82862 | 2.100222 | -0.437557 | 0.268126 | -0.700742 | -0.670827 | -0.105831 | 1.435882 | 2.523684 | -0.693280 | -0.222233 | ... | 0.100295 | 0.032827 | 0.025711 | 1.392835 | 0.161621 | 0.050412 | 0.048340 | 0.0 | 0.773927 | 0.019650 |
142205 | -0.740965 | -0.608202 | -0.559238 | -0.697892 | 0.193160 | -0.233519 | 1.581625 | -0.333326 | -0.999112 | -0.553771 | ... | 0.068810 | 0.017600 | 0.017043 | 16.877150 | 0.702148 | 0.063787 | 0.040039 | 0.0 | 3.296057 | 0.077015 |
3 rows × 518 columns
Let's compute the correlation matrix, and visualize it.
corr_matrix = features.corr()
fig, ax = plt.subplots(figsize=(10,10))
im = ax.imshow(corr_matrix.abs())
im.set_clim(0, 1)
ax.grid(False)
cbar = ax.figure.colorbar(im, ax=ax, format='% .2f')
plt.show()
This shows that some features are redundant, since high correlation values appear outside the diagonal as well.
Now, to select a meaningful set of features that describes the data well enough, we take the following approach, greatly inspired by this. They summarize the approach this way:
We re-use the code from here for the nice visualisations. That code lives in a separate notebook (milestone3_feature_clustering.ipynb) to avoid overloading this one; here we directly use the resulting features.
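As a rough illustration of the idea (this is not the actual code from milestone3_feature_clustering.ipynb), one can cluster features by correlation distance and keep a single representative per cluster; `select_representatives` and the toy data below are purely illustrative:

```python
import numpy as np
import pandas as pd
from scipy.cluster import hierarchy as hc
from scipy.spatial.distance import squareform

def select_representatives(df, n_clusters):
    """Cluster features by correlation distance, keep one feature per cluster."""
    corr = df.corr().abs().values
    np.fill_diagonal(corr, 1.0)                # ensure an exact zero diagonal below
    dist = squareform(1 - corr, checks=False)  # condensed distance = 1 - |corr|
    linkage = hc.linkage(dist, method="average")
    labels = hc.fcluster(linkage, t=n_clusters, criterion="maxclust")
    # keep the first feature encountered in each cluster as its representative
    return pd.Series(df.columns, index=labels).groupby(level=0).first().tolist()

# toy example: two pairs of nearly duplicated features
rng = np.random.default_rng(0)
a, b = rng.normal(size=200), rng.normal(size=200)
toy = pd.DataFrame({"f1": a, "f2": a + 0.01 * rng.normal(size=200),
                    "f3": b, "f4": b + 0.01 * rng.normal(size=200)})
print(select_representatives(toy, n_clusters=2))  # one feature per correlated pair
```

On the toy data, each near-duplicate pair collapses into one cluster, so only one feature of each pair survives, which is exactly the redundancy-reduction effect we rely on.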
selected_features = ['chroma_stft_mean_01', 'chroma_stft_std_01', 'chroma_cqt_kurtosis_03', 'chroma_stft_max_01', 'chroma_stft_kurtosis_01', 'chroma_stft_kurtosis_03', 'chroma_cqt_mean_01', 'chroma_cqt_mean_07','chroma_cqt_kurtosis_01', 'chroma_cqt_kurtosis_08']
selected_features = selected_features + ['mfcc_max_01', 'mfcc_max_05', 'mfcc_kurtosis_03', 'mfcc_min_04', 'mfcc_skew_09', 'mfcc_max_02', 'mfcc_max_03', 'mfcc_skew_04', 'mfcc_skew_03', 'mfcc_kurtosis_01']
selected_features = selected_features + ['rmse_max_01', 'rmse_kurtosis_01', 'rmse_min_01']
selected_features = selected_features + ['spectral_contrast_max_01', 'spectral_contrast_max_07', 'spectral_contrast_kurtosis_01', 'spectral_contrast_min_01', 'spectral_rolloff_kurtosis_01', 'spectral_contrast_max_06', 'spectral_rolloff_max_01', 'spectral_contrast_kurtosis_06', 'spectral_contrast_kurtosis_07', 'spectral_contrast_skew_07']
selected_features = selected_features + ['tonnetz_max_01', 'tonnetz_kurtosis_01', 'tonnetz_mean_02', 'tonnetz_min_01', 'tonnetz_mean_01', 'tonnetz_mean_03', 'tonnetz_skew_03', 'tonnetz_skew_04', 'tonnetz_skew_05', 'tonnetz_skew_06']
features_flatten = features.copy()
features_flatten.columns = ['_'.join(col) for col in features.columns.values]
chroma = features[['chroma_stft', 'chroma_cqt', 'chroma_cens']]
chroma.columns = ['_'.join(col) for col in chroma.columns.values]
mfcc = features['mfcc']
mfcc.columns = ['_'.join(col) for col in mfcc.columns.values]
rmse = features['rmse']
rmse.columns = ['_'.join(col) for col in rmse.columns.values]
spectral = features[['spectral_contrast', 'spectral_rolloff', 'spectral_centroid', 'spectral_bandwidth']]
spectral.columns = ['_'.join(col) for col in spectral.columns.values]
tonnetz = features['tonnetz']
tonnetz.columns = ['_'.join(col) for col in tonnetz.columns.values]
zcr = features['zcr']
zcr.columns = ['_'.join(col) for col in zcr.columns.values]
# Normalizing selected features for regression
df_selected_features = features_flatten[selected_features]
x = df_selected_features.values
min_max_scaler = preprocessing.MinMaxScaler()
x_scaled = min_max_scaler.fit_transform(x)
df_selected_features_normalized = pd.DataFrame(x_scaled)
df_selected_features_normalized.index = tracks["track"]['genre_top'].index
df_selected_features_normalized = df_selected_features_normalized[df_selected_features_normalized.index.isin(pop_df["pop_score"].index)]
X = df_selected_features_normalized
y = pop_df["pop_score"]
We use regression analysis, a statistical method that allows us to examine the relationship between the variables of interest: popularity and musical features. Our dependent variable is the popularity score computed previously, and the independent variables are the features selected by the clustering above.
import statsmodels.api as sm
X = sm.add_constant(X)
model = sm.OLS(y,X) # Least Square Regression
results = model.fit()
results.summary()
Dep. Variable: | pop_score | R-squared: | 0.053 |
---|---|---|---|
Model: | OLS | Adj. R-squared: | 0.053 |
Method: | Least Squares | F-statistic: | 135.1 |
Date: | Mon, 24 May 2021 | Prob (F-statistic): | 0.00 |
Time: | 21:56:03 | Log-Likelihood: | -1.4339e+05 |
No. Observations: | 103045 | AIC: | 2.869e+05 |
Df Residuals: | 103001 | BIC: | 2.873e+05 |
Df Model: | 43 | ||
Covariance Type: | nonrobust |
coef | std err | t | P>|t| | [0.025 | 0.975] | |
---|---|---|---|---|---|---|
const | 3.9925 | 0.530 | 7.528 | 0.000 | 2.953 | 5.032 |
0 | -0.1023 | 0.031 | -3.353 | 0.001 | -0.162 | -0.042 |
1 | 0.3910 | 0.038 | 10.241 | 0.000 | 0.316 | 0.466 |
2 | -1.7740 | 0.490 | -3.617 | 0.000 | -2.735 | -0.813 |
3 | -1.0914 | 0.485 | -2.253 | 0.024 | -2.041 | -0.142 |
4 | -0.9342 | 0.962 | -0.971 | 0.331 | -2.819 | 0.951 |
5 | 0.6150 | 0.587 | 1.048 | 0.294 | -0.535 | 1.765 |
6 | -0.5904 | 0.029 | -20.430 | 0.000 | -0.647 | -0.534 |
7 | -0.6721 | 0.028 | -24.160 | 0.000 | -0.727 | -0.618 |
8 | 0.2975 | 0.946 | 0.314 | 0.753 | -1.556 | 2.151 |
9 | -2.4952 | 0.757 | -3.298 | 0.001 | -3.978 | -1.012 |
10 | 0.4276 | 0.076 | 5.594 | 0.000 | 0.278 | 0.577 |
11 | -1.3332 | 0.041 | -32.821 | 0.000 | -1.413 | -1.254 |
12 | -1.6097 | 0.306 | -5.257 | 0.000 | -2.210 | -1.010 |
13 | 0.1920 | 0.035 | 5.445 | 0.000 | 0.123 | 0.261 |
14 | -0.4031 | 0.125 | -3.226 | 0.001 | -0.648 | -0.158 |
15 | 0.2963 | 0.048 | 6.227 | 0.000 | 0.203 | 0.390 |
16 | 1.1787 | 0.036 | 32.509 | 0.000 | 1.108 | 1.250 |
17 | -1.7490 | 0.127 | -13.763 | 0.000 | -1.998 | -1.500 |
18 | -2.2799 | 0.092 | -24.818 | 0.000 | -2.460 | -2.100 |
19 | -0.4638 | 0.527 | -0.881 | 0.379 | -1.496 | 0.568 |
20 | -0.1146 | 0.031 | -3.700 | 0.000 | -0.175 | -0.054 |
21 | -2.2557 | 0.770 | -2.928 | 0.003 | -3.766 | -0.746 |
22 | 0.5682 | 0.270 | 2.101 | 0.036 | 0.038 | 1.098 |
23 | -0.2043 | 0.032 | -6.365 | 0.000 | -0.267 | -0.141 |
24 | -0.1783 | 0.035 | -5.144 | 0.000 | -0.246 | -0.110 |
25 | -2.0996 | 0.605 | -3.468 | 0.001 | -3.286 | -0.913 |
26 | 0.4884 | 0.037 | 13.036 | 0.000 | 0.415 | 0.562 |
27 | 1.4398 | 0.335 | 4.304 | 0.000 | 0.784 | 2.096 |
28 | -0.0729 | 0.024 | -3.074 | 0.002 | -0.119 | -0.026 |
29 | -0.0052 | 0.034 | -0.152 | 0.879 | -0.072 | 0.061 |
30 | 0.2488 | 0.417 | 0.597 | 0.550 | -0.568 | 1.066 |
31 | -1.3787 | 0.107 | -12.888 | 0.000 | -1.588 | -1.169 |
32 | -0.5108 | 0.061 | -8.429 | 0.000 | -0.630 | -0.392 |
33 | -0.1981 | 0.044 | -4.461 | 0.000 | -0.285 | -0.111 |
34 | -0.3390 | 0.255 | -1.332 | 0.183 | -0.838 | 0.160 |
35 | 0.0248 | 0.087 | 0.285 | 0.775 | -0.146 | 0.195 |
36 | 0.1677 | 0.037 | 4.539 | 0.000 | 0.095 | 0.240 |
37 | 0.5722 | 0.082 | 6.982 | 0.000 | 0.412 | 0.733 |
38 | -0.1214 | 0.042 | -2.873 | 0.004 | -0.204 | -0.039 |
39 | -0.4156 | 0.136 | -3.045 | 0.002 | -0.683 | -0.148 |
40 | -0.1595 | 0.153 | -1.042 | 0.297 | -0.460 | 0.140 |
41 | 0.6291 | 0.141 | 4.453 | 0.000 | 0.352 | 0.906 |
42 | -0.4963 | 0.160 | -3.108 | 0.002 | -0.809 | -0.183 |
Omnibus: | 4399.539 | Durbin-Watson: | 0.396 |
---|---|---|---|
Prob(Omnibus): | 0.000 | Jarque-Bera (JB): | 5375.951 |
Skew: | 0.472 | Prob(JB): | 0.00 |
Kurtosis: | 3.600 | Cond. No. | 1.09e+03 |
The p-value for each term tests the null hypothesis that the coefficient is equal to zero (no effect). A low p-value (< 0.05) indicates that you can reject the null hypothesis. In other words, a predictor that has a low p-value is likely to be a meaningful addition to your model because changes in the predictor's value are related to changes in the response variable.
Let's look at which features are good predictors of popularity, i.e. the features for which we reject the null hypothesis at the 0.05 level.
coeffs_results = results.params
pvalues_results = results.pvalues
coeffs_results.index = ['intercept'] + selected_features
pvalues_results.index = ['intercept'] + selected_features
coeffs_results[pvalues_results < 0.05].sort_values()
chroma_cqt_kurtosis_08          -2.495228
mfcc_skew_03                    -2.279891
rmse_kurtosis_01                -2.255653
spectral_contrast_kurtosis_01   -2.099599
chroma_cqt_kurtosis_03          -1.773952
mfcc_skew_04                    -1.749003
mfcc_kurtosis_03                -1.609724
spectral_contrast_kurtosis_07   -1.378706
mfcc_max_05                     -1.333245
chroma_stft_max_01              -1.091432
chroma_cqt_mean_07              -0.672139
chroma_cqt_mean_01              -0.590365
spectral_contrast_skew_07       -0.510810
tonnetz_skew_06                 -0.496266
tonnetz_skew_03                 -0.415611
mfcc_skew_09                    -0.403052
spectral_contrast_max_01        -0.204281
tonnetz_max_01                  -0.198128
spectral_contrast_max_07        -0.178251
tonnetz_mean_03                 -0.121386
rmse_max_01                     -0.114562
chroma_stft_mean_01             -0.102282
spectral_contrast_max_06        -0.072909
tonnetz_min_01                   0.167704
mfcc_min_04                      0.191966
mfcc_max_02                      0.296299
chroma_stft_std_01               0.391017
mfcc_max_01                      0.427563
spectral_contrast_min_01         0.488448
rmse_min_01                      0.568158
tonnetz_mean_01                  0.572225
tonnetz_skew_05                  0.629137
mfcc_max_03                      1.178739
spectral_rolloff_kurtosis_01     1.439841
intercept                        3.992458
dtype: float64
We have a total of 34 good predictors (intercept aside)! A first quick look tells us that the higher chroma_cqt_kurtosis_08 is, the less popular a track is likely to be, whereas a high spectral_rolloff_kurtosis_01 speaks in favor of popularity. For the conclusions of the study, we will need to make sense of such features.
Now let's be more precise. There is a wide variety of genres, and analysing them all at once may not be the smartest thing to do. Hence we decide to analyse, genre by genre, what makes music popular. We apply the same regression method as before, on the same set of features.
X = df_selected_features_normalized
genres = tracks["track"].groupby('genre_top').count()['number'].index.tolist()
def find_predictors(genre, p_val):
    """OLS regression of popularity on the selected features, restricted to one genre."""
    # align the genre mask with each DataFrame's index to avoid
    # "Boolean Series key will be reindexed" warnings
    genre_mask = tracks["track"]['genre_top'] == genre
    X = df_selected_features_normalized[genre_mask.reindex(df_selected_features_normalized.index, fill_value=False)]
    y = pop_df["pop_score"][genre_mask.reindex(pop_df.index, fill_value=False)]
    X = sm.add_constant(X)
    model = sm.OLS(y, X)
    results = model.fit()
    coeffs_results = results.params
    pvalues_results = results.pvalues
    # `add_constant` only adds a column when X does not already contain a constant one
    if len(coeffs_results.index) > len(selected_features):
        coeffs_results.index = ['const'] + selected_features
        pvalues_results.index = ['const'] + selected_features
    else:
        coeffs_results.index = selected_features
        pvalues_results.index = selected_features
    return coeffs_results[pvalues_results < p_val].sort_values()
preds = [{g : find_predictors(g, 0.05).to_dict()} for g in genres]
# Combine the per-genre coefficient Series into one DataFrame; the outer
# merge keeps the union of features, so a feature that is significant in
# only some genres still gets a row (NaN elsewhere).
df_merged = pd.DataFrame()
for p in preds:
    df_merged = df_merged.merge(pd.DataFrame(p), left_index=True, right_index=True, how='outer')
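To make the merge loop concrete, here is a minimal, self-contained sketch with hypothetical data: each per-genre regression is represented by a small Series of significant coefficients indexed by feature name, and the outer merge assembles them into one feature-by-genre table with NaN where a feature was not significant.

```python
import pandas as pd

# Hypothetical per-genre coefficient Series (feature name -> coefficient).
rock = pd.Series({"tempo": 0.8, "energy": 0.3}, name="Rock")
folk = pd.Series({"tempo": 0.5, "acousticness": -0.4}, name="Folk")

# Same pattern as the loop above: outer merge on the feature index.
merged = pd.DataFrame()
for s in (rock, folk):
    merged = merged.merge(pd.DataFrame(s), left_index=True, right_index=True, how="outer")

print(merged)  # rows: union of features; NaN where a genre lacks that feature
```

The NaN entries are exactly what later render as white (not significant) spots in the heatmaps.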
To visualize the results in a compact and readable way, we display a heatmap. The darker the spot, the stronger the feature's effect on popularity. White spots mean that the feature is not statistically significant in determining popularity. We first plot the features that have a positive effect on popularity (the higher their value, the higher the predicted popularity), and then the ones that have a negative effect (the higher their value, the lower the predicted popularity).
import seaborn as sns
# Effect of positive coeffs
df_merged_pos = df_merged[df_merged>0]
fig, ax = plt.subplots(figsize=(10,10))
sns.heatmap(np.log(df_merged_pos), cmap=sns.cm.rocket_r, ax=ax)
# Effect of negative coeffs
df_merged_neg = df_merged[df_merged<0]
fig, ax = plt.subplots(figsize=(10,10))
sns.heatmap(np.log(abs(df_merged_neg)), cmap=sns.cm.rocket_r, ax=ax)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X.values, y.values, test_size=0.05, random_state=42)
rock_X = df_selected_features_normalized[tracks["track"]['genre_top'] == 'Rock']
rock_y = pop_df["pop_score"][tracks['track']['genre_top'] == 'Rock']
rock_X_train, rock_X_test, rock_y_train, rock_y_test = train_test_split(rock_X.values, rock_y.values, test_size=0.05, random_state=42)
<ipython-input-140-cc1a6984180b>:4: UserWarning: Boolean Series key will be reindexed to match DataFrame index. rock_X = df_selected_features_normalized[tracks["track"]['genre_top'] == 'Rock']
print("All:", y_test@y_test)
print("Rock:", rock_y_test@rock_y_test)
All: 5005.8761226866345
Rock: 527.7648600853961
model_lr = sm.OLS(y_train, X_train)  # ordinary least squares regression
results = model_lr.fit()
y_hat = results.predict(X_test)
dot = (y_hat - y_test)
dot@dot
4814.083453926919
model_lr = sm.OLS(rock_y_train, rock_X_train)  # ordinary least squares regression
results = model_lr.fit()
y_hat = results.predict(rock_X_test)
dot = (y_hat - rock_y_test)
dot@dot
343.3193589983109
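To put these residual sums of squares in context, they can be compared with the baselines printed above (`y_test@y_test`, i.e. the error of always predicting 0, the mean of the normalized popularity score). A small sketch, reusing the rounded numbers from the outputs above, computes the fraction of baseline error the OLS models remove (an R²-like score):

```python
# Numbers copied (rounded) from the cell outputs above: baseline sum of
# squares vs. OLS residual sum of squares, for all genres and for Rock only.
baseline_all, sse_all = 5005.876, 4814.083
baseline_rock, sse_rock = 527.765, 343.319

# Fraction of the baseline error explained by the model.
score_all = 1 - sse_all / baseline_all
score_rock = 1 - sse_rock / baseline_rock
print("All genres:", score_all)
print("Rock only:", score_rock)
```

By this measure the within-genre (Rock) model removes a noticeably larger share of the baseline error than the all-genres model.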
[x for x in zip(y_hat, rock_y_test)]
[(-0.4643162501709588, -1.5041084651833043), (-0.6034595223125281, -0.5955115530162234), (-0.4492528735597498, -0.24542866811475567), ...]
(output truncated: one (predicted, actual) pair per Rock test track)
from sklearn.tree import DecisionTreeRegressor
model_tree = DecisionTreeRegressor(random_state=42, max_depth=6)
model_tree.fit(X_train, y_train)
y_hat = model_tree.predict(X_test)
# sum of squared errors of the predictions on the test set
dot = (y_hat - y_test)
dot@dot
4853.028343114377
# same model on the rock-only subset
model_tree = DecisionTreeRegressor(random_state=42, max_depth=2)
model_tree.fit(rock_X_train, rock_y_train)
y_hat = model_tree.predict(rock_X_test)
# sum of squared errors on the rock-only test set
dot = (y_hat - rock_y_test)
dot@dot
334.60454642782105
from sklearn.ensemble import RandomForestRegressor
model_rf = RandomForestRegressor(random_state=42, verbose=1, n_jobs=4, n_estimators=50)
model_rf.fit(X_train, y_train)
y_hat = model_rf.predict(X_test)
dot = (y_hat - y_test)
dot@dot
[Parallel(n_jobs=4)]: Using backend ThreadingBackend with 4 concurrent workers. [Parallel(n_jobs=4)]: Done 42 tasks | elapsed: 2.1min [Parallel(n_jobs=4)]: Done 50 out of 50 | elapsed: 2.4min finished [Parallel(n_jobs=4)]: Using backend ThreadingBackend with 4 concurrent workers. [Parallel(n_jobs=4)]: Done 42 tasks | elapsed: 0.1s [Parallel(n_jobs=4)]: Done 50 out of 50 | elapsed: 0.1s finished
4542.99075474934
model_rf = RandomForestRegressor(random_state=42, verbose=1, n_jobs=4, n_estimators=50)
model_rf.fit(rock_X_train, rock_y_train)
y_hat = model_rf.predict(rock_X_test)
dot = (y_hat - rock_y_test)
dot@dot
[Parallel(n_jobs=4)]: Using backend ThreadingBackend with 4 concurrent workers. [Parallel(n_jobs=4)]: Done 42 tasks | elapsed: 11.0s [Parallel(n_jobs=4)]: Done 50 out of 50 | elapsed: 13.0s finished [Parallel(n_jobs=4)]: Using backend ThreadingBackend with 4 concurrent workers. [Parallel(n_jobs=4)]: Done 42 tasks | elapsed: 0.0s [Parallel(n_jobs=4)]: Done 50 out of 50 | elapsed: 0.0s finished
308.1962896787388
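Note that the raw `dot@dot` values above are sums of squared errors, so they are not directly comparable between the full test set and the much smaller rock-only test set. A small helper (a sketch; the sample counts passed below are placeholders, not the actual test-set sizes) shows how such values would be normalized into an RMSE before comparison:

```python
import numpy as np

def rmse_from_sse(sse, n_samples):
    # SSE / n gives the mean squared error; its square root is the RMSE,
    # which is comparable across test sets of different sizes.
    return np.sqrt(sse / n_samples)

# Illustrative only: plug in the actual sizes of X_test and rock_X_test.
print(rmse_from_sse(4542.99, 2000))
print(rmse_from_sse(308.20, 150))
```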
The survey is going to be run through the CitizenScience Zurich platform: https://lab.citizenscience.ch/en/project/339/
To perform the survey, we selected, for each genre, the most significant (in our opinion) music pieces:
We hope that such a composition of samples might help us answer both the question about our regression model's performance and the question of how relevant listen counts are to music popularity.
To keep the number of samples presented to the users to a minimum, we selected only 8 genres, which the dataset authors claim to be the most complete and balanced. This resulted in 32 survey samples in total.
min_max_scaler_q = preprocessing.MinMaxScaler()
feature_flatten_normalized = min_max_scaler_q.fit_transform(features_flatten)
feature_flatten_normalized = pd.DataFrame(feature_flatten_normalized)
feature_flatten_normalized.columns = features_flatten.columns
feature_flatten_normalized.index = features_flatten.index
feature_flatten_normalized.head()
chroma_cens_kurtosis_01 | chroma_cens_kurtosis_02 | chroma_cens_kurtosis_03 | chroma_cens_kurtosis_04 | chroma_cens_kurtosis_05 | chroma_cens_kurtosis_06 | chroma_cens_kurtosis_07 | chroma_cens_kurtosis_08 | chroma_cens_kurtosis_09 | chroma_cens_kurtosis_10 | ... | tonnetz_std_04 | tonnetz_std_05 | tonnetz_std_06 | zcr_kurtosis_01 | zcr_max_01 | zcr_mean_01 | zcr_median_01 | zcr_min_01 | zcr_skew_01 | zcr_std_01 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
track_id | |||||||||||||||||||||
2 | 0.001033 | 0.002501 | 0.001246 | 0.001021 | 0.001067 | 0.000873 | 0.000798 | 0.001470 | 0.003169 | 0.002444 | ... | 0.148243 | 0.130829 | 0.101443 | 0.000915 | 0.454097 | 0.097468 | 0.079564 | 0.000000 | 0.189929 | 0.143937 |
3 | 0.000435 | 0.000941 | 0.001300 | 0.001243 | 0.001121 | 0.000708 | 0.000771 | 0.001047 | 0.002547 | 0.002038 | ... | 0.177944 | 0.152235 | 0.150073 | 0.000568 | 0.461007 | 0.096271 | 0.071390 | 0.000000 | 0.185952 | 0.162532 |
5 | 0.000281 | 0.000649 | 0.000948 | 0.000865 | 0.001210 | 0.000997 | 0.000236 | 0.000535 | 0.003005 | 0.002274 | ... | 0.107253 | 0.135841 | 0.124322 | 0.001040 | 0.368707 | 0.060433 | 0.046322 | 0.000000 | 0.191032 | 0.104805 |
10 | 0.000640 | 0.000574 | 0.002340 | 0.000649 | 0.001031 | 0.001039 | 0.000865 | 0.001189 | 0.002886 | 0.002806 | ... | 0.210157 | 0.192545 | 0.117085 | 0.002773 | 0.446693 | 0.088225 | 0.080109 | 0.000000 | 0.205412 | 0.095226 |
20 | 0.000199 | 0.000606 | 0.001219 | 0.000765 | 0.000845 | 0.000714 | 0.000384 | 0.000564 | 0.002495 | 0.001236 | ... | 0.273332 | 0.241482 | 0.181304 | 0.002208 | 0.464462 | 0.053726 | 0.044687 | 0.003724 | 0.201654 | 0.072089 |
5 rows × 518 columns
features_genre_relevant = feature_flatten_normalized[tracks["track"]['genre_top'] == 'Pop'][df_merged.index[df_merged.index != 'const']]
mult = features_genre_relevant * df_merged['Pop']
pop_score_pred = mult.sum(axis=1)
pop_score_pred[~no_album][tracks["track"]['genre_top'] == 'Pop']
track_id 10 -5.404185 213 -5.742983 821 -5.629483 822 -5.652588 823 -5.532414 ... 154410 -5.358343 154411 -5.563459 154412 -5.206557 154413 -5.894569 154414 -6.035722 Length: 2321, dtype: float64
def n_samples_genre(n, genre, high_score_threshold, low_score_threshold):
    print('Genre : ', genre)
    features_genre_relevant = feature_flatten_normalized[tracks["track"]['genre_top'] == genre][df_merged.index[df_merged.index != 'const']]
    mult = features_genre_relevant * df_merged[genre]
    pop_score_pred = mult.sum(axis=1)
    # The ones that are popular/unpopular based on the dataset
    pop_df_dataset = pop_df["pop_score"][tracks["track"]['genre_top'] == genre][pop_df["pop_score"][tracks["track"]['genre_top'] == genre] > 0]
    unpop_df_dataset = pop_df["pop_score"][tracks["track"]['genre_top'] == genre][pop_df["pop_score"][tracks["track"]['genre_top'] == genre] < 0]
    # The ones that are popular/unpopular based on the regression
    genre_df_pred = pop_score_pred[~no_album][tracks["track"]['genre_top'] == genre]
    low, high = genre_df_pred.quantile([0.25, 0.75])
    pop_df_pred = genre_df_pred.loc[genre_df_pred > high]
    unpop_df_pred = genre_df_pred.loc[genre_df_pred < low]
    # True Positive
    tp = pop_df_dataset.index.intersection(pop_df_pred.index)
    tp = pop_df["pop_score"][tp].nlargest(1)
    print('TP')
    print(tp)
    # False Positive
    fp = unpop_df_dataset.index.intersection(pop_df_pred.index)
    fp = pop_df["pop_score"][fp].nsmallest(1)
    print('FP')
    print(fp)
    # True Negative
    tn = unpop_df_dataset.index.intersection(unpop_df_pred.index)
    tn = pop_df["pop_score"][tn].nsmallest(1)
    print('TN')
    print(tn)
    # False Negative
    fn = pop_df_dataset.index.intersection(unpop_df_pred.index)
    fn = pop_df["pop_score"][fn].nlargest(1)
    print('FN')
    print(fn)
    print("===============")
    return [tp, fp, tn, fn]
samples = []
#genres_survey = tracks["track"]['genre_top'].value_counts().index
genres_survey = ['Electronic', 'Experimental', 'Folk', 'Hip-Hop', 'Instrumental', 'International', 'Pop', 'Rock']
for genre in genres_survey:
    samples.extend(n_samples_genre(1, genre, 0, 0))
Genre | TP (track: score) | FP (track: score) | TN (track: score) | FN (track: score)
---|---|---|---|---
Electronic | 69170: 4.443784 | 82185: -2.565108 | 1432: -2.92627 | 28553: 3.518097
Experimental | 7391: 2.683178 | 155010: -2.841336 | 42956: -3.034765 | 92296: 2.858244
Folk | 50952: 3.00961 | 154623: -2.002789 | 38439: -2.500617 | 88863: 1.738048
Hip-Hop | 24425: 4.373379 | 53996: -1.684636 | 53298: -2.298028 | 10699: 2.444624
Instrumental | 61053: 3.348277 | 28819: -2.115002 | 24360: -2.736748 | 37111: 3.091666
International | 51006: 3.49511 | 149993: -2.126809 | 153476: -2.344313 | 73760: 3.088421
Pop | 62436: 2.997317 | 89489: -2.112164 | 31720: -2.192325 | 62450: 3.002358
Rock | 85789: 3.288215 | 9520: -2.686833 | 62979: -3.056946 | 114396: 2.514736
import shutil
samples = [str(sample.index.values[0]).zfill(6) for sample in samples]
for sample in samples:
    shutil.copyfile(f"{datasource}/fma_large/{sample[0:3]}/{sample}.mp3", f"{datasource}/survey_samples/{sample}.mp3")
For each of the 32 samples selected previously, participants are invited to answer 3 questions after listening to the corresponding 30-second sample:
These questions are meant to reflect different dimensions of popularity: the first is directed toward personal taste and could tell whether one would listen to the song casually; the second adds another layer, characterizing the piece's capacity to reach broad audiences; and the last tries to capture one's perception of popularity beyond personal taste. This helps answer our research question because we defined popularity as how much a song is listened to. With these questions, we can verify whether this assumption matches listeners' perception of popularity.
For each of these questions, answers are collected on a five-level Likert scale with the following options:
Note that in our understanding of popularity, a popular song should receive high scores for the first two questions ("Definitely yes") but a low score for the last question ("Definitely no").
The survey is far from perfect and has some biases: the first two questions focus more on why a piece becomes popular, about which we only make suppositions. Moreover, even if we observe a correlation between actual popularity and positive answers, this would not necessarily mean that one causes the other, and we should keep that in mind.
Only a couple of participants have taken part in this survey so far, hence no conclusions can be drawn yet. However, we already expose the main steps of the analysis pipeline that we intend to use to investigate the results obtained through the survey.
# path to the folders hosting information about the survey
PATH_SURVEY_FOLDER = "./Musicology/"
# general information
TASK_FOLDER_ZIP = "339_music_popularity_task_csv.zip"
# participants' answers
TASK_RUN_FOLDER_ZIP = "339_music_popularity_task_run_csv.zip"
# zip with general information
zf_task = ZipFile(PATH_SURVEY_FOLDER+TASK_FOLDER_ZIP)
# zip with answers
zf_task_run = ZipFile(PATH_SURVEY_FOLDER+TASK_RUN_FOLDER_ZIP)
# in the code, we'll refer to the above-mentioned questions with the following labels
QUESTIONS = ["like", "recommend", "radio"]
Survey answers are identified by a task id corresponding to one of the selected samples. This id refers to the question on the CSZ platform and is different from the track identifier. Thus, we recreate the mapping from task id to track id thanks to the mp3 file associated with each question (hence, task id).
# relink tasks id (samples in the survey) to track ids from fma dataset
df_task = pd.read_csv(zf_task.open('music_popularity_task.csv'))
df_task = df_task[["id", "info_filename", "n_answers"]]
df_task["info_filename"] = df_task["info_filename"].apply(lambda filemp3: int(filemp3[:-4]))
df_task.rename({"info_filename":"track_id"}, axis=1, inplace=True)
# create a dict -> {task id: track id}
dict_task_to_track = pd.Series(df_task.track_id.values,index=df_task.id).to_dict()
The answers provided by the participants are then loaded and manipulated to make them more convenient to handle. Notably, the Likert scale is converted to a linear, equidistant scale ranging from -2 for "Definitely no" to 2 for "Definitely yes".
df_survey = pd.read_csv(zf_task_run.open('music_popularity_task_run.csv'))
df_survey = df_survey[["id", "task_id", "user_id", "user_ip", "info_0", "info_1", "info_2"]]
# convert Likert scale to numerical values
convert_likert_scale = {"Definitely no":-2,
"No":-1,
"I don't know":0,
"Yes":1,
"Definitely yes":2
}
for col in df_survey.columns:
    if 'info' in col:
        df_survey[col] = df_survey[col].map(convert_likert_scale)
# rename column based on the questions asked
rename_columns = {"info_0":QUESTIONS[0], "info_1":QUESTIONS[1], "info_2":QUESTIONS[2]}
df_survey.rename(mapper=rename_columns, inplace=True, axis=1)
# retrieve track id (fma) from task id (CSZ)
df_survey["track_id"] = df_survey["task_id"].map(dict_task_to_track)
# create unique identifier for the user either based on IP or user id given by the platform and anonymize
user_identifiers = np.array(list(set(df_survey.user_id.dropna()))+list(set(df_survey.user_ip.dropna())), dtype=str)
user_anon = np.arange(1,len(user_identifiers)+1, dtype=str)
dict_anon = dict(zip(user_identifiers, user_anon))
df_survey["user"] = [str(uid) if str(uid) != "nan"
else str(uip)
for uid, uip in zip(df_survey.user_id, df_survey.user_ip)
]
df_survey["user"] = df_survey["user"].map(dict_anon)
df_survey.drop(["user_id", "user_ip"], axis=1, inplace=True)
df_survey.sample(5)
As surveys bear intrinsic biases and answers are strongly influenced by each respondent's personality, we compute, for each of the 3 questions, a relative variable that recenters each participant's answers around their mean value.
users_means = df_survey.groupby("user").mean()
users_means
for q in QUESTIONS:
    df_survey["rel_"+q] = [score - users_means.loc[user_id][q]
                           for (score, user_id) in zip(df_survey[q], df_survey["user"])]
df_survey.sample(5)
The DataFrame containing the survey answers is then enriched with FMA information as well as the previously computed popularity features, so that all following analyses can be based solely on this DataFrame.
samples_id = [sample.index[0] for sample in samples]
additional_info = pop_df.loc[samples_id].drop("date_created", axis=1)
# retrieve type of classification (TP,FP,TN,FN) from the way samples were generated
additional_info["type"] = ["tp" if (i%4)==0
else "fp" if (i%4)==1
else "tn" if (i%4)==2
else "fn"
for i in np.arange(len(additional_info))
]
# add these informations in the survey DF
df_survey = df_survey.merge(additional_info, on="track_id")
df_survey.sample(5)
First, we can assess the inter-rater agreement thanks to several methods developed during the second half of the twentieth century that are pre-implemented in nltk's metrics package. These first measures can bring information regarding the shared (or not) behaviors or perceptions of the participant pool (e.g. a high agreement value, close to 1, for the radio question could indicate that the perception of what kind of music pieces are played on the radio is shared among participants).
Metrics computed below are Cohen's $\kappa$ (Cohen, 1960), Fleiss' $\kappa$ (Fleiss, 1971) and Krippendorff's $\alpha$ (Krippendorff, 1989).
These measures range from -1 to 1, and it is commonly accepted (see the Minitab online documentation) that $\alpha<0$ indicates more disagreement than could be expected by chance, $\alpha=0$ indicates agreement equal to what could be expected by chance, and agreement can be deemed substantial above $\alpha=0.6$.
participants_list = list(set(df_survey.user))
# should be unique so sum is just the score given by the rater
survey_grouped_user_track = df_survey.groupby(["user", "track_id"]).sum()
def dist_labels(l1, l2):
    """ define distance between labels (equidistant and symmetric) """
    return np.abs(l1-l2)
for q in QUESTIONS+["rel_"+q for q in QUESTIONS]:
    print("===============")
    print("Feature : ", q)
    taskdata = []
    for i, participant in enumerate(participants_list):
        # convert data to nltk required format
        taskdata = taskdata + [[i,
                                str(samples_id[j]),
                                survey_grouped_user_track.loc[participant][q].loc[samples_id[j]]]
                               for j in range(0, len(samples_id))]
    ratingtask = agreement.AnnotationTask(data=taskdata, distance=dist_labels)
    print("Agreement measures:")
    print("kappa : {0:.5g}".format(ratingtask.kappa()))
    print("fleiss : {0:.5g}".format(ratingtask.multi_kappa()))
    print("alpha : {0:.5g}".format(ratingtask.alpha()))
    #print("scotts : {0:.5g}".format(ratingtask.pi()))
Feature | kappa | fleiss | alpha | scotts
---|---|---|---|---
like | -0.277 | -0.277 | 0.30859 | -0.34988
recommend | -0.04474 | -0.04474 | 0.41244 | -0.076636
radio | -0.44934 | -0.44934 | 0.18294 | -0.53342
rel_like | 0.078125 | 0.078125 | 0.39393 | -0.058296
rel_recommend | 0.080078 | 0.080078 | 0.36065 | -0.045505
rel_radio | -0.1875 | -0.1875 | 0.19036 | -0.37324
Then, correlations are studied. We look for possibly significant (based on $p$-values) correlations between the average scores given by the participants for the different questions, but also between these human features and the features used to define popularity (number of streams, likes, and comments for song and album) and the popularity score itself.
def corr_sig(df):
    """ return matrix of p values of df based on its correlation matrix """
    p_matrix = np.zeros(shape=(df.shape[1], df.shape[1]))
    for col in df.columns:
        for col2 in df.drop(col, axis=1).columns:
            _, p = stats.pearsonr(df[col], df[col2])
            p_matrix[df.columns.to_list().index(col), df.columns.to_list().index(col2)] = p
    return p_matrix
df_survey_pop = df_survey[QUESTIONS
+["rel_"+q for q in QUESTIONS]
+POP_FEATURES
+["album_"+f for f in POP_FEATURES]
+["pop_score"]
]
corr_survey = df_survey_pop.corr()
p_values = corr_sig(df_survey_pop) # compute p-values
mask_sig = np.invert(np.tril(p_values<0.05)) # mask non significant correlations
fig, ax = plt.subplots(figsize=(15,12))
ax = sns.heatmap(corr_survey
#, mask=mask_sig # mask non significant values
, square=True
, annot=True
, fmt=".2f"
, vmin=-1
, vmax=1
, cmap="RdBu"
)
plt.show()
The differences between the four selected groups (true positive, false positive, false negative and true negative) can also be explored. This could yield interesting insights into how humans perceive the samples classified as popular, and possibly into what kind of samples are misclassified. For example, do raters give True Positive cases high liking and recommending scores, with little surprise at their potential to be played on the radio? Also, are False Negative (Positive) cases rated by participants more similarly to True Positives or to True Negatives?
for category in QUESTIONS+["rel_"+q for q in QUESTIONS]:
    plt.figure(figsize=(10,5))
    ax = sns.barplot(x="type", y=category, data=df_survey)
We will use the approach presented in the preprint paper: Haunschmid et al. (2020), "audioLIME: Listenable Explanations Using Source Separation".
This approach allows us to generate "listenable" explanations of the music, i.e. which parts of a music piece increase and which decrease the assigned score. We used the original dataset creators' code to compute the music parameters from raw data. The method is still being fine-tuned to get the best performance, therefore we present only preliminary results (and listenable explanations) in our analysis.
# Feature computation code cloned from FMA repository (features.py)
def columns():
    feature_sizes = dict(chroma_stft=12, chroma_cqt=12, chroma_cens=12,
                         tonnetz=6, mfcc=20, rmse=1, zcr=1,
                         spectral_centroid=1, spectral_bandwidth=1,
                         spectral_contrast=7, spectral_rolloff=1)
    moments = ('mean', 'std', 'skew', 'kurtosis', 'median', 'min', 'max')
    columns = []
    for name, size in feature_sizes.items():
        for moment in moments:
            it = ((name, moment, '{:02d}'.format(i+1)) for i in range(size))
            columns.extend(it)
    names = ('feature', 'statistics', 'number')
    columns = pd.MultiIndex.from_tuples(columns, names=names)
    # More efficient to slice if indexes are sorted.
    return columns.sort_values()
def compute_features(x):
    features = pd.Series(index=columns(), dtype=np.float32)

    def feature_stats(name, values):
        features[name, 'mean'] = np.mean(values, axis=1)
        features[name, 'std'] = np.std(values, axis=1)
        features[name, 'skew'] = stats.skew(values, axis=1)
        features[name, 'kurtosis'] = stats.kurtosis(values, axis=1)
        features[name, 'median'] = np.median(values, axis=1)
        features[name, 'min'] = np.min(values, axis=1)
        features[name, 'max'] = np.max(values, axis=1)

    sr = 44100
    x = librosa.to_mono(x)
    f = librosa.feature.zero_crossing_rate(x, frame_length=2048, hop_length=512)
    feature_stats('zcr', f)
    cqt = np.abs(librosa.cqt(x, sr=sr, hop_length=512, bins_per_octave=12,
                             n_bins=7*12, tuning=None))
    assert cqt.shape[0] == 7 * 12
    assert np.ceil(len(x)/512) <= cqt.shape[1] <= np.ceil(len(x)/512)+1
    f = librosa.feature.chroma_cqt(C=cqt, n_chroma=12, n_octaves=7)
    feature_stats('chroma_cqt', f)
    f = librosa.feature.chroma_cens(C=cqt, n_chroma=12, n_octaves=7)
    feature_stats('chroma_cens', f)
    f = librosa.feature.tonnetz(chroma=f)
    feature_stats('tonnetz', f)
    del cqt
    stft = np.abs(librosa.stft(x, n_fft=2048, hop_length=512))
    assert stft.shape[0] == 1 + 2048 // 2
    assert np.ceil(len(x)/512) <= stft.shape[1] <= np.ceil(len(x)/512)+1
    del x
    f = librosa.feature.chroma_stft(S=stft**2, n_chroma=12)
    feature_stats('chroma_stft', f)
    f = librosa.feature.rms(S=stft)
    feature_stats('rmse', f)
    f = librosa.feature.spectral_centroid(S=stft)
    feature_stats('spectral_centroid', f)
    f = librosa.feature.spectral_bandwidth(S=stft)
    feature_stats('spectral_bandwidth', f)
    f = librosa.feature.spectral_contrast(S=stft, n_bands=6)
    feature_stats('spectral_contrast', f)
    f = librosa.feature.spectral_rolloff(S=stft)
    feature_stats('spectral_rolloff', f)
    mel = librosa.feature.melspectrogram(sr=sr, S=stft**2)
    del stft
    f = librosa.feature.mfcc(S=librosa.power_to_db(mel), n_mfcc=20)
    feature_stats('mfcc', f)
    return features
# p: minimum deviation from the distribution mean which changes the classification
# outcome. A higher value gives better results but requires many more samples to be
# processed in order to find distortions which "flip" the class.
p = 0.25
def analysis_thread(x):
    feat_comp = pd.DataFrame([compute_features(x)])
    feat_comp.columns = ['_'.join(col) for col in feat_comp.columns.values]
    feat_comp = feat_comp[selected_features].values[0]
    feat_comp_scaled = min_max_scaler.transform(feat_comp.reshape(1, -1))
    feat_comp_scaled = pd.DataFrame(feat_comp_scaled)
    feat_comp_scaled.insert(0, 'const', 1.0)
    score = results.predict(feat_comp_scaled).values[0]
    #return np.array([score])
    if score > p:  # Popular
        return np.array([0, 0, 1])
    elif score < -p:  # Unpopular
        return np.array([1, 0, 0])
    else:  # Average
        return np.array([0, 1, 0])
def predict_fn(xs):
    res = pqdm(xs, analysis_thread, n_jobs=4)
    pbar.update(xs.shape[0])
    return np.array(res)
pbar = tqdm(total=2)
x_local, sr = torchaudio.load(f"{datasource}/fma_large/050/050952.mp3")
print(predict_fn(np.array([x_local.numpy(), x_local.numpy()])))
pbar.close()
[[0 1 0] [0 1 0]]
# Fixing some code by forking our own classes out of the audioLIME code
def separate(separator, waveform, target_sr, spleeter_sr):
    waveform = librosa.resample(waveform, target_sr, spleeter_sr)
    waveform = np.expand_dims(waveform, axis=1)
    prediction = separator.separate(waveform)
    return prediction
class FMAAudioProvider(DataProvider):
    def __init__(self, audio_path, target_sr=44100):
        self.target_sr = target_sr
        super().__init__(audio_path)

    def initialize_mix(self):
        waveform, _ = torchaudio.load(self._audio_path)
        return waveform.mean(axis=0).numpy()
class FMAFactorization(DataBasedFactorization):
    def __init__(self, data_provider, n_temporal_segments, composition_fn, model_name,
                 spleeter_sources_path=None, target_sr=44100):
        self.model_name = model_name
        self.target_sr = target_sr
        sample_name = os.path.basename(data_provider.get_audio_path().replace(".mp3", ""))
        if spleeter_sources_path is not None:
            self.sources_path = os.path.join(spleeter_sources_path,
                                             model_name.replace("spleeter:", ""), sample_name)
        else:
            self.sources_path = None
        super().__init__(data_provider, n_temporal_segments, composition_fn)

    def initialize_components(self):
        spleeter_sr = 44100
        prediction_path = None
        if self.sources_path is not None:
            prediction_path = os.path.join(self.sources_path, "prediction.pt")
        if prediction_path is not None and os.path.exists(prediction_path):
            print("loading {} ...".format(prediction_path))
            prediction = pickle.load(open(prediction_path, "rb"))
        else:
            waveform = self.data_provider.get_mix()
            separator = Separator(self.model_name, multiprocess=False)
            prediction = separate(separator, waveform, self.target_sr, spleeter_sr)
            if prediction_path is not None:  # need to store
                if not os.path.exists(self.sources_path):
                    os.mkdir(self.sources_path)
                pickle.dump(prediction, open(prediction_path, "wb"))
        self.original_components = [
            librosa.resample(np.mean(prediction[key], axis=1), spleeter_sr, self.target_sr)
            for key in prediction]
        self._components_names = list(prediction.keys())
# Number of distortion tries, the more the better
num_samples = 128
# ID of the audio sample
audio_id = 50952
pbar = tqdm(total=num_samples)
audio_str = str(audio_id).zfill(6)
audio_path = f"{datasource}/fma_large/{audio_str[:3]}/{audio_str}.mp3"
data_provider = FMAAudioProvider(audio_path)
# We split into 3s samples
spleeter_factorization = FMAFactorization(data_provider,
n_temporal_segments=10,
composition_fn=None,
model_name='spleeter:5stems',
target_sr=44100)
explainer = lime_audio.LimeAudioExplainer(verbose=True, absolute_feature_sort=False)
explanation = explainer.explain_instance(factorization=spleeter_factorization,
predict_fn=predict_fn,
labels=["unpopular", "average", "popular"],
top_labels=3,
num_samples=num_samples,
batch_size=64
)
pbar.close()
INFO:tensorflow:Apply unet for vocals_spectrogram INFO:tensorflow:Apply unet for piano_spectrogram INFO:tensorflow:Apply unet for drums_spectrogram INFO:tensorflow:Apply unet for bass_spectrogram INFO:tensorflow:Apply unet for other_spectrogram INFO:tensorflow:Restoring parameters from pretrained_models/5stems/model
/home/mbien/.local/lib/python3.8/site-packages/audioLIME/factorization.py:84: UserWarning: last 8 samples are ignored warnings.warn("last {} samples are ignored".format(audio_length - explained_length))
labels = list(explanation.local_exp.keys())
rlabel = ["average", "unpopular", "popular"]
for label in labels:
    print(label)
    top_components, component_indeces = explanation.get_sorted_components(label,
                                                                          positive_components=True,
                                                                          negative_components=False,
                                                                          num_components=30,
                                                                          return_indeces=True)
    print(f"Label {rlabel[label]}: {len(top_components)} components found")
1 Label unpopular: 23 components found 2 Label popular: 0 components found 0 Label average: 27 components found
# recreate the output folder
!rm -rf output
!mkdir -p output
for label in labels:
    top_components, component_indeces = explanation.get_sorted_components(label,
                                                                          positive_components=True,
                                                                          negative_components=False,
                                                                          num_components=5,
                                                                          return_indeces=True)
    #print(top_components)
    if len(top_components) > 0:
        sf.write(os.path.join("output", f"explanation_{rlabel[label]}.wav"), sum(top_components), 44100)
sf.write(os.path.join("output", "original.wav"), spleeter_factorization.data_provider.get_mix(), 44100)
# play the original song
exp, sr = librosa.load("output/original.wav", sr=None, mono=True)
display.Audio(data=exp, rate=sr)
# play the "unpopular" explanation
exp, sr = librosa.load("output/explanation_unpopular.wav", sr=None, mono=True)
display.Audio(data=exp, rate=sr)
# play the "average" explanation
exp, sr = librosa.load("output/explanation_average.wav", sr=None, mono=True)
display.Audio(data=exp, rate=sr)
# # play the "popular" explanation (no components were found for this label)
# exp, sr = librosa.load("output/explanation_popular.wav", sr=None, mono=True)
# display.Audio(data=exp, rate=sr)
To run this code, you must have a downsampled version of the fma_large dataset, which can be created using the code from milestone3-wav2vec2.ipynb.
feature_extractor = Wav2Vec2FeatureExtractor(feature_size=1, sampling_rate=16000, padding_value=0.0, max_length=480000)
min_train = -3.3939661205367715
max_train = 5.098214985136099
class PopularityDataset(torch.utils.data.Dataset):
    def __init__(self, dataframe):
        self.dataframe = dataframe

    def __len__(self):
        return len(self.dataframe)

    def __getitem__(self, index):
        row = self.dataframe.iloc[index]
        track_id = str(int(row.track_id)).zfill(6)
        audio_sample, _ = torchaudio.load(f"{datasource}/musicology-dataset-downsampled/{track_id}.mp3")
        audio_sample = torch.mean(audio_sample, dim=0, keepdim=True)
        audio_sample = audio_sample.squeeze().numpy()
        audio_sample = feature_extractor(audio_sample, sampling_rate=16000, return_tensors="pt", padding='max_length', max_length=480000).input_values
        #score = (row.pop_score - min_train)/(max_train-min_train)
        return {
            "input_ids": audio_sample.flatten()[:480000],
            "labels": torch.tensor(row.pop_score).float(),
        }
# Ensure we are using the same split as the one used for finetuning
test = pd.read_csv("https://mbien-public.s3.eu-central-1.amazonaws.com/dh-401/pop_test.csv")
test_ds = PopularityDataset(test[:10])
test.head()
track_id | pop_score | |
---|---|---|
0 | 147246 | 0.361502 |
1 | 15577 | -0.022418 |
2 | 127407 | -0.745016 |
3 | 131643 | 1.275226 |
4 | 116637 | -0.683588 |
The only difference compared to the training code is that Dropout is disabled.
class Wav2Vec2ForAudioClassification(Wav2Vec2PreTrainedModel):
    def __init__(self, config):
        super().__init__(config)
        self.wav2vec2 = Wav2Vec2Model(config)
        self.dropout = nn.Dropout(0)
        self.classifier = nn.Linear(768, 1)
        self.init_weights()

    def forward(self, input_ids):
        outputs = self.wav2vec2(
            input_ids,
            output_attentions=True,
            output_hidden_states=True
        )
        pooled_output = outputs.last_hidden_state[:, 0, :]
        return self.classifier(pooled_output)
model = Wav2Vec2ForAudioClassification.from_pretrained("mbien/fma2vec2popularity").to("cuda")
test_dl = DataLoader(test_ds)
model.eval()
for sample in test_dl:
    out = model(sample["input_ids"].to("cuda"))
    print("Original:", sample["labels"].item(), "Predicted:", out)
Original: 0.3615018427371979 Predicted: tensor([[-0.1074]], device='cuda:0', grad_fn=<AddmmBackward>)
Original: -0.02241761051118374 Predicted: tensor([[-0.1074]], device='cuda:0', grad_fn=<AddmmBackward>)
Original: -0.7450163960456848 Predicted: tensor([[-0.1074]], device='cuda:0', grad_fn=<AddmmBackward>)
Original: 1.275226354598999 Predicted: tensor([[-0.1074]], device='cuda:0', grad_fn=<AddmmBackward>)
Original: -0.6835883855819702 Predicted: tensor([[-0.1074]], device='cuda:0', grad_fn=<AddmmBackward>)
Original: 0.2886952757835388 Predicted: tensor([[-0.1074]], device='cuda:0', grad_fn=<AddmmBackward>)
Original: -1.2600702047348022 Predicted: tensor([[-0.1074]], device='cuda:0', grad_fn=<AddmmBackward>)
Original: -0.020840361714363098 Predicted: tensor([[-0.1074]], device='cuda:0', grad_fn=<AddmmBackward>)
Original: -0.7316892147064209 Predicted: tensor([[-0.1074]], device='cuda:0', grad_fn=<AddmmBackward>)
Original: 1.057540774345398 Predicted: tensor([[-0.1074]], device='cuda:0', grad_fn=<AddmmBackward>)
We also tested:
None of these options allowed us to fine-tune the model to above-average performance.
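The constant tensor in the output above shows that the fine-tuned model collapsed to a single prediction. A quick sanity check (using the ten scores printed above) makes precise what "above average" would mean here: a constant predictor can at best match the error of predicting the sample mean.

```python
import numpy as np

# Ground-truth scores and the constant prediction, copied (rounded) from the output above.
y_true = np.array([0.3615, -0.0224, -0.7450, 1.2752, -0.6836,
                   0.2887, -1.2601, -0.0208, -0.7317, 1.0575])
y_pred = np.full_like(y_true, -0.1074)

mse_model = np.mean((y_true - y_pred) ** 2)
mse_mean = np.mean((y_true - y_true.mean()) ** 2)
# Any constant c satisfies MSE(c) = Var(y) + (mean(y) - c)^2 >= MSE(mean).
print(mse_model, mse_mean)
```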