Thanks to the excellent work of Rob Mitchum, I have a database covering the summer 2015 festival circuit (in the United States), with a rank value (how high on the bill) for each artist. With this, I will attempt to construct a concept of festival value over replacement, that is, how valuable a particular band is to a festival, inspired by the Wins Above Replacement formulas popularized by Baseball Prospectus and FanGraphs.
# First, we need the data.
import pandas as pd
festival_df = pd.read_csv('2015 Festival Power Rankings - Sheet1.csv', skiprows=[1])
festival_df.drop(festival_df.columns[[1,2,3]],axis=1, inplace=True) # Drop some fields we don't want
festival_df.head()
| | ARTIST | COACHELLA | JAZZFEST | SUNFEST | SHAKYKNEES | HANGOUT | BOSCALLING | BUNBURY | GOVBALL | BONNAROO | FIREFLY | FORECASTLE | PITCHFORK | LOLLA | OUTSIDELDS | BUMBERSHT | ACL | OSHEAGA | SASQUATCH |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | My Morning Jacket | NaN | NaN | NaN | NaN | 5 | 3 | NaN | 6 | 7 | NaN | 2 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
1 | Tame Impala | 2 | NaN | NaN | 8 | NaN | 6 | 6 | 13 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2 | Run The Jewels | 16 | NaN | NaN | NaN | NaN | 15 | NaN | NaN | 33 | 22 | NaN | 7 | NaN | NaN | NaN | NaN | NaN | NaN |
3 | Odesza | 13 | NaN | NaN | NaN | 23 | NaN | NaN | 35 | 45 | 27 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
4 | Sturgill Simpson | 41 | NaN | NaN | NaN | NaN | NaN | NaN | 30 | 37 | 31 | 18 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
5 rows × 19 columns
The section below goes over what the dataset contains. Each value represents the line of the festival poster on which the band appears; for example, Tame Impala at Coachella has a value of 2, meaning the second line of the Coachella poster. NaN means the band did not play that festival. We can use the poster line as a proxy for value: bands that are higher up (lower numbers) are perceived as more valuable by the promoter.
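To make the encoding concrete, here is a minimal sketch on a toy table. The values are copied from the rows displayed above, restricted to three festivals; the names `toy_df`, `best_line`, and `appearances` are hypothetical and not part of the original notebook.

```python
import numpy as np
import pandas as pd

# Toy subset of the table above (values copied from the displayed rows).
toy_df = pd.DataFrame(
    {
        "ARTIST": ["My Morning Jacket", "Tame Impala", "Run The Jewels"],
        "COACHELLA": [np.nan, 2, 16],
        "HANGOUT": [5, np.nan, np.nan],
        "BOSCALLING": [3, 6, 15],
    }
).set_index("ARTIST")

# Poster line for one artist at one festival (NaN would mean "not booked").
tame_coachella = toy_df.loc["Tame Impala", "COACHELLA"]

# Lower numbers mean higher billing, so the minimum is the best slot.
best_line = toy_df.min(axis=1)            # best (lowest) poster line per artist
appearances = toy_df.notna().sum(axis=1)  # festivals played, within this subset

print(tame_coachella)
print(best_line)
print(appearances)
```

Note that `min` and `notna` skip or count the NaN cells explicitly, so missing bookings are never mistaken for a rank.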
%matplotlib inline
mod_df = festival_df.set_index(["ARTIST"]).transpose()
mod_df.describe()
ARTIST | My Morning Jacket | Tame Impala | Run The Jewels | Odesza | Sturgill Simpson | Vance Joy | Mac DeMarco | Jungle | Royal Blood | MØ | Benjamin Booker | Florence + the Machine | Ryan Adams | Brand New | St. Vincent | Hozier | TV On The Radio | Spoon | Marina and the Diamonds | War on Drugs | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
count | 5.000000 | 5 | 5.000000 | 5.000000 | 5.000000 | 4.000000 | 4.00000 | 4.00000 | 4.000000 | 4.000000 | 4.000000 | 3.000000 | 3 | 3.000000 | 3.000000 | 3.000000 | 3.0 | 3.000000 | 3.000000 | 3.000000 | ... |
mean | 4.600000 | 7 | 18.600000 | 28.600000 | 31.400000 | 23.250000 | 31.25000 | 35.25000 | 37.750000 | 40.750000 | 50.000000 | 3.666667 | 6 | 7.666667 | 9.333333 | 10.000000 | 11.0 | 13.000000 | 15.000000 | 15.666667 | ... |
std | 2.073644 | 4 | 9.659193 | 12.116105 | 8.734987 | 11.146748 | 17.23127 | 18.35529 | 21.914607 | 16.090888 | 7.788881 | 1.527525 | 2 | 3.214550 | 4.932883 | 2.645751 | 3.0 | 4.358899 | 12.165525 | 1.527525 | ... |
min | 2.000000 | 2 | 7.000000 | 13.000000 | 18.000000 | 10.000000 | 11.00000 | 18.00000 | 17.000000 | 19.000000 | 41.000000 | 2.000000 | 4 | 4.000000 | 6.000000 | 7.000000 | 8.0 | 8.000000 | 7.000000 | 14.000000 | ... |
25% | 3.000000 | 6 | 15.000000 | 23.000000 | 30.000000 | 18.250000 | 20.75000 | 22.50000 | 25.250000 | 34.750000 | 47.000000 | 3.000000 | 5 | 6.500000 | 6.500000 | 9.000000 | 9.5 | 11.500000 | 8.000000 | 15.000000 | ... |
50% | 5.000000 | 6 | 16.000000 | 27.000000 | 31.000000 | 23.000000 | 32.00000 | 32.00000 | 33.000000 | 43.500000 | 49.500000 | 4.000000 | 6 | 9.000000 | 7.000000 | 11.000000 | 11.0 | 15.000000 | 9.000000 | 16.000000 | ... |
75% | 6.000000 | 8 | 22.000000 | 35.000000 | 37.000000 | 28.000000 | 42.50000 | 44.75000 | 45.500000 | 49.500000 | 52.500000 | 4.500000 | 7 | 9.500000 | 11.000000 | 11.500000 | 12.5 | 15.500000 | 19.000000 | 16.500000 | ... |
max | 7.000000 | 13 | 33.000000 | 45.000000 | 41.000000 | 37.000000 | 50.00000 | 59.00000 | 68.000000 | 57.000000 | 60.000000 | 5.000000 | 8 | 10.000000 | 15.000000 | 12.000000 | 14.0 | 16.000000 | 29.000000 | 17.000000 | ... |
8 rows × 529 columns
mod_df.fillna(value=0).transpose()['COACHELLA'].value_counts()
0     367
14      3
26      3
25      3
24      3
23      3
22      3
21      3
20      3
19      3
18      3
17      3
16      3
15      3
13      3
28      3
12      3
11      3
10      3
9       3
8       3
7       3
6       3
5       3
4       3
3       3
2       3
1       3
27      3
29      3
44      3
40      3
49      3
48      3
47      3
46      3
45      3
30      3
43      3
41      3
42      3
39      3
38      3
37      3
36      3
35      3
34      3
33      3
32      3
31      3
50      2
51      2
52      2
53      2
54      2
56      1
57      1
58      1
55      1
59      1
Length: 60, dtype: int64
# Histogram of Appearances per band
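One way to tabulate appearances per band, as a minimal sketch rather than the notebook's own code: count the non-missing festival cells for each artist, then count how many artists share each total. The helper name `appearance_histogram` and the toy frame are assumptions for illustration; the toy values are copied from the rows displayed earlier, restricted to four festivals.

```python
import numpy as np
import pandas as pd

def appearance_histogram(df):
    """Map 'number of festivals played' to 'number of artists' (hypothetical helper)."""
    ranks = df.set_index("ARTIST")
    per_artist = ranks.notna().sum(axis=1)  # non-NaN cells = festivals played
    return per_artist.value_counts().sort_index()

# Toy stand-in for festival_df, using a few columns from the table above.
toy_df = pd.DataFrame(
    {
        "ARTIST": ["My Morning Jacket", "Tame Impala", "Run The Jewels", "Odesza"],
        "COACHELLA": [np.nan, 2, 16, 13],
        "HANGOUT": [5, np.nan, np.nan, 23],
        "BOSCALLING": [3, 6, 15, np.nan],
        "GOVBALL": [6, 13, np.nan, 35],
    }
)

hist = appearance_histogram(toy_df)
print(hist)
```

In the notebook itself, something like `appearance_histogram(festival_df).plot(kind="bar")` should draw the chart inline, assuming matplotlib is installed and `festival_df` still has its `ARTIST` column.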
Using the information gathered above, we can express the value of a band to a festival