In October 2015, a data journalist named Walt Hickey analyzed movie ratings data and found strong evidence to suggest that an online movie ratings aggregator called Fandango was biased and dishonest. He published his analysis in this article.
Fandango displays a 5-star rating system on their website, where the minimum rating is 0 stars and the maximum is 5 stars as indicated in the picture below.
Hickey found a significant discrepancy between the number of stars displayed to users and the actual rating, which he was able to find in the HTML of the page, as displayed in the distributions below.
The goal of this project is to analyze more recent movie ratings data to determine whether there has been any change in Fandango's rating system after Hickey's analysis. We want to be able to tell whether the Fandango official's statement that the biased rounding was caused by a bug in their system was true, or whether they were just trying to cover up the truth.
One of the best ways to find out whether there has been any change in Fandango's movie ratings after Hickey's analysis is to compare the published ratings before and after the article. To do that, we will use the following datasets, which are already available to us.
# Importing the relevant libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from numpy import arange
%matplotlib inline
We will read in the two datasets (fandango_score_comparison.csv and movie_ratings_16_17.csv) and explore them briefly to understand their structure.
# Dataset from before Hickey's article
Before_article = pd.read_csv('fandango_score_comparison.csv')
Before_article.info() #understanding the dataset's column content
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 146 entries, 0 to 145
Data columns (total 22 columns):
FILM                          146 non-null object
RottenTomatoes                146 non-null int64
RottenTomatoes_User           146 non-null int64
Metacritic                    146 non-null int64
Metacritic_User               146 non-null float64
IMDB                          146 non-null float64
Fandango_Stars                146 non-null float64
Fandango_Ratingvalue          146 non-null float64
RT_norm                       146 non-null float64
RT_user_norm                  146 non-null float64
Metacritic_norm               146 non-null float64
Metacritic_user_nom           146 non-null float64
IMDB_norm                     146 non-null float64
RT_norm_round                 146 non-null float64
RT_user_norm_round            146 non-null float64
Metacritic_norm_round         146 non-null float64
Metacritic_user_norm_round    146 non-null float64
IMDB_norm_round               146 non-null float64
Metacritic_user_vote_count    146 non-null int64
IMDB_user_vote_count          146 non-null int64
Fandango_votes                146 non-null int64
Fandango_Difference           146 non-null float64
dtypes: float64(15), int64(6), object(1)
memory usage: 25.2+ KB
Before_article.head(3)
| | FILM | RottenTomatoes | RottenTomatoes_User | Metacritic | Metacritic_User | IMDB | Fandango_Stars | Fandango_Ratingvalue | RT_norm | RT_user_norm | ... | IMDB_norm | RT_norm_round | RT_user_norm_round | Metacritic_norm_round | Metacritic_user_norm_round | IMDB_norm_round | Metacritic_user_vote_count | IMDB_user_vote_count | Fandango_votes | Fandango_Difference |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Avengers: Age of Ultron (2015) | 74 | 86 | 66 | 7.1 | 7.8 | 5.0 | 4.5 | 3.70 | 4.3 | ... | 3.90 | 3.5 | 4.5 | 3.5 | 3.5 | 4.0 | 1330 | 271107 | 14846 | 0.5 |
1 | Cinderella (2015) | 85 | 80 | 67 | 7.5 | 7.1 | 5.0 | 4.5 | 4.25 | 4.0 | ... | 3.55 | 4.5 | 4.0 | 3.5 | 4.0 | 3.5 | 249 | 65709 | 12640 | 0.5 |
2 | Ant-Man (2015) | 80 | 90 | 64 | 8.1 | 7.8 | 5.0 | 4.5 | 4.00 | 4.5 | ... | 3.90 | 4.0 | 4.5 | 3.0 | 4.0 | 4.0 | 627 | 103660 | 12055 | 0.5 |
3 rows × 22 columns
# Dataset from after Hickey's article
After_article = pd.read_csv('movie_ratings_16_17.csv')
After_article.info() #understanding the dataset's column content
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 214 entries, 0 to 213
Data columns (total 15 columns):
movie           214 non-null object
year            214 non-null int64
metascore       214 non-null int64
imdb            214 non-null float64
tmeter          214 non-null int64
audience        214 non-null int64
fandango        214 non-null float64
n_metascore     214 non-null float64
n_imdb          214 non-null float64
n_tmeter        214 non-null float64
n_audience      214 non-null float64
nr_metascore    214 non-null float64
nr_imdb         214 non-null float64
nr_tmeter       214 non-null float64
nr_audience     214 non-null float64
dtypes: float64(10), int64(4), object(1)
memory usage: 25.2+ KB
After_article.head(3)
| | movie | year | metascore | imdb | tmeter | audience | fandango | n_metascore | n_imdb | n_tmeter | n_audience | nr_metascore | nr_imdb | nr_tmeter | nr_audience |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 10 Cloverfield Lane | 2016 | 76 | 7.2 | 90 | 79 | 3.5 | 3.80 | 3.60 | 4.5 | 3.95 | 4.0 | 3.5 | 4.5 | 4.0 |
1 | 13 Hours | 2016 | 48 | 7.3 | 50 | 83 | 4.5 | 2.40 | 3.65 | 2.5 | 4.15 | 2.5 | 3.5 | 2.5 | 4.0 |
2 | A Cure for Wellness | 2016 | 47 | 6.6 | 40 | 47 | 3.0 | 2.35 | 3.30 | 2.0 | 2.35 | 2.5 | 3.5 | 2.0 | 2.5 |
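The `n_` and `nr_` columns look like the other sites' scores rescaled to Fandango's 0-5 scale and then rounded to the nearest half star. The sketch below checks that reading against the three rows printed above. The rescaling rule (divide 100-point scores by 20 and the 10-point IMDB score by 2) and the rounding rule are assumptions inferred from those rows, not something the dataset documents here.

```python
import pandas as pd

# Raw scores from the three rows shown above.
raw = pd.DataFrame({
    'metascore': [76, 48, 47],
    'imdb': [7.2, 7.3, 6.6],
    'tmeter': [90, 50, 40],
    'audience': [79, 83, 47],
})

# Assumed rescaling to a 0-5 scale: 100-point scores / 20, 10-point scores / 2.
normalized = pd.DataFrame({
    'n_metascore': raw['metascore'] / 20,
    'n_imdb': raw['imdb'] / 2,
    'n_tmeter': raw['tmeter'] / 20,
    'n_audience': raw['audience'] / 20,
})

# Assumed rounding to the nearest half star: double, round, halve.
rounded = (normalized * 2).round() / 2
rounded.columns = [name.replace('n_', 'nr_', 1) for name in normalized.columns]

print(normalized)
print(rounded)
```

If the printed values match the `n_` and `nr_` columns in the table, the same two rules can be used to put any other site's score on Fandango's scale.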
Since we are interested in Fandango's movie ratings, we will only keep the columns in each dataset that carry that information.
Fandango_before = Before_article[['FILM', 'Fandango_Stars', 'Fandango_Ratingvalue',
'Fandango_votes','Fandango_Difference']].copy()
Fandango_before.head(3)
| | FILM | Fandango_Stars | Fandango_Ratingvalue | Fandango_votes | Fandango_Difference |
---|---|---|---|---|---|
0 | Avengers: Age of Ultron (2015) | 5.0 | 4.5 | 14846 | 0.5 |
1 | Cinderella (2015) | 5.0 | 4.5 | 12640 | 0.5 |
2 | Ant-Man (2015) | 5.0 | 4.5 | 12055 | 0.5 |
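The discrepancy Hickey described can be read directly from these columns. A minimal sketch, using only the three rows shown above and assuming that `Fandango_Difference` is simply the displayed stars minus the rating value found in the page's HTML:

```python
import pandas as pd

# The three rows printed above (not the full dataset).
sample = pd.DataFrame({
    'FILM': ['Avengers: Age of Ultron (2015)', 'Cinderella (2015)', 'Ant-Man (2015)'],
    'Fandango_Stars': [5.0, 5.0, 5.0],
    'Fandango_Ratingvalue': [4.5, 4.5, 4.5],
    'Fandango_Difference': [0.5, 0.5, 0.5],
})

# Gap between what users see and the underlying rating.
gap = sample['Fandango_Stars'] - sample['Fandango_Ratingvalue']

# True if the stored difference column agrees with the recomputed gap.
matches = (gap - sample['Fandango_Difference']).abs().lt(1e-9).all()
print(gap.tolist(), matches)
```

On the full dataset the same subtraction could be applied to `Fandango_before` to see how often, and by how much, the displayed stars exceed the rating value.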
For the dataset collected after the article, we will select the following columns:
Fandango_after = After_article[['movie', 'year','fandango']].copy()
Fandango_after.head(3)
| | movie | year | fandango |
---|---|---|---|
0 | 10 Cloverfield Lane | 2016 | 3.5 |
1 | 13 Hours | 2016 | 4.5 |
2 | A Cure for Wellness | 2016 | 3.0 |
Our goal is to determine whether there has been any change in Fandango's rating system after Hickey's analysis. The population of interest is composed of all movie ratings stored on Fandango's website, irrespective of the year the movie was released.
Based on our goal, we are interested in sampling the population at two different time periods: before Walt Hickey's analysis and after it. To draw a correct conclusion about the entire population, we need to make sure that our samples are as representative as possible; otherwise we will encounter a large sampling error and draw the wrong conclusion.
a) For the dataset before Hickey's analysis we have a sample of 146 films, with the five selected columns carrying the information relevant to Fandango's movie ratings.
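According to the README.md file of the dataset's repository and Hickey's article, the following sampling criteria were used: the movie must have had at least 30 fan reviews on Fandango's website before Aug. 24, 2015, and it must have had tickets on sale in 2015.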
Clearly, this sample was not random, because not every movie had a chance to be included (e.g. movies with fewer than 30 reviews on Fandango's website before Aug. 24, 2015 and movies without tickets on sale in 2015 had no chance of being sampled). We can conclude that this sample is not representative of the entire population.
b) Similarly, for the dataset after Hickey's analysis we have a sample of 214 movies, with the three selected columns carrying the information relevant to Fandango's movie ratings.
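According to the README.md file of the dataset's repository, the sampling criteria were based on popularity (the movie must have had a considerable number of votes and reviews) and on release year (the movie must have been released in 2016 or later).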
Just like the first sample, this sample was not random, because not every movie had a chance to be included. It is also unlikely to be representative of the entire population.
Both Walt Hickey and the Dataquest team member who collected the second dataset probably had specific research questions in mind when they sampled the data, and they used a set of criteria to get a sample that best answered those questions. They used a non-probability sampling technique (in this case, purposive sampling). Although these samples might have been good enough for their research, they are not well suited to our original goal.
Although we have established that the sampling processes were not random and that the resulting samples are unlikely to be representative of the entire population, we will not abandon the research. Instead, we will tweak our goal so that the data we have can answer it.
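To see why a purposive sample can mislead, here is a small simulation. Everything in it is hypothetical: the population of 5,000 movies, the link between review count and rating, and the 30-review cutoff are invented for illustration and say nothing about Fandango's real data.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: ratings tend to rise with the number of reviews.
n_movies = 5000
reviews = rng.integers(0, 200, size=n_movies)
noise = rng.normal(0, 0.5, size=n_movies)
ratings = np.clip(2.5 + reviews / 100 + noise, 0, 5)

population_mean = ratings.mean()

# Simple random sample: every movie has the same chance of being selected.
random_idx = rng.choice(n_movies, size=150, replace=False)
random_mean = ratings[random_idx].mean()

# Purposive sample: only movies with at least 30 reviews are eligible.
purposive_mean = ratings[reviews >= 30].mean()

print(f'population mean:       {population_mean:.2f}')
print(f'random sample mean:    {random_mean:.2f}')
print(f'purposive sample mean: {purposive_mean:.2f}')
```

In this toy setup the random sample lands close to the population mean, while the filtered sample overstates it, because the filter removes exactly the movies that tend to be rated lower. A filtered sample can still describe the filtered population well, which is the idea behind narrowing the goal to popular movies.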
In this case, we will change our goal to finding out whether there is any difference between Fandango's ratings for popular movies in 2015 and Fandango's ratings for popular movies in 2016. Furthermore, we will also look at how Fandango ratings compare to other ratings in 2016.
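With the new goal, we now have two populations that we want to describe and compare with each other: all of Fandango's ratings for popular movies released in 2015, and all of Fandango's ratings for popular movies released in 2016.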
We will use Hickey's benchmark of 30 fan ratings and consider a movie as "popular" only if it has 30 fan ratings or more on Fandango's website.
One of the sampling criteria for our second sample is popularity, but the sample does not provide information about the number of fan ratings. To check how representative this sample is, we will randomly sample 10 movies from it and check the number of fan ratings on Fandango's website ourselves. We find that at least 8 out of the 10 movies have 30 fan ratings or more.
Fandango_after.sample(10, random_state=1)
| | movie | year | fandango |
---|---|---|---|
108 | Mechanic: Resurrection | 2016 | 4.0 |
206 | Warcraft | 2016 | 4.0 |
106 | Max Steel | 2016 | 3.5 |
107 | Me Before You | 2016 | 4.5 |
51 | Fantastic Beasts and Where to Find Them | 2016 | 4.5 |
33 | Cell | 2016 | 3.0 |
59 | Genius | 2016 | 3.5 |
152 | Sully | 2016 | 4.5 |
4 | A Hologram for the King | 2016 | 3.0 |
31 | Captain America: Civil War | 2016 | 4.5 |
In this spot check, at least 80% of the movies are popular, so we can be reasonably confident that most of the movies in our sample are popular.
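The same kind of popularity check can be expressed in code whenever vote counts are available, as they are in the `Fandango_votes` column of Hickey's dataset. The helper below, `share_popular`, is a hypothetical name introduced for this sketch, and it is applied only to the three vote counts shown earlier rather than to the full dataset.

```python
import pandas as pd

def share_popular(votes, threshold=30):
    """Return the fraction of movies with at least `threshold` fan ratings."""
    votes = pd.Series(votes)
    return float((votes >= threshold).mean())

# Vote counts of the three movies shown in the first rows of Hickey's dataset.
votes_shown = [14846, 12640, 12055]
print(share_popular(votes_shown))
```

Applied to `Fandango_before['Fandango_votes']`, the same function would report what share of Hickey's sample meets the 30-rating benchmark.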
Exploring the datasets, we find that some of the movies were released in years other than 2015 and 2016. We will keep only the sample points that belong to our populations of interest.
We will start with Hickey's dataset and isolate only the movies released in 2015. There is no column for the release year in this dataset, but the year can be extracted from the 'FILM' column.
Fandango_before['year'] = Fandango_before['FILM'].str[-5:-1]
Fandango_before.head(3)
| | FILM | Fandango_Stars | Fandango_Ratingvalue | Fandango_votes | Fandango_Difference | year |
---|---|---|---|---|---|---|
0 | Avengers: Age of Ultron (2015) | 5.0 | 4.5 | 14846 | 0.5 | 2015 |
1 | Cinderella (2015) | 5.0 | 4.5 | 12640 | 0.5 | 2015 |
2 | Ant-Man (2015) | 5.0 | 4.5 | 12055 | 0.5 | 2015 |
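Slicing with `str[-5:-1]` works as long as every title ends with exactly `(YYYY)`. A slightly more defensive alternative, sketched here as an option rather than as what the analysis needs, is to pull the year out with a regular expression. The last title in the example is made up to show what happens with a trailing space.

```python
import pandas as pd

films = pd.Series([
    'Avengers: Age of Ultron (2015)',
    'Cinderella (2015)',
    'Ant-Man (2015)',
    'Hypothetical Title (2014) ',  # invented entry with a trailing space
])

# Positional slicing: assumes the last five characters are 'YYYY)'.
by_slice = films.str[-5:-1]

# Regex: four digits inside parentheses at the end, ignoring trailing spaces.
by_regex = films.str.extract(r'\((\d{4})\)\s*$', expand=False)

print(pd.DataFrame({'FILM': films, 'slice': by_slice, 'regex': by_regex}))
```

Both approaches return the year as a string, so a filter such as `== '2015'` behaves the same way with either one.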
Now we can isolate the movies released in 2015 and 2016.
Fandango_2015 = Fandango_before[Fandango_before['year']=='2015']
Fandango_2015.head(3)
| | FILM | Fandango_Stars | Fandango_Ratingvalue | Fandango_votes | Fandango_Difference | year |
---|---|---|---|---|---|---|
0 | Avengers: Age of Ultron (2015) | 5.0 | 4.5 | 14846 | 0.5 | 2015 |
1 | Cinderella (2015) | 5.0 | 4.5 | 12640 | 0.5 | 2015 |
2 | Ant-Man (2015) | 5.0 | 4.5 | 12055 | 0.5 | 2015 |
Fandango_2016 = Fandango_after[Fandango_after['year']==2016]
Fandango_2016.head()
| | movie | year | fandango |
---|---|---|---|
0 | 10 Cloverfield Lane | 2016 | 3.5 |
1 | 13 Hours | 2016 | 4.5 |
2 | A Cure for Wellness | 2016 | 3.0 |
4 | A Hologram for the King | 2016 | 3.0 |
5 | A Monster Calls | 2016 | 4.0 |
Since our goal is to determine whether there is a significant difference between Fandango's ratings for popular movies in 2015 and in 2016, we will start by comparing the shapes of the two samples' distributions.
plt.style.use('fivethirtyeight')
Fandango_2015['Fandango_Stars'].plot.kde(label = '2015', legend = True, figsize = (7,5))
Fandango_2016['fandango'].plot.kde(label = '2016', legend = True)
plt.title("Comparing distribution shapes for Fandango's ratings\n 2015 vs 2016",
y = 1.07)
plt.xlabel('Stars')
plt.xlim(0,5)
plt.xticks(arange(0,5.1,.5))
plt.show()
From the figure above, we can deduce the following:
Both distributions are left skewed, which implies that movies on Fandango's website are mostly given high fan ratings. Given that Fandango also sells movie tickets, the high ratings seem suspicious (Hickey's finding might be true, but further investigation is needed).
The 2016 distribution is shifted slightly to the left relative to the 2015 distribution, which shows that ratings were slightly lower in 2016 than in 2015. In other words, there was a difference between Fandango's ratings for popular movies in 2015 and 2016, and its direction is downward.
The kernel density plots above showed that there is a clear difference between the two distributions and indicated the direction of that difference, which is a good start. Next, we will analyze the difference in more detail.
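One caveat when reading kernel density plots of star ratings: the estimate smooths a handful of discrete half-star values into a continuous curve, so some of the density spills past 5 stars. That is why the plot is clipped with `plt.xlim(0,5)`. The sketch below hand-rolls a Gaussian kernel density estimate in NumPy to make this visible. The ratings are made up for illustration, `gaussian_kde_1d` is a hypothetical helper, and the bandwidth uses Scott's rule of thumb, which is assumed here to approximate what the plotting call does by default.

```python
import numpy as np

# Illustrative half-star ratings (invented for this sketch, not from the datasets).
ratings = np.array([3.0, 3.5, 3.5, 4.0, 4.0, 4.0, 4.5, 4.5, 4.5, 4.5, 5.0])

def gaussian_kde_1d(data, grid):
    """Evaluate a Gaussian kernel density estimate of `data` at each grid point."""
    n = data.size
    bandwidth = data.std(ddof=1) * n ** (-1 / 5)  # Scott's rule of thumb
    z = (grid[:, None] - data[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z ** 2) / (bandwidth * np.sqrt(2 * np.pi))
    return kernels.mean(axis=1)  # average of one Gaussian bump per observation

grid = np.linspace(0, 8, 1601)
step = grid[1] - grid[0]
density = gaussian_kde_1d(ratings, grid)

# Approximate areas under the curve with a simple Riemann sum.
total_mass = density.sum() * step
mass_above_5 = density[grid > 5].sum() * step

print(f'total area under the curve: {total_mass:.3f}')
print(f'area above 5 stars:         {mass_above_5:.3f}')
```

The area above 5 stars corresponds to ratings that cannot exist, so the curves are best read for overall shape and relative position, with exact proportions left to a frequency table.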
We will examine the frequency distribution tables of the two distributions.
Freq_table_2015 = 100*Fandango_2015['Fandango_Stars'].value_counts(normalize = True).sort_index()
print('Frequency table for 2015' + '\n\n' + '*' * 16)
Freq_table_2015
Frequency table for 2015

****************
3.0     8.527132
3.5    17.829457
4.0    28.682171
4.5    37.984496
5.0     6.976744
Name: Fandango_Stars, dtype: float64
Freq_table_2016 = 100*Fandango_2016['fandango'].value_counts(normalize = True).sort_index()
print('Frequency table for 2016' + '\n\n' + '*' * 16)
Freq_table_2016
Frequency table for 2016

****************
2.5     3.141361
3.0     7.329843
3.5    24.083770
4.0    40.314136
4.5    24.607330
5.0     0.523560
Name: fandango, dtype: float64
For better comparison we will combine the two distribution tables in one dataframe.
# Passing keys names the columns directly, however pandas names the value_counts() output
Combined_freq_tables = pd.concat([Freq_table_2015, Freq_table_2016], axis = 1, keys = ['2015', '2016']).fillna(0)
Combined_freq_tables
| | 2015 | 2016 |
---|---|---|
2.5 | 0.000000 | 3.141361 |
3.0 | 8.527132 | 7.329843 |
3.5 | 17.829457 | 24.083770 |
4.0 | 28.682171 | 40.314136 |
4.5 | 37.984496 | 24.607330 |
5.0 | 6.976744 | 0.523560 |
plt.style.use('fivethirtyeight')
# Plot both years side by side; pandas labels the x-axis with the rating values from the index
Combined_freq_tables.plot.bar(color = ['#0000CD', '#66CD00'], rot = 0, figsize = (8,5))
plt.title('Fandango movie ratings (relative percentage)', fontsize = 20, y = 1.06)
plt.legend(loc = 'upper right')
plt.xlabel('Rating')
plt.ylabel('Percentage (%)')
plt.show()
From the dataframe and the bar plot above, we can deduce the following:
Clearly, there is a difference between the two distributions. For some ratings the percentage went up in 2016, while for others it went down. This challenges the direction of the difference we observed on the kernel density plot above.
From the dataframe above we confirmed there is indeed a difference between the two distributions. However, the direction of the difference is not as clear as it was on the kernel density plots. Now, we will take a couple of summary statistics to get a more precise picture about the direction of the difference. We'll take each distribution of movie ratings and compute its mean, median, and mode, and then compare these statistics to determine what they tell about the direction of the difference.
# Computing the means
mean_2015 = Fandango_2015['Fandango_Stars'].mean()
mean_2016 = Fandango_2016['fandango'].mean()
# Computing the median
median_2015 = Fandango_2015['Fandango_Stars'].median()
median_2016 = Fandango_2016['fandango'].median()
# Computing the modes
mode_2015 = Fandango_2015['Fandango_Stars'].mode()[0] # Series.mode() returns a Series (ties are possible), so take the first value
mode_2016 = Fandango_2016['fandango'].mode()[0]
#Creating the dataframe of the computed statistics
summary = pd.DataFrame()
summary['2015'] = [mean_2015, median_2015, mode_2015]
summary['2016'] = [mean_2016, median_2016, mode_2016]
summary.index = ['mean', 'median', 'mode']
summary
| | 2015 | 2016 |
---|---|---|
mean | 4.085271 | 3.887435 |
median | 4.000000 | 4.000000 |
mode | 4.500000 | 4.000000 |
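As a cross-check, the mean and the mode can be recovered from the percentage frequency tables shown earlier, since the mean is just the percentage-weighted average of the star values. The sketch below copies the percentages from those tables; `mean_from_percentages` is a hypothetical helper name.

```python
import pandas as pd

# Percentages copied from the two frequency tables above.
freq_2015 = pd.Series({3.0: 8.527132, 3.5: 17.829457, 4.0: 28.682171,
                       4.5: 37.984496, 5.0: 6.976744})
freq_2016 = pd.Series({2.5: 3.141361, 3.0: 7.329843, 3.5: 24.083770,
                       4.0: 40.314136, 4.5: 24.607330, 5.0: 0.523560})

def mean_from_percentages(freq):
    """Weighted average of the index (stars), weighted by the percentages."""
    stars = freq.index.to_numpy(dtype=float)
    weights = freq.to_numpy(dtype=float)
    return float((stars * weights).sum() / weights.sum())

check_mean_2015 = mean_from_percentages(freq_2015)
check_mean_2016 = mean_from_percentages(freq_2016)

# The mode is the star value with the largest share.
check_mode_2015 = freq_2015.idxmax()
check_mode_2016 = freq_2016.idxmax()

print(round(check_mean_2015, 6), round(check_mean_2016, 6))
print(check_mode_2015, check_mode_2016)
```

Agreement with the summary table is a quick sanity check that the frequency tables and the summary statistics were computed from the same filtered samples.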
plt.style.use('fivethirtyeight')
summary['2015'].plot.bar(color = '#0000CD', align = 'center', label = '2015', width = .25)
summary['2016'].plot.bar(color = '#66CD00', align = 'edge', label = '2016', width = .25,
rot = 0, figsize = (8,5))
plt.title('Comparing summary statistics: 2015 vs 2016', y = 1.06)
plt.ylim(0,5.5)
plt.yticks(arange(0,5.1,.5))
plt.ylabel('Stars')
plt.legend(framealpha = 0, loc = 'upper center')
plt.show()
From the table above, we can see that the mean rating was lower in 2016 by approximately 0.2 stars, a drop of almost 5% relative to the mean rating in 2015. The median rating is the same for 2015 and 2016. The most frequent rating in 2016 was 4.0 stars, which is lower than the 4.5 stars in 2015. This confirms the direction we saw on the kernel density plot: on average, popular movies released in 2016 were rated slightly lower than those released in 2015.
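The size of the drop in the mean follows from simple arithmetic on the two values in the summary table. A minimal sketch, with the means copied from that table and variable names chosen for this example:

```python
# Means copied from the summary table above.
reported_mean_2015 = 4.085271
reported_mean_2016 = 3.887435

# Absolute change in stars and change relative to the 2015 mean.
absolute_change = reported_mean_2016 - reported_mean_2015
relative_change_pct = 100 * absolute_change / reported_mean_2015

print(f'absolute change: {absolute_change:.3f} stars')
print(f'relative change: {relative_change_pct:.2f}%')
```

Dividing by the 2015 mean expresses the change relative to the earlier period, which is the convention used in the paragraph above.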
One of Walt Hickey's findings is that Fandango ratings in 2015 were significantly higher than the ratings other sites gave the same movies, as shown in the figure below (adapted from Hickey's article).
Luckily for us, the 2016 dataset also includes ratings from several other sites. Before we draw a conclusion, it is interesting to see how the picture looks after Hickey's article by creating a kernel density plot that compares Fandango ratings with the other ratings that appeared in the graph above.
Fandango_2016 = After_article[After_article['year']==2016]
plt.style.use('fivethirtyeight')
Fandango_2016['fandango'].plot.kde(label = 'Fandango')
Fandango_2016['n_audience'].plot.kde(label = 'Rotten Tomatoes users')
Fandango_2016['n_imdb'].plot.kde(label = 'IMDB')
Fandango_2016['n_tmeter'].plot.kde(label = 'Rotten Tomatoes critics')
Fandango_2016['n_metascore'].plot.kde(label = 'Metacritic', figsize = (12,7))
plt.legend()
plt.xlim(0,5)
plt.xticks(np.arange(0, 5.5, step=0.5))
plt.ylim(0,1)
plt.yticks(np.arange(0, 1, step=0.1))
plt.xlabel('Ratings')
plt.title('Fandango ratings vs other ratings in 2016',weight='bold', fontsize = 20)
plt.legend(framealpha = 0, loc = 'best')
plt.show()
Fandango ratings in 2016 are still higher than several other ratings.
Based on our analysis there is a slight difference between Fandango's ratings for popular movies in 2015 and Fandango's ratings for popular movies in 2016. We were able to deduce that, on average, popular movies released in 2016 were rated lower than popular movies released in 2015 on the Fandango website.
This suggests that Fandango made some changes, and Hickey's analysis might be the cause of such changes.