Investigating Fandango Movie Ratings¶

Fandango displays a 5-star rating system on their website, where the minimum rating is 0 stars and the maximum is 5 stars.

In October 2015, a data journalist named Walt Hickey analyzed movie ratings data and found strong evidence to suggest that Fandango's rating system was biased and dishonest.

Fandango displays a 5-star rating system on their website, where the minimum rating is 0 stars and the maximum is 5 stars.

Hickey found that there's a significant discrepancy between the number of stars displayed to users and the actual rating, which he was able to find in the HTML of the page. He was able to find that:

The actual rating was almost always rounded up to the nearest half-star. For instance, a 4.1 movie would be rounded off to 4.5 stars, not to 4 stars, as you may expect.
In the case of 8% of the ratings analyzed, the rounding up was done to the nearest whole star. For instance, a 4.5 rating would be rounded off to 5 stars.
For one movie rating, the rounding off was completely bizarre: from a rating of 4 in the HTML of the page to a displayed rating of 5 stars.

The two distributions above are displayed using a simple line plot, which is also a valid way to show the shape of a distribution. The variable being examined is movie rating, and for each unique rating we can see its relative frequency (percentage) on the y-axis of the graph. When an analysis report is intended for large audiences, relative frequencies (especially percentages) are preferred over absolute frequencies.

Both distributions above are strongly left skewed, suggesting that movie ratings on Fandango are generally high or very high. We can see there's no rating under 2 stars in the sample Hickey analyzed. The distribution of displayed ratings is clearly shifted to the right compared to the actual rating distribution, suggesting strongly that Fandango inflates the ratings under the hood.

Fandango's officials replied that the biased rounding off was caused by a bug in their system rather than being intentional, and they promised to fix the bug as soon as possible. Presumably, this has already happened, although we can't tell for sure since the actual rating value doesn't seem to be displayed anymore in the pages' HTML.

In this project, we'll analyze more recent movie ratings data to determine whether there has been any change in Fandango's rating system after Hickey's analysis.

The data you can find in this github page: https://github.com/mircealex/Movie_ratings_2016_17

And here is critics of Walt Hicket who shared his research on his website : https://fivethirtyeight.com/features/fandango-movies-ratings/

Understanding the data¶

In [5]:

import pandas as pd
import numpy as np

In [12]:

previous = pd.read_csv("fandango_score_comparison.csv")
after = pd.read_csv("movie_ratings_16_17.csv")

In [13]:

previous.head()

Out[13]:

	FILM	RottenTomatoes	RottenTomatoes_User	Metacritic	Metacritic_User	IMDB	Fandango_Stars	Fandango_Ratingvalue	RT_norm	RT_user_norm	...	IMDB_norm	RT_norm_round	RT_user_norm_round	Metacritic_norm_round	Metacritic_user_norm_round	IMDB_norm_round	Metacritic_user_vote_count	IMDB_user_vote_count	Fandango_votes	Fandango_Difference
0	Avengers: Age of Ultron (2015)	74	86	66	7.1	7.8	5.0	4.5	3.70	4.3	...	3.90	3.5	4.5	3.5	3.5	4.0	1330	271107	14846	0.5
1	Cinderella (2015)	85	80	67	7.5	7.1	5.0	4.5	4.25	4.0	...	3.55	4.5	4.0	3.5	4.0	3.5	249	65709	12640	0.5
2	Ant-Man (2015)	80	90	64	8.1	7.8	5.0	4.5	4.00	4.5	...	3.90	4.0	4.5	3.0	4.0	4.0	627	103660	12055	0.5
3	Do You Believe? (2015)	18	84	22	4.7	5.4	5.0	4.5	0.90	4.2	...	2.70	1.0	4.0	1.0	2.5	2.5	31	3136	1793	0.5
4	Hot Tub Time Machine 2 (2015)	14	28	29	3.4	5.1	3.5	3.0	0.70	1.4	...	2.55	0.5	1.5	1.5	1.5	2.5	88	19560	1021	0.5

5 rows × 22 columns

In [14]:

after.head()

Out[14]:

	movie	year	metascore	imdb	tmeter	audience	fandango	n_metascore	n_imdb	n_tmeter	n_audience	nr_metascore	nr_imdb	nr_tmeter	nr_audience
0	10 Cloverfield Lane	2016	76	7.2	90	79	3.5	3.80	3.60	4.50	3.95	4.0	3.5	4.5	4.0
1	13 Hours	2016	48	7.3	50	83	4.5	2.40	3.65	2.50	4.15	2.5	3.5	2.5	4.0
2	A Cure for Wellness	2016	47	6.6	40	47	3.0	2.35	3.30	2.00	2.35	2.5	3.5	2.0	2.5
3	A Dog's Purpose	2017	43	5.2	33	76	4.5	2.15	2.60	1.65	3.80	2.0	2.5	1.5	4.0
4	A Hologram for the King	2016	58	6.1	70	57	3.0	2.90	3.05	3.50	2.85	3.0	3.0	3.5	3.0

We are going to useful data we can use in the dataframes, so we select the necessary columns with dataframes

In [15]:

fandango_previous = previous[['FILM', 'Fandango_Stars', 'Fandango_Ratingvalue', 'Fandango_votes', 'Fandango_Difference']].copy()
fandango_after = after[['movie', 'year', 'fandango']].copy()

In [22]:

fandango_previous.head()

Out[22]:

	FILM	Fandango_Stars	Fandango_Ratingvalue	Fandango_votes	Fandango_Difference
0	Avengers: Age of Ultron (2015)	5.0	4.5	14846	0.5
1	Cinderella (2015)	5.0	4.5	12640	0.5
2	Ant-Man (2015)	5.0	4.5	12055	0.5
3	Do You Believe? (2015)	5.0	4.5	1793	0.5
4	Hot Tub Time Machine 2 (2015)	3.5	3.0	1021	0.5

In [17]:

fandango_after.head()

Out[17]:

	movie	year	fandango
0	10 Cloverfield Lane	2016	3.5
1	13 Hours	2016	4.5
2	A Cure for Wellness	2016	3.0
3	A Dog's Purpose	2017	4.5
4	A Hologram for the King	2016	3.0

Now, our goal is to determine whether there has been any change in Fandango's rating system after Hickey's analysis

In [19]:

fandango_after.shape

Out[19]:

(214, 3)

In [21]:

fandango_previous.shape

Out[21]:

(146, 5)

Changing the Goal of our Analysis¶

At this point, we can either collect new data or change our the goal of our analysis. We choose the latter and place some limitations on our initial goal.

Instead of trying to determine whether there has been any change in Fandango's rating system after Hickey's analysis, our new goal is to determine whether there's any difference between Fandango's ratings for popular movies in 2015 and Fandango's ratings for popular movies in 2016. This new goal should also be a fairly good proxy for our initial goal.

Isolating the Samples We Need¶

With this new research goal, we have two populations of interest:

All Fandango's ratings for popular movies released in 2015.
All Fandango's ratings for popular movies released in 2016.

We need to be clear about what counts as popular movies. We'll use Hickey's benchmark of 30 fan ratings and count a movie as popular only if it has 30 fan ratings or more on Fandango's website.

Although one of the sampling criteria in our second sample is movie popularity, the sample doesn't provide information about the number of fan ratings. We should be skeptical once more and ask whether this sample is truly representative and contains popular movies (movies with over 30 fan ratings).

One quick way to check the representativity of this sample is to sample randomly 10 movies from it and then check the number of fan ratings ourselves on Fandango's website. Ideally, at least 8 out of the 10 movies have 30 fan ratings or more.

In [24]:

fandango_after.sample(10, random_state = 1)

Out[24]:

	movie	year	fandango
108	Mechanic: Resurrection	2016	4.0
206	Warcraft	2016	4.0
106	Max Steel	2016	3.5
107	Me Before You	2016	4.5
51	Fantastic Beasts and Where to Find Them	2016	4.5
33	Cell	2016	3.0
59	Genius	2016	3.5
152	Sully	2016	4.5
4	A Hologram for the King	2016	3.0
31	Captain America: Civil War	2016	4.5

In [25]:

sum(fandango_previous['Fandango_votes'] < 30)

Out[25]:

If you explore the two data sets, you'll notice that there are movies with a releasing year different than 2015 or 2016. For our purposes, we'll need to isolate only the movies released in 2015 and 2016.

Let's start with Hickey's data set and isolate only the movies released in 2015. There's no special column for the releasing year, but we should be able to extract it from the strings in the FILM column.

In [26]:

fandango_previous.head(2)

Out[26]:

	FILM	Fandango_Stars	Fandango_Ratingvalue	Fandango_votes	Fandango_Difference
0	Avengers: Age of Ultron (2015)	5.0	4.5	14846	0.5
1	Cinderella (2015)	5.0	4.5	12640	0.5

In [27]:

fandango_previous['Year'] = fandango_previous['FILM'].str[-5:-1]
fandango_previous.head(2)

Out[27]:

	FILM	Fandango_Stars	Fandango_Ratingvalue	Fandango_votes	Fandango_Difference	Year
0	Avengers: Age of Ultron (2015)	5.0	4.5	14846	0.5	2015
1	Cinderella (2015)	5.0	4.5	12640	0.5	2015

Let's examine the frequency distribution for the Year column and then isolate the movies released in 2015.

In [33]:

fandango_previous['Year'].value_counts()

Out[33]:

2015    129
2014     17
Name: Year, dtype: int64

In [34]:

fandango_2015 = fandango_previous[fandango_previous['Year'] == '2015'].copy()
fandango_2015['Year'].value_counts()

Out[34]:

2015    129
Name: Year, dtype: int64

In [35]:

fandango_after.head()

Out[35]:

	movie	year	fandango
0	10 Cloverfield Lane	2016	3.5
1	13 Hours	2016	4.5
2	A Cure for Wellness	2016	3.0
3	A Dog's Purpose	2017	4.5
4	A Hologram for the King	2016	3.0

In [36]:

fandango_after['year'].value_counts()

Out[36]:

2016    191
2017     23
Name: year, dtype: int64

In [37]:

fandango_2016 = fandango_after[fandango_after['year'] == 2016].copy()
fandango_2016['year'].value_counts()

Out[37]:

2016    191
Name: year, dtype: int64

Comparing Distribution Shapes for 2015 and 2016¶

Our aim is to figure out whether there's any difference between Fandango's ratings for popular movies in 2015 and Fandango's ratings for popular movies in 2016. One way to go about is to analyze and compare the distributions of movie ratings for the two samples.

We'll start with comparing the shape of the two distributions using kernel density plots.

In [39]:

import matplotlib.pyplot as plt
from numpy import arange
%matplotlib inline
plt.style.use('fivethirtyeight')

fandango_2015['Fandango_Stars'].plot.kde(label = '2015', legend = True, figsize = (8,5.5))
fandango_2016['fandango'].plot.kde(label = '2016', legend = True)

plt.title("Comparing distribution shapes for Fandango's ratings\n(2015 vs 2016)",
          y = 1.07) # the `y` parameter pads the title upward
plt.xlabel('Stars')
plt.xlim(0,5) # because ratings start at 0 and end at 5
plt.xticks(arange(0,5.1,.5))
plt.show()

The kernel density plots from the previous screen showed that there's a clear difference between the two distributions. They also provided us with information about the direction of the difference: movies in 2016 were rated slightly lower than those in 2015.

While comparing the distributions with the help of the kernel density plots was a great start, we now need to analyze more granular information.

Two aspects are striking on the figure above:

Both distributions are strongly left skewed.
The 2016 distribution is slightly shifted to the left relative to the 2015 distribution.

The left skew suggests that movies on Fandango are given mostly high and very high fan ratings. Coupled with the fact that Fandango sells tickets, the high ratings are a bit dubious. It'd be really interesting to investigate this further — ideally in a separate project, since this is quite irrelevant for the current goal of our analysis.

The slight left shift of the 2016 distribution is very interesting for our analysis. It shows that ratings were slightly lower in 2016 compared to 2015. This suggests that there was a difference indeed between Fandango's ratings for popular movies in 2015 and Fandango's ratings for popular movies in 2016. We can also see the direction of the difference: the ratings in 2016 were slightly lower compared to 2015.

Comparing Relative Frequencies¶

It seems we're following a good thread so far, but we need to analyze more granular information. Let's examine the frequency tables of the two distributions to analyze some numbers. Because the data sets have different numbers of movies, we normalize the tables and show percentages instead.

In [44]:

#To help us distinguish between the two tables immediately
print('2015' + '\n' + '-' * 20)
                              
fandango_2015['Fandango_Stars'].value_counts(normalize = True).sort_index() * 100

2015
--------------------

Out[44]:

3.0     8.527132
3.5    17.829457
4.0    28.682171
4.5    37.984496
5.0     6.976744
Name: Fandango_Stars, dtype: float64

In [45]:

print('2016' + '\n' + '-' * 16)
fandango_2016['fandango'].value_counts(normalize = True).sort_index() * 100

2016
----------------

Out[45]:

2.5     3.141361
3.0     7.329843
3.5    24.083770
4.0    40.314136
4.5    24.607330
5.0     0.523560
Name: fandango, dtype: float64

In 2016, very high ratings (4.5 and 5 stars) had significantly lower percentages compared to 2015. In 2016, under 1% of the movies had a perfect rating of 5 stars, compared to 2015 when the percentage was close to 7%. Ratings of 4.5 were also more popular in 2015 — there were approximately 13% more movies rated with a 4.5 in 2015 compared to 2016.

The minimum rating is also lower in 2016 — 2.5 instead of 3 stars, the minimum of 2015. There clearly is a difference between the two frequency distributions.

For some other ratings, the percentage went up in 2016. There was a greater percentage of movies in 2016 that received 3.5 and 4 stars, compared to 2015. 3.5 and 4.0 are high ratings and this challenges the direction of the change we saw on the kernel density plots.

Determining the Direction of the Change¶

Let's take a couple of summary metrics to get a more precise picture about the direction of the change. In what follows, we'll compute the mean, the median, and the mode for both distributions and then use a bar graph to plot the values.

In [47]:

mean_2015 = fandango_2015['Fandango_Stars'].mean()
mean_2016 = fandango_2016['fandango'].mean()

median_2015 = fandango_2015['Fandango_Stars'].median()
median_2016 = fandango_2016['fandango'].median()

mode_2015 = fandango_2015['Fandango_Stars'].mode()[0]
mode_2016 = fandango_2016['fandango'].mode()[0]

summary = pd.DataFrame()
summary['2015'] = [mean_2015, median_2015, mode_2015]
summary['2016'] = [mean_2016, median_2016, mode_2016]
summary.index = ['mean', 'median', 'mode']
print(summary)

            2015      2016
mean    4.085271  3.887435
median  4.000000  4.000000
mode    4.500000  4.000000

In [49]:

plt.style.use('fivethirtyeight')
summary['2015'].plot.bar(color = '#0066FF', align = 'center', label = '2015', width = .25)
summary['2016'].plot.bar(color = '#CC0000', align = 'edge', label = '2016', width = .25,
                         rot = 0, figsize = (8,5))

plt.title('Comparing summary statistics: 2015 vs 2016', y = 1.07)
plt.ylim(0,5.5)
plt.yticks(arange(0,5.1,.5))
plt.ylabel('Stars')
plt.legend(framealpha = 0, loc = 'upper center')
plt.show()

The mean rating was lower in 2016 with approximately 0.2. This means a drop of almost 5% relative to the mean rating in 2015.

In [50]:

(summary.loc['mean'][0] - summary.loc['mean'][1]) / summary.loc['mean'][0]

Out[50]:

0.04842683568951993

While the median is the same for both distributions, the mode is lower in 2016 by 0.5. Coupled with what we saw for the mean, the direction of the change we saw on the kernel density plot is confirmed: on average, popular movies released in 2016 were rated slightly lower than popular movies released in 2015.

Conclusion¶

Our analysis showed that there's indeed a slight difference between Fandango's ratings for popular movies in 2015 and Fandango's ratings for popular movies in 2016. We also determined that, on average, popular movies released in 2016 were rated lower on Fandango than popular movies released in 2015.

We cannot be completely sure what caused the change, but the chances are very high that it was caused by Fandango fixing the biased rating system after Hickey's analysis.

In [ ]: