Standard Deviation vs. Interquartile Range

The standard deviation and the interquartile range are two measures of the spread of a distribution. Relying on standard deviation alone to assess player consistency in fantasy football can be misleading, for two reasons.

Standard deviation is the square root of the variance, which is the average squared deviation from the mean (R's sd() divides the sum of squared deviations by n - 1). Because both the mean and the squared deviations are sensitive to extreme values, a single outlier game can distort the result in a small sample such as an NFL season. A related problem is that the reader may assume that about 68% of the values fall within one standard deviation of the mean, which holds only if the values are approximately normally distributed.
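As a quick check on the definition above, the sketch below computes the sample standard deviation by hand and compares it with R's built-in sd(). The scores vector is hypothetical, made up purely for illustration, and is not one of the players discussed here.

```r
# Hypothetical scores, invented for illustration only
scores <- c(12, 14, 11, 15, 13)

n <- length(scores)
deviations <- scores - mean(scores)        # distance of each game from the mean
sample_var <- sum(deviations^2) / (n - 1)  # sample variance (n - 1 denominator)
manual_sd <- sqrt(sample_var)              # standard deviation

# Should agree with the built-in function
all.equal(manual_sd, sd(scores))
```

Because the deviations are squared before they are averaged, a game far from the mean contributes disproportionately to the result.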

Consider the following player, who scores a random number of points between 10 and 15 in 15 games and then has a 40-point explosion in week 17. This player is extremely consistent, and the one "inconsistent" game is not a problem from a fantasy perspective, because there is no downside to an occasional big game.

In [28]:
# 15 ordinary games drawn uniformly from 10 to 15, plus one 40-point outlier
gamelog1 <- c(sample(10:15, 15, replace = TRUE), 40)
gamelog1
  1. 14
  2. 12
  3. 12
  4. 11
  5. 10
  6. 11
  7. 12
  8. 10
  9. 13
  10. 13
  11. 15
  12. 10
  13. 15
  14. 11
  15. 10
  16. 40
In [29]:
library(psych)
describe(gamelog1)
   vars  n    mean       sd median  trimmed    mad min max range     skew kurtosis       se
X1    1 16 13.6875 7.217744     12 12.07143 2.2239  10  40    30 2.983278 8.060159 1.804436

Using standard deviation as a measure of consistency presents a misleading picture in this case because both the mean and the standard deviation are artificially inflated by one outlier. Read as a normal distribution, it suggests that only about 11 of the player's 16 games (16 x .68) fall between 6.5 (13.7 - 7.2) and 20.9 (13.7 + 7.2) fantasy points. This overstates the week-to-week variance: the minimum value is 10, and only 1 game falls outside the range of 10-15.
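The claim can be checked directly by counting the games that land within one standard deviation of the mean. This is a minimal sketch in base R; the vector is copied from the output shown above rather than re-sampled, so the result is reproducible.

```r
# Player 1's game log, copied from the output above (not re-sampled)
gamelog1 <- c(14, 12, 12, 11, 10, 11, 12, 10, 13, 13, 15, 10, 15, 11, 10, 40)

lower <- mean(gamelog1) - sd(gamelog1)   # roughly 6.5
upper <- mean(gamelog1) + sd(gamelog1)   # roughly 20.9

within_one_sd <- gamelog1 >= lower & gamelog1 <= upper

sum(within_one_sd)    # number of games inside the band
mean(within_one_sd)   # share of games inside the band
```

For this game log, 15 of the 16 games (about 94%) sit inside the band, well above the 68% that a normal distribution would imply.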

The interquartile range (IQR) shows that the middle 50% of the values lie between 10.75 and 13.25, an IQR of 2.5 points. This is a more accurate picture, and it probably says more about consistency than a deviation from a mean that is skewed by an outlier.

In [30]:
summary(gamelog1)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  10.00   10.75   12.00   13.69   13.25   40.00 
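The quartiles reported by summary() can be turned into a single IQR figure with base R's quantile() and IQR(). The sketch below also drops the 40-point game to illustrate how each measure reacts to the outlier; the no_outlier vector is introduced here only for that comparison.

```r
# Player 1's game log, copied from the output above (not re-sampled)
gamelog1 <- c(14, 12, 12, 11, 10, 11, 12, 10, 13, 13, 15, 10, 15, 11, 10, 40)

# First and third quartiles, then their difference
q <- quantile(gamelog1, probs = c(0.25, 0.75))
iqr_manual <- unname(q[2] - q[1])
iqr_builtin <- IQR(gamelog1)             # same quantity via the built-in

# Hypothetical comparison: the same season without the 40-point game
no_outlier <- gamelog1[gamelog1 < 40]

c(sd_with = sd(gamelog1), sd_without = sd(no_outlier))
c(iqr_with = IQR(gamelog1), iqr_without = IQR(no_outlier))
```

With this game log, removing the outlier cuts the standard deviation from about 7.2 to under 2, while the IQR stays at 2.5 in both cases.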

Consider another player who is more inconsistent from week to week but never has a big game. Standard deviation depicts this player as much more consistent than the previous player (3.1 vs. 7.2), which is not really the case. The IQR tells the opposite story: it is higher for player 2 (15.25 - 10.75 = 4.5) than for player 1 (13.25 - 10.75 = 2.5), despite the much lower standard deviation.

In [35]:
# 16 games drawn uniformly from 7 to 17, with no outlier game
gamelog2 <- sample(7:17, 16, replace = TRUE)
gamelog2
  1. 10
  2. 17
  3. 13
  4. 11
  5. 11
  6. 11
  7. 16
  8. 9
  9. 11
  10. 15
  11. 17
  12. 12
  13. 17
  14. 14
  15. 10
  16. 7
In [36]:
describe(gamelog2)
   vars  n    mean       sd median  trimmed    mad min max range      skew  kurtosis        se
X1    1 16 12.5625 3.119161   11.5 12.64286 2.9652   7  17    10 0.1109729 -1.302346 0.7797903
In [37]:
summary(gamelog2)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   7.00   10.75   11.50   12.56   15.25   17.00 
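To see the two players side by side, the spread measures can be collected into one small table. This is a sketch using base R only; both vectors are copied from the outputs above, and the median absolute deviation (the mad column in the describe() output) is included as a second outlier-resistant measure.

```r
# Game logs copied from the outputs above (not re-sampled)
gamelog1 <- c(14, 12, 12, 11, 10, 11, 12, 10, 13, 13, 15, 10, 15, 11, 10, 40)
gamelog2 <- c(10, 17, 13, 11, 11, 11, 16, 9, 11, 15, 17, 12, 17, 14, 10, 7)

# One column per player, one row per measure of spread
spread <- sapply(
  list(player1 = gamelog1, player2 = gamelog2),
  function(x) c(sd = sd(x), IQR = IQR(x), MAD = mad(x))
)

round(spread, 2)
```

In this table the standard deviation ranks player 1 as the less consistent of the two, while the IQR and the MAD both rank player 2 as less consistent.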