Descriptions and analysis of popular evaluation metrics for measuring the performance of regression models.
In general, the most common metrics for regression model evaluation are:
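- R-Squared (R2) and Adjusted R2
- Mean Squared Error (MSE)
- Root Mean Squared Error (RMSE)
- Root Mean Squared Log Error (RMSLE)
- Mean Absolute Error (MAE)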
Resources for More Information:
Let's first look at Residuals.
A Residual is the difference between the actual value and the predicted value when we evaluate the model.
So for evaluating model performance, many of these metrics use some version of the Residual Sum of Squares, or RSS for short.
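Using $y_i$ for the actual values and $\hat{y}_i$ for the model's predictions over $n$ observations, RSS can be written as:

$$RSS = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$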
Note: RSS should not be confused with RegSS (Regression Sum of Squares)
The R2 statistic, or coefficient of determination, is a scale-invariant statistic that gives the proportion of variation in the target variable explained by the model.
Let's look at the R2 equation written in a couple of different ways.
The total variation in the target variable is the sum of squares of the differences between the actual values and their mean. The definition of the Total Sum of Squares (TSS), which gives the total variation in Y, is:
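$$TSS = \sum_{i=1}^{n} (y_i - \bar{y})^2$$

where $\bar{y}$ is the mean of the actual values.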
Given our previous definition of RSS and this definition of TSS, we can write the R2 equation as:
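$$R^2 = 1 - \frac{RSS}{TSS}$$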
or in human terms:
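$$R^2 = 1 - \frac{\text{Unexplained Variation}}{\text{Total Variation}}$$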
One of the issues with the R2 metric is that it never decreases as more independent variables (features) are added. So the more features that are added, the better the metric looks, which can lead to overfitting. Overfitting is where the model performs well on the training data but then performs poorly on the test data.
In order to address this issue, the Adjusted R2 metric takes into account the number of independent variables used for predicting the target variable.
We're not going to cover the Adjusted R2 metric in detail in this notebook, but the standard form is shown below for reference.
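With $n$ observations and $p$ independent variables, Adjusted R2 is:

$$\bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}$$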
In statistics, the Mean Squared Error (MSE) of an estimator measures the average of the squares of the errors. The MSE is a measure of the quality of an estimator; it is a non-negative value that approaches zero as the errors approach zero.
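Using the same notation as above:

$$MSE = \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \frac{RSS}{n}$$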
Advantages:
Disadvantages:
**Note:** Depending on the problem domain you may want a metric that penalizes large errors. So this disadvantage may actually be an advantage for some use cases.
RMSE is simply the square root of the MSE Metric. RMSE is certainly one of the most popular evaluation metrics; many of the Kaggle challenges use RMSE as the model performance metric. RMSE can be interpreted directly in the same units as the target variable and is often considered a better measure of fit than a correlation coefficient.
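In equation form:

$$RMSE = \sqrt{MSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$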
Advantages:
Disadvantages:
**Note:** Depending on the problem domain you may want a metric that penalizes large errors. So this disadvantage may actually be an advantage for some use cases.
RMSLE is simply the RMSE Metric with a logarithm (typically $\log(1 + \text{value})$, so that zero values are handled) applied to both the actual and predicted values.
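Written out:

$$RMSLE = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left(\log(1 + \hat{y}_i) - \log(1 + y_i)\right)^2}$$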
Advantages (compared to RMSE):
Disadvantages (compared to RMSE):
We have data that ranges from 1 to 10. We have two actual vs predicted values:
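- Case 1: Actual = 1.0, Predicted = 2.0
- Case 2: Actual = 10.0, Predicted = 11.0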
From a human's perspective, an actual value of 1.0 with a prediction of 2.0 feels 'way off', whereas an actual value of 10 with a prediction of 11 feels 'pretty close'. In the case of RMSE these two cases receive exactly the same penalty, but you'd like Case 1 to have a higher penalty. RMSLE will give a higher (relative) penalty to Case 1.
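Here's a minimal sketch with NumPy (the arrays are just the two cases above) showing the per-case penalties for RMSE vs. RMSLE:

```python
import numpy as np

# Case 1: actual 1.0, predicted 2.0 | Case 2: actual 10.0, predicted 11.0
actual = np.array([1.0, 10.0])
predicted = np.array([2.0, 11.0])

# RMSE penalty: the squared error is identical (1.0) for both cases
squared_errors = (actual - predicted) ** 2
print("Squared errors:", squared_errors)          # [1. 1.]

# RMSLE penalty: the squared log error is much larger for Case 1
squared_log_errors = (np.log1p(predicted) - np.log1p(actual)) ** 2
print("Squared log errors:", squared_log_errors)  # ~[0.164, 0.008]
```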
MAE is also closely related to the MSE Metric. MAE simply takes the absolute value of the error instead of squaring the error. One of the main reasons to use MAE is that it's more robust to outliers than the squared-error metrics.
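In equation form:

$$MAE = \frac{1}{n}\sum_{i=1}^{n} |y_i - \hat{y}_i|$$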
Advantages:
Disadvantages:
Given the popularity of RMSE and the fact that Kaggle uses it for many of its competitions, we suggest starting with that one. If you want a metric that is robust to outliers, then perhaps go with MAE. If you have a target variable with a large range (like from 1 to 100), then RMSLE might be the way to go.
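All of these metrics are easy to compute with scikit-learn. Here's a minimal sketch (the y_true and y_pred arrays are just hypothetical values for illustration):

```python
import numpy as np
from sklearn.metrics import (mean_absolute_error, mean_squared_error,
                             mean_squared_log_error, r2_score)

# Hypothetical actual and predicted values, for illustration only
y_true = np.array([3.0, 5.0, 2.5, 7.0, 10.0])
y_pred = np.array([2.5, 5.0, 3.0, 8.0, 9.0])

r2 = r2_score(y_true, y_pred)                            # coefficient of determination
mse = mean_squared_error(y_true, y_pred)                 # mean squared error
rmse = np.sqrt(mse)                                      # root mean squared error
rmsle = np.sqrt(mean_squared_log_error(y_true, y_pred))  # root mean squared log error
mae = mean_absolute_error(y_true, y_pred)                # mean absolute error

print(f"R2: {r2:.3f}  MSE: {mse:.3f}  RMSE: {rmse:.3f}  RMSLE: {rmsle:.3f}  MAE: {mae:.3f}")
```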
Well, hopefully this overview of the various metrics for regression model performance was helpful. If you'd like additional details, we suggest the follow-up reading links and the resources at the top.
If you liked this notebook, please visit SCP Labs for more notebooks and examples, or visit our company page, SuperCowPowers, for consulting and development services.
# This cell is simply for adding some CSS (Ignore it :)
from IPython.core.display import HTML
def css_styling():
    styles = open("styles/custom.css", "r").read()
    return HTML(styles)
css_styling()