Descriptions and analysis of popular evaluation metrics for measuring the performance of regression models.
In general, the most common metrics for regression model evaluation are:
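- R-Squared (R2) and Adjusted R2
- Mean Squared Error (MSE)
- Root Mean Squared Error (RMSE)
- Root Mean Squared Log Error (RMSLE)
- Mean Absolute Error (MAE)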
Resources for More Information:
Let's first look at Residuals.
A Residual is the difference between the actual value and the predicted value when we evaluate the model.
So for evaluating model performance, many of these metrics use some version of the Residual Sum of Squares, or RSS for short.
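Using $y_i$ for the actual values and $\hat{y}_i$ for the model's predictions over $n$ observations, RSS can be written as:

$$RSS = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$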
Note: RSS should not be confused with RegSS (Regression Sum of Squares)
The R2 statistic, or coefficient of determination, is a scale-invariant statistic that gives the proportion of variation in the target variable explained by the model.
Let's look at the R2 equation written in a couple of different ways.
The total variation in the target variable is the sum of squares of the differences between the actual values and their mean. The definition of the Total Sum of Squares (TSS), which gives the total variation in Y, is:
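$$TSS = \sum_{i=1}^{n} (y_i - \bar{y})^2$$

where $\bar{y}$ is the mean of the actual values.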
Given our previous definition of RSS and this definition of TSS, we can write the R2 equation as:
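$$R^2 = 1 - \frac{RSS}{TSS}$$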
or in human terms:
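$$R^2 = 1 - \frac{\text{Unexplained Variation}}{\text{Total Variation}}$$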
One of the issues with the R2 metric is that it never decreases as more independent variables (features) are added. So the more features that are added, the better the metric looks, which can lead to overfitting. Overfitting is where the model performs well on the training data but then performs poorly on the test data.
In order to address this issue, the Adjusted R2 metric takes into account the number of independent variables used for predicting the target variable.
We're not going to cover the Adjusted R2 metric in detail in this notebook, but the standard form is shown below for reference.
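With $n$ observations and $p$ independent variables, Adjusted R2 is:

$$\bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}$$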
In statistics, the Mean Squared Error (MSE) of an estimator measures the average of the squares of the errors. The MSE is a measure of the quality of an estimator; it is a non-negative value that approaches zero as the errors approach zero.
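Using the same notation as above:

$$MSE = \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \frac{RSS}{n}$$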
Advantages:
Disadvantages:
**Note:** Depending on the problem domain you may want a metric that penalizes large errors. So this disadvantage may actually be an advantage for some use cases.
RMSE is simply the square root of the MSE Metric. RMSE is certainly one of the most popular evaluation metrics; many of the Kaggle challenges use RMSE as the model performance metric. RMSE can be interpreted directly in the same units as the target variable and is often considered a better measure of fit than a correlation coefficient.
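In equation form:

$$RMSE = \sqrt{MSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$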
Advantages:
Disadvantages:
**Note:** Depending on the problem domain you may want a metric that penalizes large errors. So this disadvantage may actually be an advantage for some use cases.
RMSLE is simply the RMSE Metric with a logarithm (typically $\log(1 + \text{value})$, so that zero values are handled) applied to both the actual and predicted values.
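Written out:

$$RMSLE = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left(\log(1 + \hat{y}_i) - \log(1 + y_i)\right)^2}$$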
Advantages (compared to RMSE):
Disadvantages (compared to RMSE):
We have data that ranges from 1 to 10. We have two actual vs predicted values:
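- Case 1: Actual = 1.0, Predicted = 2.0
- Case 2: Actual = 10.0, Predicted = 11.0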
From a human's perspective, an actual value of 1.0 with a prediction of 2.0 feels 'way off', whereas an actual value of 10 with a prediction of 11 feels 'pretty close'. In the case of RMSE these two cases receive exactly the same penalty, but you'd like Case 1 to have a higher penalty. RMSLE will give a higher (relative) penalty to Case 1.
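Here's a minimal sketch with NumPy (the arrays are just the two cases above) showing the per-case penalties for RMSE vs. RMSLE:

```python
import numpy as np

# Case 1: actual 1.0, predicted 2.0 | Case 2: actual 10.0, predicted 11.0
actual = np.array([1.0, 10.0])
predicted = np.array([2.0, 11.0])

# RMSE penalty: the squared error is identical (1.0) for both cases
squared_errors = (actual - predicted) ** 2
print("Squared errors:", squared_errors)          # [1. 1.]

# RMSLE penalty: the squared log error is much larger for Case 1
squared_log_errors = (np.log1p(predicted) - np.log1p(actual)) ** 2
print("Squared log errors:", squared_log_errors)  # ~[0.164, 0.008]
```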
MAE is also closely related to the MSE Metric. MAE simply takes the absolute value of the error instead of squaring the error. One of the main reasons to use MAE is that it's more robust to outliers than the squared-error metrics.
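In equation form:

$$MAE = \frac{1}{n}\sum_{i=1}^{n} |y_i - \hat{y}_i|$$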
Advantages:
Disadvantages:
Given the popularity of RMSE and the fact that Kaggle uses it for many of its competitions, we suggest starting with that one. If you want a metric that is robust to outliers, then perhaps go with MAE. If you have a target variable with a large range (like from 1 to 100), then RMSLE might be the way to go.
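All of these metrics are easy to compute with scikit-learn. Here's a minimal sketch (the y_true and y_pred arrays are just hypothetical values for illustration):

```python
import numpy as np
from sklearn.metrics import (mean_absolute_error, mean_squared_error,
                             mean_squared_log_error, r2_score)

# Hypothetical actual and predicted values, for illustration only
y_true = np.array([3.0, 5.0, 2.5, 7.0, 10.0])
y_pred = np.array([2.5, 5.0, 3.0, 8.0, 9.0])

r2 = r2_score(y_true, y_pred)                            # coefficient of determination
mse = mean_squared_error(y_true, y_pred)                 # mean squared error
rmse = np.sqrt(mse)                                      # root mean squared error
rmsle = np.sqrt(mean_squared_log_error(y_true, y_pred))  # root mean squared log error
mae = mean_absolute_error(y_true, y_pred)                # mean absolute error

print(f"R2: {r2:.3f}  MSE: {mse:.3f}  RMSE: {rmse:.3f}  RMSLE: {rmsle:.3f}  MAE: {mae:.3f}")
```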
Well, hopefully this overview of the various metrics for regression model performance was helpful. If you'd like additional details, we suggest the follow-up reading links and the resources at the top.
If you liked this notebook, please visit SCP Labs for more notebooks and examples, or visit our company page, SuperCowPowers, for consulting and development services.
# This cell is simply for adding some CSS (Ignore it :)
from IPython.core.display import HTML
def css_styling():
    styles = open("styles/custom.css", "r").read()
    return HTML(styles)
css_styling()