Why is Adjusted R-Squared preferred over R-Squared?

Photo by Karolina Grabowska: https://www.pexels.com/photo/photo-of-person-using-ruler-4219524/

R-Squared measures how much of the variation in the dependent variable is explained by its linear relationship with the independent variables. It is the regression sum of squares divided by the total sum of squares, or equivalently one minus the error sum of squares divided by the total sum of squares: R² = SSR / SST = 1 − SSE / SST.

Here SST is the total sum of squares, SSR is the regression sum of squares, and SSE is the error sum of squares, with SST = SSR + SSE.

https://365datascience.com/tutorials/statistics-tutorials/sum-squares/
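The decomposition above can be checked numerically. The sketch below fits a one-variable least-squares line with numpy and verifies that both forms of R-Squared agree; the data values are illustrative, not from the article.

```python
import numpy as np

# Toy data: a roughly linear trend with a little noise (illustrative values)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Fit a one-variable least-squares line
slope, intercept = np.polyfit(x, y, 1)
y_hat = slope * x + intercept

sst = np.sum((y - y.mean()) ** 2)       # total sum of squares
sse = np.sum((y - y_hat) ** 2)          # error (residual) sum of squares
ssr = np.sum((y_hat - y.mean()) ** 2)   # regression sum of squares

r2 = 1 - sse / sst
# SST = SSR + SSE, so both forms of R-squared agree
assert np.isclose(r2, ssr / sst)
print(r2)
```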

Because SST is fixed for a given dataset, R-Squared never decreases when variables are added to the model, even if a new variable is not significant. Each added variable can only raise SSR (or leave it unchanged), which in turn lowers SSE.
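This effect is easy to demonstrate: in the sketch below (a hypothetical setup, with a helper `r_squared` defined for illustration), y depends only on x, yet adding a pure-noise predictor still cannot lower R-Squared.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 50
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)   # y truly depends on x only
noise = rng.normal(size=n)         # an irrelevant predictor

def r_squared(X, y):
    """R-squared of an ordinary least-squares fit with an intercept."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    sse = np.sum((y - X @ beta) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    return 1 - sse / sst

r2_one = r_squared(x.reshape(-1, 1), y)
r2_two = r_squared(np.column_stack([x, noise]), y)

# Adding a non-significant variable still nudges R-squared upward (or leaves it equal)
assert r2_two >= r2_one
print(r2_one, r2_two)
```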

This inflation is corrected by the Adjusted R-Squared, which penalises the model for the number of predictors: Adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1), where n is the number of observations and k is the number of predictors. It increases only if a new variable improves the model by more than would be expected by chance, and it is always less than or equal to R-Squared.
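A minimal sketch of the penalty, assuming the standard adjusted R-Squared formula with n observations and k predictors (the fit helper and data are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

n = 40
x = rng.normal(size=n)
y = 3.0 * x + rng.normal(size=n)

def fit_r2(X, y):
    """R-squared of an ordinary least-squares fit with an intercept."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    sse = np.sum((y - X @ beta) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    return 1 - sse / sst

def adjusted_r2(r2, n, k):
    # Penalises extra predictors: k is the number of predictors
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

r2_1 = fit_r2(x.reshape(-1, 1), y)
X2 = np.column_stack([x, rng.normal(size=n)])  # add a pure-noise predictor
r2_2 = fit_r2(X2, y)

adj_1 = adjusted_r2(r2_1, n, 1)
adj_2 = adjusted_r2(r2_2, n, 2)

# R-squared cannot go down, but adjusted R-squared carries a penalty
# and never exceeds R-squared
assert r2_2 >= r2_1
assert adj_1 <= r2_1 and adj_2 <= r2_2
print(r2_1, adj_1, r2_2, adj_2)
```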
