R-squared measures the proportion of variation in the dependent variable explained by the independent variables.
Adjusted R-squared estimates the proportion of variation explained by only those independent variables that actually affect the dependent variable, correcting for the number of predictors in the model.
R-squared (R²) and adjusted R-squared are both metrics used to evaluate the goodness of fit of a regression model. However, they have different interpretations and purposes:
- R-squared (R²):
- R-squared is a measure of how well the independent variables in a regression model explain the variability of the dependent variable.
- It ranges from 0 to 1, where 0 indicates that the independent variables do not explain any of the variability of the dependent variable, and 1 indicates that they explain all of the variability.
- R-squared never decreases as you add more independent variables to the model, even when they are not statistically significant, which can encourage overfitting.
- Adjusted R-squared:
- Adjusted R-squared adjusts the R-squared value for the number of predictors in the model.
- It penalizes the addition of unnecessary predictors that do not significantly improve the model’s fit.
- Adjusted R-squared accounts for the degrees of freedom (the sample size relative to the number of predictors) to give a less biased estimate of the model's explanatory power.
- Unlike R-squared, adjusted R-squared can decrease as you add more predictors if those predictors do not improve the fit enough to justify the lost degrees of freedom (see the formula below).
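Concretely, with $n$ observations, $p$ predictors, residual sum of squares $SS_{\text{res}}$, and total sum of squares $SS_{\text{tot}}$, the two metrics are given by the standard formulas for OLS with an intercept:

$$
R^2 = 1 - \frac{SS_{\text{res}}}{SS_{\text{tot}}},
\qquad
\text{adjusted } R^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}.
$$

Holding $R^2$ fixed, increasing $p$ shrinks the denominator $n - p - 1$ and inflates the penalty factor, so adjusted $R^2$ falls unless the new predictor raises $R^2$ enough to compensate.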
In summary, while R-squared gives an overall measure of how well the independent variables explain the variability of the dependent variable, adjusted R-squared provides a more conservative measure that adjusts for the number of predictors in the model, thus helping to guard against overfitting. In general, adjusted R-squared is considered a more reliable metric when comparing models with different numbers of predictors.
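The contrast is easy to see numerically. Below is a minimal sketch, assuming NumPy and scikit-learn are available (the helper `r2_and_adjusted` is illustrative, not a library function): it fits OLS on one informative predictor, then pads the design matrix with pure-noise columns, so R-squared creeps up while adjusted R-squared typically falls.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 200

# One genuinely informative predictor plus noise in the response.
x_signal = rng.normal(size=(n, 1))
y = 3.0 * x_signal[:, 0] + rng.normal(scale=2.0, size=n)

def r2_and_adjusted(X, y):
    """Return (R^2, adjusted R^2) for an OLS fit with an intercept."""
    model = LinearRegression().fit(X, y)
    r2 = model.score(X, y)  # in-sample R^2
    n_obs, p = X.shape
    adj = 1 - (1 - r2) * (n_obs - 1) / (n_obs - p - 1)
    return r2, adj

# Baseline: just the informative predictor.
print("signal only: R2=%.4f  adj R2=%.4f" % r2_and_adjusted(x_signal, y))

# Pad with 10 pure-noise predictors: R^2 cannot decrease,
# but adjusted R^2 is free to (and typically does) drop.
X_padded = np.hstack([x_signal, rng.normal(size=(n, 10))])
print("with noise:  R2=%.4f  adj R2=%.4f" % r2_and_adjusted(X_padded, y))
```

In practice, statsmodels' OLS results expose both values directly as `rsquared` and `rsquared_adj`, so the adjustment rarely needs to be computed by hand.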