In regression, prediction error is the sum of bias error, variance error, and irreducible error. Bias and variance can be reduced through modeling choices, but the irreducible error cannot.
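As a brief sketch of this decomposition (assuming squared-error loss, a true function f, a fitted model f̂, and noise variance σ²), the expected squared prediction error at a point x splits as:

```latex
% Expected squared prediction error under squared-error loss.
% The first two terms depend on the model and can be reduced; \sigma^2 cannot.
\mathbb{E}\big[(Y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible error}}
```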
In regression, the error term (also known as the residual or residual error) represents the difference between the observed values of the dependent variable and the values predicted by the regression model. It is essentially the deviation of each data point from the regression line. The error term is composed of several components:
- Random Error (ε): This component represents the unpredictable variability in the dependent variable that cannot be explained by the independent variables included in the model. It captures the influence of unobserved factors or measurement errors.
- Systematic Error (Bias): Systematic error represents any consistent, non-random discrepancy between the observed and true values. It could be due to model misspecification or the omission of relevant variables in the regression model.
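As an illustrative sketch (the variable names, coefficients, and data-generating process are assumptions, not from the text), the snippet below simulates a response that contains both a random error component and a systematic error caused by omitting a relevant, correlated variable from the fitted model:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Two true predictors; x2 is correlated with x1 and will be omitted from the fit.
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(scale=0.5, size=n)

# True data-generating process: eps is the random (irreducible) error.
eps = rng.normal(scale=1.0, size=n)
y = 2.0 + 1.5 * x1 + 0.8 * x2 + eps

# Misspecified model: regress y on x1 only. Omitting x2 introduces a
# systematic error (omitted-variable bias) in the estimated slope.
X = np.column_stack([np.ones(n), x1])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta_hat

print("estimated intercept and slope:", np.round(beta_hat, 3))  # slope biased above 1.5
print("residual variance:", round(float(residuals.var()), 3))   # inflated above the noise variance of 1.0
```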
The overall error term in a regression model is expressed as:
Yi = β0 + β1Xi1 + β2Xi2 + … + βkXik + εi
where:
- Yi is the observed dependent variable for the ith observation.
- β0 is the intercept term.
- β1, β2, …, βk are the coefficients of the independent variables.
- Xi1, Xi2, …, Xik are the values of the independent variables for the ith observation.
- εi is the error term for the ith observation.
When fitting a regression model, the goal is to minimize the sum of squared residuals (the sum of squared errors), so that the model provides the closest fit to the observed data.
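A minimal sketch, assuming NumPy and simulated data (the sample size, number of predictors, and coefficient values are illustrative assumptions), of fitting the model above by ordinary least squares and computing the sum of squared residuals that the fit minimizes:

```python
import numpy as np

rng = np.random.default_rng(42)
n, k = 200, 3                                   # observations and predictors (illustrative)

X = rng.normal(size=(n, k))                     # Xi1 ... Xik
beta_true = np.array([4.0, 2.0, -1.0, 0.5])     # β0 ... βk (assumed for the demo)
eps = rng.normal(scale=1.0, size=n)             # εi, the random error term
y = beta_true[0] + X @ beta_true[1:] + eps      # Yi = β0 + β1Xi1 + ... + βkXik + εi

# Ordinary least squares: choose coefficients that minimize the sum of squared residuals.
X_design = np.column_stack([np.ones(n), X])     # prepend a column of ones for the intercept
beta_hat, *_ = np.linalg.lstsq(X_design, y, rcond=None)

residuals = y - X_design @ beta_hat
sse = np.sum(residuals ** 2)                    # sum of squared errors minimized by the fit

print("estimated coefficients:", np.round(beta_hat, 3))
print("sum of squared residuals:", round(float(sse), 3))
```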