Suppose you found that your model is suffering from low bias and high variance. Which algorithm do you think could tackle this situation, and why?

Type 1: How to tackle high variance?

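  • Low bias means the model's predicted values are close to the actual values. High variance means the model fits the training data closely but generalizes poorly to unseen data.
  • In this case, we can use a bagging algorithm (e.g. Random Forest) to tackle the high variance problem.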
  • A bagging algorithm draws multiple subsets from the data set by repeated random sampling with replacement (bootstrap samples).
  • A separate model is then trained on each sample using the same learning algorithm. Later, the model predictions are combined using voting (classification) or averaging (regression).
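
The bagging steps above can be sketched in plain Python. This is a minimal illustration, not a Random Forest: the base learner is a one-split decision stump, and the function names (fit_stump, fit_bagging, predict_bagging) and the toy data are assumptions made up for this example.

```python
import random
from collections import Counter

def fit_stump(X, y):
    """Fit a one-split decision stump by brute-force search over thresholds."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [label for row, label in zip(X, y) if row[j] <= t]
            right = [label for row, label in zip(X, y) if row[j] > t]
            if not left or not right:
                continue
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            errors = (sum(l != left_label for l in left)
                      + sum(r != right_label for r in right))
            if best is None or errors < best[0]:
                best = (errors, j, t, left_label, right_label)
    if best is None:
        # No valid split (all rows identical): always predict the majority class.
        majority = Counter(y).most_common(1)[0][0]
        return (0, float("inf"), majority, majority)
    return best[1:]

def predict_stump(stump, row):
    j, t, left_label, right_label = stump
    return left_label if row[j] <= t else right_label

def fit_bagging(X, y, n_models=25, seed=0):
    """Train one stump per bootstrap sample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models

def predict_bagging(models, row):
    """Combine the individual predictions by majority vote (classification)."""
    votes = Counter(predict_stump(m, row) for m in models)
    return votes.most_common(1)[0][0]

# Toy, made-up data: one feature, two well-separated classes.
X = [[1.0], [2.0], [3.0], [7.0], [8.0], [9.0]]
y = [0, 0, 0, 1, 1, 1]
models = fit_bagging(X, y)
print(predict_bagging(models, [0.5]), predict_bagging(models, [9.5]))
```

For regression, the vote in predict_bagging would be replaced by the mean of the individual predictions.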

Type 2: How to tackle high variance?

  • Lower the model complexity by using a regularization technique, which penalizes large model coefficients.
  • You can also keep only the top n features from the variable importance chart. With all the variables in the data set, the algorithm may have difficulty finding the meaningful signal.
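
Both ideas can be sketched in a few lines of plain Python. The first function assumes an L2 (ridge-style) penalty as the regularization technique and fits a linear model without an intercept by gradient descent. The second picks the top n features from a mapping of importance scores. The function names, the toy data, and the importance values are all hypothetical and chosen only for illustration.

```python
def fit_linear(X, y, l2=0.0, lr=0.1, steps=500):
    """Gradient descent on mean squared error + l2 * sum(w_j ** 2), no intercept."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(steps):
        residuals = [sum(wj * xj for wj, xj in zip(w, row)) - target
                     for row, target in zip(X, y)]
        for j in range(d):
            grad = 2.0 / n * sum(r * row[j] for r, row in zip(residuals, X))
            grad += 2.0 * l2 * w[j]  # the penalty pulls each coefficient toward zero
            w[j] -= lr * grad
    return w

def top_n_features(importances, n):
    """Return the n feature names with the highest importance scores."""
    ranked = sorted(importances, key=importances.get, reverse=True)
    return ranked[:n]

# Toy, made-up data where y = 3 * x1 + 0.5 * x2 exactly.
X = [[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]]
y = [-3.5, -2.5, 2.5, 3.5]

w_plain = fit_linear(X, y, l2=0.0)  # unpenalized fit
w_ridge = fit_linear(X, y, l2=1.0)  # penalized fit: smaller coefficients
print(w_plain, w_ridge)

# Hypothetical importance scores, e.g. read off a variable importance chart.
importances = {"f1": 0.40, "f2": 0.05, "f3": 0.30, "f4": 0.25}
selected = top_n_features(importances, 2)
print(selected)
```

On this toy data the penalized coefficients come out smaller in magnitude than the unpenalized ones, which is the shrinkage effect that lowers variance. In practice the penalty strength and the value of n are usually chosen by cross-validation.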