What is a good metric for measuring the level of multicollinearity?

VIF (variance inflation factor), the reciprocal of tolerance, is a good metric for measuring multicollinearity in models. Tolerance is the proportion of a predictor's variance that is not explained by the other predictors, so VIF = 1/tolerance = 1/(1 - R²), where R² comes from regressing that predictor on all the others. The higher the VIF, the greater the multicollinearity among the predictors. A rule of thumb for interpreting the variance inflation factor: 1 = not correlated. Between … Read more
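
As a rough illustration of the definition VIF = 1/(1 - R²), the sketch below computes a VIF for each column by regressing it on the remaining columns with ordinary least squares. The function name `vif` and the synthetic columns `x1`, `x2`, `x3` are assumptions made for this example only.

```python
import numpy as np

def vif(X):
    """Return the VIF of each column of X (n_samples x n_features)."""
    X = np.asarray(X, dtype=float)
    out = []
    for j in range(X.shape[1]):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress column j on the other columns, with an intercept term.
        A = np.column_stack([np.ones(len(y)), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - resid.var() / y.var()
        out.append(1.0 / (1.0 - r2))  # VIF = 1 / tolerance
    return np.array(out)

# Hypothetical data: x3 is almost a copy of x1, x2 is independent.
rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
x2 = rng.normal(size=500)
x3 = x1 + 0.1 * rng.normal(size=500)
vifs = vif(np.column_stack([x1, x2, x3]))
print(vifs)
```

In this setup the two nearly identical columns should show a large VIF, while the independent column should stay close to 1.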

Which type of sampling is better for a classification model and why?

Stratified sampling is better for classification problems because it preserves the balance of classes in the train and test sets. The proportion of each class is maintained in both sets, so the model trains on a representative sample and its evaluation is more reliable, especially for imbalanced data. With random sampling, the data is divided into two parts without taking into consideration the balance … Read more
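
A minimal sketch of a stratified split, using only the Python standard library, might look like the following. The helper `stratified_split` and the 90/10 label list are hypothetical and exist only to show that each class keeps its proportion in both parts.

```python
import random
from collections import Counter, defaultdict

def stratified_split(labels, test_frac=0.2, seed=0):
    """Split indices so each class keeps roughly the same share in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train, test = [], []
    for idx in by_class.values():
        rng.shuffle(idx)                      # shuffle within the class
        n_test = round(len(idx) * test_frac)  # per-class test size
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return train, test

# Hypothetical imbalanced labels: 90 negatives, 10 positives.
labels = [0] * 90 + [1] * 10
train_idx, test_idx = stratified_split(labels)
train_counts = Counter(labels[i] for i in train_idx)
test_counts = Counter(labels[i] for i in test_idx)
print(train_counts, test_counts)
```

A purely random 20% split of the same labels could easily end up with no positives, or several, in the test set; splitting per class avoids that.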

If we have a high bias error what does it mean? How to treat it?

High bias error means that the model we are using is ignoring the important trends in the data and is underfitting. To reduce underfitting, we need to increase the complexity of the model and increase the number of features. High bias can sometimes also give the impression that the data is noisy. Hence … Read more
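
To make underfitting concrete, the sketch below fits a straight line and a quadratic to synthetic data generated from a quadratic curve. The data, the noise level, and the helper `train_mse` are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical data: a quadratic trend plus a little noise.
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 200)
y = x**2 + rng.normal(scale=0.5, size=x.size)

def train_mse(degree):
    """Training mean squared error of a polynomial fit of the given degree."""
    coefs = np.polyfit(x, y, degree)
    return np.mean((np.polyval(coefs, x) - y) ** 2)

mse_linear = train_mse(1)     # too simple for the trend: high bias
mse_quadratic = train_mse(2)  # matches the shape of the data
print(mse_linear, mse_quadratic)
```

The linear model has a large error even on the data it was trained on, which is the signature of high bias; adding the squared term (a more complex model with one more feature) removes most of it.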

What ensemble technique is used by gradient boosting trees?

Boosting is the technique used by GBM (gradient boosting machines). Gradient boosting is an ensemble learning method that combines the predictions of multiple weak learners, typically decision trees, to create a strong predictive model. In the case of gradient boosting trees, each tree is built sequentially, … Read more
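
The sequential idea can be sketched with one-split trees (stumps) and squared error, where each new stump is fit to the residuals left by the ensemble so far. This is a simplified illustration and not the implementation used by any particular GBM library; `fit_stump`, the learning rate, and the toy sine data are all assumptions.

```python
import numpy as np

def fit_stump(x, r):
    """Best single-threshold split of 1-D x for squared error on r."""
    best = None
    for t in np.unique(x)[:-1]:  # skip the max so the right side is never empty
        left, right = r[x <= t], r[x > t]
        err = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or err < best[0]:
            best = (err, t, left.mean(), right.mean())
    _, t, lo, hi = best
    return lambda q: np.where(q <= t, lo, hi)

# Hypothetical data: a noisy sine curve.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 6, 200))
y = np.sin(x) + rng.normal(scale=0.1, size=x.size)

learning_rate = 0.1
pred = np.full_like(y, y.mean())        # start from a constant prediction
errors = [np.mean((y - pred) ** 2)]
for _ in range(50):
    stump = fit_stump(x, y - pred)      # weak learner fit to current residuals
    pred += learning_rate * stump(x)    # add a damped correction
    errors.append(np.mean((y - pred) ** 2))
print(errors[0], errors[-1])
```

Each stump on its own is a poor model, but because every round corrects what the previous rounds missed, the training error shrinks as stumps are added.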

What ensemble technique is used by Random forests?

Bagging (bootstrap aggregating) is the technique used by Random Forests. A random forest is a collection of trees, each trained on a bootstrap sample of the original dataset, with the final prediction being a majority vote of the trees for classification or their average for regression. The correct answer to the question “What ensemble technique is used by Random Forests?” is: Random Forests use the … Read more
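
The bagging step can be sketched as follows: train each learner on a bootstrap sample, then combine them by majority vote. For brevity this uses one-threshold stumps on a single feature instead of full decision trees, and it omits the random feature selection that random forests also apply at each split. The data and helper names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical 1-D problem: class 1 when x > 0, with about 10% of labels flipped.
x = rng.normal(size=300)
y = (x > 0).astype(int)
flip = rng.random(300) < 0.1
y[flip] = 1 - y[flip]

def fit_stump(xs, ys):
    """Pick the threshold t that best separates ys with the rule x > t."""
    candidates = np.unique(xs)
    acc = [np.mean((xs > t).astype(int) == ys) for t in candidates]
    return candidates[int(np.argmax(acc))]

# Bagging: each learner sees a bootstrap sample (drawn with replacement).
thresholds = []
for _ in range(25):
    idx = rng.integers(0, len(x), size=len(x))
    thresholds.append(fit_stump(x[idx], y[idx]))

def predict(q):
    votes = np.array([(q > t).astype(int) for t in thresholds])  # one row per learner
    return (votes.mean(axis=0) > 0.5).astype(int)                # majority vote

accuracy = np.mean(predict(x) == y)
print(accuracy)
```

Using an odd number of learners avoids tied votes in this two-class setting.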