No data analyst interview questions and answers guide would be complete without this question. An outlier is a term commonly used by data analysts for a value that lies far from the overall pattern of a sample. There are two kinds of outliers: univariate (extreme in a single variable) and multivariate (an unusual combination of values across two or more variables).
Two commonly used methods for detecting outliers are:
- Box plot method – According to this method, a value is an outlier if it lies more than 1.5*IQR above the upper quartile (Q3) or more than 1.5*IQR below the lower quartile (Q1), where the IQR (interquartile range) is Q3 - Q1.
- Standard deviation method – According to this method, a value is an outlier if it lies more than three standard deviations from the mean, that is, outside the range mean ± (3*standard deviation).
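
The two rules above can be sketched in a few lines of Python. This is only an illustrative sketch, not a definitive implementation: the function names `iqr_outliers` and `std_outliers`, the sample data, and the choice of the "inclusive" quartile convention and the sample (n - 1) standard deviation are all assumptions made for the example. Other quartile conventions can give slightly different fences.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Box plot rule: flag values more than k*IQR below Q1 or above Q3."""
    # quantiles(n=4) returns the three quartile cut points (Q1, median, Q3).
    # The "inclusive" method is an assumption; conventions differ between tools.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


def std_outliers(values, k=3.0):
    """Standard deviation rule: flag values more than k SDs from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (assumption)
    return [v for v in values if abs(v - mean) > k * sd]


# Hypothetical sample: 19 values between 10 and 13, plus one extreme value.
sample = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11,
          12, 11, 10, 13, 12, 11, 12, 10, 13, 95]

print(iqr_outliers(sample))  # [95]
print(std_outliers(sample))  # [95]
```

Note that the standard deviation rule can miss outliers in very small samples, because a single extreme value inflates both the mean and the standard deviation; the box plot rule is less affected because quartiles are more robust.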