The “naive” in “naive Bayes” refers to the assumption that the features in the dataset are conditionally independent of one another given the class label. The assumption is called naive because it simplifies the model in a way that rarely holds exactly: features in real-world datasets often exhibit some level of correlation or dependence.
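Written out, the assumption says the joint likelihood of the features factorizes into per-feature terms. The notation here is an assumption for illustration and does not come from the original text: y is the class label and x_1, ..., x_n are the features.

```latex
P(x_1, \dots, x_n \mid y) = \prod_{i=1}^{n} P(x_i \mid y)
\quad\Longrightarrow\quad
\hat{y} = \arg\max_{y} \; P(y) \prod_{i=1}^{n} P(x_i \mid y)
```

The second expression follows from Bayes' rule after dropping the denominator P(x_1, ..., x_n), which is the same for every class and so does not affect which class scores highest.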
Despite this simplifying assumption, naive Bayes often performs surprisingly well in practice, especially in text classification and similar tasks. It is computationally efficient and easy to implement, and it can still give reasonably good results in many scenarios, particularly when the independence assumption approximately holds or when training data is limited.
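To give a sense of how little code the method needs, here is a minimal sketch of a multinomial naive Bayes text classifier using only the Python standard library. The function names `train` and `predict`, the toy spam/ham documents, and the choice of add-one (Laplace) smoothing are all assumptions made for illustration, not a reference implementation.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Count documents per class and word occurrences per class.

    docs is a list of (tokens, label) pairs (hypothetical toy format).
    """
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, tokens):
    """Return the class with the highest log prior plus summed log likelihoods."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log P(y)
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # skip words never seen in training
            # Independence assumption: each word contributes its own
            # log P(word | y) term; add-one smoothing avoids log(0).
            count = word_counts[label][tok]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data, invented for this sketch.
docs = [
    ("win cash prize now".split(), "spam"),
    ("cheap prize offer".split(), "spam"),
    ("meeting agenda attached".split(), "ham"),
    ("lunch meeting tomorrow".split(), "ham"),
]
model = train(docs)
print(predict(model, "free cash prize".split()))      # spam
print(predict(model, "agenda for tomorrow".split()))  # ham
```

Training is a single counting pass over the data and prediction is a sum of logarithms per class, which is where the computational efficiency comes from. Working in log space is a common choice to avoid numerical underflow when many small probabilities are multiplied.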
In summary, naive Bayes is “naive” because it assumes the features are independent given the class, but that same simplification is what makes it efficient and effective in many practical applications.