Overfitting is one of the most common problems in machine learning. It happens when a model is fitted too closely to its training dataset, so it loses performance when predicting on new data. In this article I will give a simple overview of the problem and of how we can avoid it.
A machine learning model is built from a sample dataset. However, this sample does not always represent the whole data population. A model built on it can therefore have high variance: it tends to overfit, becoming too closely adjusted to the particular data in the sample. This leads to potential problems when the model is applied to a new dataset.
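To make this concrete, here is a minimal sketch of what overfitting looks like in numbers. Everything in it is an assumption chosen for illustration: the data is synthetic (a noisy sine curve), the sample has only 10 points, and the "model" is a polynomial fitted with NumPy. It compares the error on the training sample with the error on new data for polynomials of increasing degree.

```python
import numpy as np

rng = np.random.default_rng(42)

def true_fn(x):
    # Hypothetical "real" relationship behind the data (an assumption).
    return np.sin(np.pi * x)

# A small training sample and a larger set of new, unseen data.
x_train = np.linspace(-1.0, 1.0, 10)
y_train = true_fn(x_train) + rng.normal(scale=0.3, size=x_train.shape)
x_test = rng.uniform(-1.0, 1.0, size=200)
y_test = true_fn(x_test) + rng.normal(scale=0.3, size=x_test.shape)

def fit_and_score(degree):
    """Fit a polynomial of the given degree; return (train MSE, test MSE)."""
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    train_mse = float(np.mean((np.polyval(coeffs, x_train) - y_train) ** 2))
    test_mse = float(np.mean((np.polyval(coeffs, x_test) - y_test) ** 2))
    return train_mse, test_mse

scores = {degree: fit_and_score(degree) for degree in (1, 3, 9)}
for degree, (train_mse, test_mse) in scores.items():
    print(f"degree {degree}: train MSE = {train_mse:.4f}, test MSE = {test_mse:.4f}")
```

The degree 9 polynomial has enough freedom to pass through all 10 training points, so its training error is practically zero, yet its error on the new data is much larger. That gap between training and test error is the typical sign of an overfitted model.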
How to avoid overfitting.
There are several ways to prevent overfitting in our model:
- adding more data.
- reducing the number of features.
- applying regularization.
- stopping training early.
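
The last two items of the list can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the data is synthetic, the model is plain linear regression, regularization is shown as L2 (ridge) regularization, and the helper names `ridge_fit` and `gd_early_stopping` are hypothetical, not taken from any library.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 40 samples, 10 features, only 3 of which matter.
X = rng.normal(size=(40, 10))
true_w = np.zeros(10)
true_w[:3] = [2.0, -1.0, 0.5]
y = X @ true_w + rng.normal(scale=0.5, size=40)

# Hold out part of the data to measure performance on unseen samples.
X_train, y_train = X[:30], y[:30]
X_val, y_val = X[30:], y[30:]

def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

def ridge_fit(X, y, lam):
    """L2-regularized least squares: larger lam shrinks the weights more."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

def gd_early_stopping(X_tr, y_tr, X_va, y_va, lr=0.01, max_steps=5000, patience=20):
    """Gradient descent that stops once validation error stops improving."""
    w = np.zeros(X_tr.shape[1])
    best_w, best_val, best_step = w.copy(), mse(w, X_va, y_va), 0
    for step in range(1, max_steps + 1):
        grad = 2.0 / len(y_tr) * X_tr.T @ (X_tr @ w - y_tr)
        w = w - lr * grad
        val = mse(w, X_va, y_va)
        if val < best_val:
            # Remember the best weights seen so far on the validation set.
            best_w, best_val, best_step = w.copy(), val, step
        elif step - best_step >= patience:
            break  # no improvement for `patience` steps: stop training
    return best_w, best_val, best_step

w_weak = ridge_fit(X_train, y_train, lam=0.01)
w_strong = ridge_fit(X_train, y_train, lam=10.0)
print("weight norm, lam=0.01:", np.linalg.norm(w_weak))
print("weight norm, lam=10.0:", np.linalg.norm(w_strong))

w_es, best_val, best_step = gd_early_stopping(X_train, y_train, X_val, y_val)
print("early stopping: best step =", best_step, "validation MSE =", best_val)
```

Both techniques limit how closely the model can adjust itself to the training sample: regularization penalizes large weights, and early stopping keeps the weights from the training step that performed best on data the model was not trained on. The values of `lam`, `lr` and `patience` here are arbitrary and would need tuning on a real problem.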