To verify how well statistical learning machines have learned a phenomenon, a step called model evaluation is carried out. This aims to evaluate whether the model (the mathematical abstraction learned from specific instances) can generalize well.

Overfitting

A model may not generalize well if it overfits the data [1], which occurs when
the model becomes sensitive to the noise in the data set. Below, we illustrate
on a concrete example the concept of model overfitting.

Learning word-by-word and not the general concepts

A practical example of overfitting would be that of a student who tries very
hard to learn every words in a textbook by heart but who fails to pick up the
general concepts taught by the textbook.

Reference

  1. Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The Elements of Statistical Learning. Springer Series in Statistics. Springer New York Inc., New York, NY, USA, 2001.
  2. Yeom H, Kim J, Chung C Creative Commons Attribution 4.0 International