Generalization error

In supervised learning applications in machine learning and statistical learning theory, generalization error[1] (also known as the out-of-sample error[2] or the risk) is a measure of how accurately an algorithm predicts outcome values for previously unseen data. Because learning algorithms are evaluated on finite samples, such evaluations are sensitive to sampling error. As a result, measurements of prediction error on the current data may say little about predictive ability on new data. Generalization error can be reduced by avoiding overfitting in the learning algorithm. The performance of a machine learning algorithm is commonly visualized by learning curves, plots that show estimates of the generalization error as the learning process progresses.
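One common formalization, consistent with the texts cited below (the particular symbols for the loss \(V\) and data distribution \(\rho\) are a notational choice here), defines the generalization error of a learned function \(f\) as the expected loss under the unknown joint distribution of inputs \(x\) and outputs \(y\):

\[
I[f] = \mathbb{E}_{(x,y)\sim\rho}\bigl[V(f(x), y)\bigr] = \int_{X \times Y} V\bigl(f(x), y\bigr)\, d\rho(x, y).
\]

Its finite-sample counterpart, the empirical error on \(n\) training points \((x_1, y_1), \ldots, (x_n, y_n)\), is

\[
\hat{I}_n[f] = \frac{1}{n} \sum_{i=1}^{n} V\bigl(f(x_i), y_i\bigr),
\]

and the difference \(I[f] - \hat{I}_n[f]\), often called the generalization gap, is what overfitting inflates.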

  1. ^ Mohri, M., Rostamizadeh, A., Talwalkar, A. (2018). Foundations of Machine Learning, 2nd ed. Cambridge, MA: MIT Press.
  2. ^ Abu-Mostafa, Y. S., Magdon-Ismail, M., Lin, H.-T. (2012). Learning from Data. AMLBook. ISBN 978-1600490064.