Statistical learning theory was developed between 1960 and 1990, mainly by Vladimir Vapnik and Alexey Chervonenkis. The theory explains the learning process from a statistical point of view.

This foundational theory unifies such disparate algorithms as neural networks, principal component analysis, and maximum likelihood.

The theory covers four parts (extracted from "The Nature of Statistical Learning Theory"):

  • Theory of consistency of learning processes
    • What are the (necessary and sufficient) conditions for consistency of a learning process based on the empirical risk minimization principle?
  • Nonasymptotic theory of the rate of convergence of learning processes
    • How fast is the rate of convergence of the learning process?
  • Theory of controlling the generalization ability of learning processes
    • How can one control the rate of convergence (the generalization ability) of the learning process?
  • Theory of constructing learning machines
    • How can one construct algorithms that can control the generalization ability?
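
The empirical risk minimization principle named in the first question can be illustrated with a minimal sketch. It assumes a toy setting that is not taken from the source: one-dimensional inputs, noisy binary labels, and a finite class of threshold classifiers h(x) = 1 if x >= t. All function names and parameters (empirical_risk, erm_threshold, make_sample, the noise level) are hypothetical and chosen only for illustration.

```python
import random


def empirical_risk(threshold, sample):
    """Fraction of (x, y) pairs misclassified by h(x) = 1 if x >= threshold else 0."""
    errors = sum(1 for x, y in sample if int(x >= threshold) != y)
    return errors / len(sample)


def erm_threshold(sample, candidates):
    """Empirical risk minimization: pick the candidate with the lowest training error."""
    return min(candidates, key=lambda t: empirical_risk(t, sample))


def make_sample(n, true_threshold=0.5, noise=0.1, seed=0):
    """Draw n points uniformly from [0, 1) and flip each label with probability `noise`."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        x = rng.random()
        y = int(x >= true_threshold)
        if rng.random() < noise:
            y = 1 - y
        sample.append((x, y))
    return sample


# Hypothesis class: 101 thresholds on a grid over [0, 1] (an assumption of this sketch).
candidates = [i / 100 for i in range(101)]

train = make_sample(200, seed=0)
held_out = make_sample(5000, seed=1)

best = erm_threshold(train, candidates)
train_risk = empirical_risk(best, train)        # empirical risk on the training sample
held_out_risk = empirical_risk(best, held_out)  # estimate of the true (expected) risk

print(f"threshold={best:.2f} train_risk={train_risk:.3f} held_out_risk={held_out_risk:.3f}")
```

Rerunning the sketch with larger training samples gives an informal feel for the consistency question: whether the empirical risk of the selected hypothesis and its risk on fresh data approach the same value.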

The last part of the theory introduced a well-known learning algorithm: the support vector machine.

Statistical learning theory contains important concepts such as the VC dimension and structural risk minimization. This theory is the foundation for a rigorous understanding of machine learning.
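
The VC dimension of a hypothesis class is the size of the largest set of points that the class can shatter, that is, label in every possible way. The following brute-force sketch checks shattering for two simple classes on the real line, thresholds and closed intervals. The helper names (realizable_labelings, is_shattered, largest_shattered_subset) and the small integer grid are assumptions made for illustration, and the search only examines subsets of that grid, so in general it gives a lower bound rather than a proof.

```python
from itertools import combinations


def realizable_labelings(points, kind):
    """All labelings of `points` that the hypothesis class `kind` can produce."""
    pts = sorted(set(points))
    # Representative cut positions: below all points, between neighbours, above all points.
    cuts = [pts[0] - 1] + [(a + b) / 2 for a, b in zip(pts, pts[1:])] + [pts[-1] + 1]
    if kind == "threshold":   # h(x) = 1 if x >= t else 0
        return {tuple(int(x >= t) for x in points) for t in cuts}
    if kind == "interval":    # h(x) = 1 if a <= x <= b else 0
        return {tuple(int(a <= x <= b) for x in points) for a in cuts for b in cuts}
    raise ValueError(f"unknown hypothesis class: {kind}")


def is_shattered(points, kind):
    """True if every one of the 2^n labelings of n distinct points is realizable."""
    return len(realizable_labelings(points, kind)) == 2 ** len(points)


def largest_shattered_subset(grid, kind, max_size):
    """Size of the largest subset of `grid` (up to max_size) that is shattered."""
    best = 0
    for d in range(1, max_size + 1):
        if any(is_shattered(subset, kind) for subset in combinations(grid, d)):
            best = d
    return best


grid = [0, 1, 2, 3, 4]
print(largest_shattered_subset(grid, "threshold", 4))  # thresholds on the line
print(largest_shattered_subset(grid, "interval", 4))   # closed intervals on the line
```

For these two classes the search agrees with the known values: thresholds on the line have VC dimension 1, because no threshold can label the left of two points 1 and the right one 0, and intervals have VC dimension 2, because no interval can label the outer two of three points 1 and the middle one 0.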

This theory is related to mathematical subjects such as:

References