PAC Learning

In the last chapter, we showed that for a finite hypothesis class $\mathcal{H}$, as long as the sample is large enough ($m \geq \frac{\log(|\mathcal{H}|/\delta)}{\epsilon}$), applying ERM outputs a hypothesis that is probably (with probability at least $1-\delta$) approximately correct (generalization loss $L_{(\mathcal{D},f)}(h_S) \leq \epsilon$). More generally, we define Probably Approximately Correct (PAC) learning as:

DEFINITION 3.1 (PAC learnability) A hypothesis class $\mathcal{H}$ is PAC learnable if there exist a function $m_{\mathcal{H}} : (0,1)^2 \to \mathbb{N}$ and a learning algorithm with the following property: for every $\epsilon, \delta \in (0,1)$, for every distribution $\mathcal{D}$ over $\mathcal{X}$, and for every labeling function $f : \mathcal{X} \to \{0,1\}$, if the realizable assumption holds with respect to $\mathcal{H}, \mathcal{D}, f$, then running the algorithm on $m \geq m_{\mathcal{H}}(\epsilon, \delta)$ i.i.d. examples generated by $\mathcal{D}$ and labeled by $f$ returns a hypothesis $h$ such that, with probability of at least $1-\delta$ (over the choice of the $m$-tuple of samples), $L_{(\mathcal{D},f)}(h) \leq \epsilon$.
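
To make the definition concrete, here is a minimal simulation sketch (assuming NumPy; the grid-threshold hypothesis class, the labeling threshold `true_t`, and the helper names `erm` and `label` are illustrative choices, not from the text). It runs ERM over a finite class of threshold classifiers on $[0,1]$ under the realizable assumption and estimates how often the returned hypothesis fails the $\epsilon$-accuracy requirement as the sample size grows.

```python
import numpy as np

rng = np.random.default_rng(0)

# Finite hypothesis class: grid thresholds on [0, 1]; h_t(x) = 1 iff x >= t.
thresholds = np.linspace(0.0, 1.0, 101)   # |H| = 101
true_t = 0.37                              # realizable: the labeling function is h_{0.37}, a member of H

def label(x, t):
    """Labels assigned by the threshold classifier h_t."""
    return (x >= t).astype(int)

def erm(X, y):
    """Return a grid threshold with minimal empirical error (ERM)."""
    emp_errors = [np.mean(label(X, t) != y) for t in thresholds]
    return thresholds[int(np.argmin(emp_errors))]

# For a uniform marginal on [0, 1], the generalization error of h_t is exactly |t - true_t|.
eps = 0.05
for m in [10, 50, 200, 1000]:
    trials, failures = 200, 0
    for _ in range(trials):
        X = rng.uniform(0.0, 1.0, m)
        y = label(X, true_t)
        t_hat = erm(X, y)
        if abs(t_hat - true_t) > eps:   # generalization error of the ERM output exceeds eps
            failures += 1
    print(f"m={m:5d}   fraction of runs with L(h) > eps ≈ {failures / trials:.3f}")
```

As $m$ grows, the fraction of runs that miss the accuracy requirement should fall below any target $\delta$, which is exactly the "probably approximately correct" guarantee.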

  • In short, if the hypothesis class you choose is PAC learnable, then increasing the sample size guarantees a reliably bounded generalization loss.

  • Quantitatively, the sample complexity $m_{\mathcal{H}}(\epsilon, \delta)$ (the number of samples needed) is controlled by two parameters, $\epsilon$ and $\delta$.

  • The accuracy parameter $\epsilon$ measures how far the output classifier can be from the optimal one (this corresponds to the "approximately correct" part).

  • The confidence parameter $\delta$ indicates how likely the classifier is to meet that accuracy requirement (this corresponds to the "probably" part of "PAC").

  • The use of $\delta$ is inevitable: with a finite sample, there is always a small chance that the sample is non-representative of the underlying distribution (e.g., the sample contains only one data point).

  • In addition to $\epsilon$ and $\delta$, $m_{\mathcal{H}}$ also depends on properties of the hypothesis class $\mathcal{H}$.

  • All finite hypothesis classes are PAC learnable, and $m_{\mathcal{H}}(\epsilon, \delta) \leq \left\lceil \frac{\log(|\mathcal{H}|/\delta)}{\epsilon} \right\rceil$ (a quick numeric check of this bound follows this list).

  • There are also infinite hypothesis classes that are PAC learnable, so PAC learnability is not determined by finiteness.
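
As referenced in the list above, here is a quick numeric check of the finite-class sample complexity bound (a sketch; the values of $|\mathcal{H}|$, $\epsilon$, and $\delta$ below are arbitrary examples):

```python
import math

def finite_class_sample_complexity(h_size, eps, delta):
    """Upper bound on m_H(eps, delta) for a finite class under realizability:
    m_H(eps, delta) <= ceil(log(|H| / delta) / eps)."""
    return math.ceil(math.log(h_size / delta) / eps)

for h_size in [100, 10_000, 1_000_000]:
    for eps, delta in [(0.1, 0.1), (0.01, 0.05)]:
        m = finite_class_sample_complexity(h_size, eps, delta)
        print(f"|H|={h_size:>9,}  eps={eps}  delta={delta}  ->  m_H <= {m}")
```

Note that the bound grows only logarithmically in $|\mathcal{H}|$ and in $1/\delta$, but linearly in $1/\epsilon$, so tightening the accuracy requirement is what drives the sample size up fastest.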

results matching ""

    No results matching ""