Excess Risk Bounds in Binary Classification

Stochastics Seminar
Thursday, April 16, 2009 - 15:00
1 hour (actually 50 minutes)
Skiles 269
School of Mathematics, Georgia Tech
In binary classification problems, the goal is to estimate a function g*: S -> {-1,1} minimizing the generalization error (or the risk) L(g) := P{(x,y): y \neq g(x)}, where P is a probability distribution on S x {-1,1}. The distribution P is unknown, and estimators \hat g of g* are based on a finite number of independent random couples (X_j,Y_j) sampled from P. It is of interest to have upper bounds on the excess risk {\cal E}(\hat g) := L(\hat g) - L(g*) of such estimators that hold with high probability and that take into account reasonable measures of complexity of classification problems (such as, for instance, VC-dimension). We will discuss several approaches (both old and new) to excess risk bounds in classification, including some recent results on excess risk in so-called active learning.
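
As a small illustration of the quantities in the abstract (not taken from the talk), the following sketch computes the excess risk of an empirical risk minimizer in a toy problem. Everything specific here is an assumption made for the example: S = [0,1], X uniform, labels given by the sign of x - 0.5 and flipped with probability 0.1, and the class of threshold classifiers g_t(x) = 1 if x >= t, else -1. In this toy model g* is the threshold at 0.5, L(g*) = 0.1, and L(g_t) = 0.1 + 0.8|t - 0.5|, so the excess risk can be evaluated exactly.

```python
import random

NOISE = 0.1  # assumed label-flip probability (hypothetical toy model)


def sample(n, rng):
    """Draw n independent couples (X_j, Y_j) from the toy distribution P."""
    data = []
    for _ in range(n):
        x = rng.random()                # X uniform on S = [0, 1]
        y = 1 if x >= 0.5 else -1       # noiseless label, i.e. g*(x)
        if rng.random() < NOISE:        # flip the label with probability NOISE
            y = -y
        data.append((x, y))
    return data


def classify(t, x):
    """Threshold classifier g_t(x)."""
    return 1 if x >= t else -1


def empirical_risk(t, data):
    """Fraction of sample points misclassified by g_t."""
    return sum(1 for x, y in data if classify(t, x) != y) / len(data)


def true_risk(t):
    """L(g_t) in the toy model: NOISE + (1 - 2*NOISE) * |t - 0.5|."""
    return NOISE + (1 - 2 * NOISE) * abs(t - 0.5)


def excess_risk(t):
    """E(g_t) = L(g_t) - L(g*), with g* the threshold at 0.5."""
    return true_risk(t) - true_risk(0.5)


def erm_threshold(data, grid_size=101):
    """Empirical risk minimizer over an evenly spaced grid of thresholds."""
    grid = [k / (grid_size - 1) for k in range(grid_size)]
    return min(grid, key=lambda t: empirical_risk(t, data))


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (50, 500, 5000):
        t_hat = erm_threshold(sample(n, rng))
        print(n, t_hat, excess_risk(t_hat))
```

Running the script prints the selected threshold and its excess risk for a few sample sizes; the bounds discussed in the talk concern how fast this excess risk shrinks with n, with high probability, for general classes of limited complexity.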