Plug-in Approach to Active Learning

Series
Stochastics Seminar
Time
Thursday, March 3, 2011 - 3:05pm for 1 hour (actually 50 minutes)
Location
Skiles 005
Speaker
Stas Minsker – Georgia Tech
Organizer
Yuri Bakhtin
Let (X,Y) be a random couple with unknown distribution P, where X is an observation and Y is a binary label to be predicted. In practice, the distribution P remains unknown, but the learning algorithm has access to training data, a sample from P. It often happens that the cost of obtaining the training data comes mainly from labeling the observations, while the pool of observations itself is almost unlimited. This suggests measuring the performance of a learning algorithm in terms of its label complexity, the number of labels required to obtain a classifier with the desired accuracy. Active Learning theory explores the possible advantages of this modified framework. We will present a new active learning algorithm based on nonparametric estimators of the regression function and explain its main improvements over previous work. Our investigation provides upper and lower bounds for the performance of the proposed method over a broad class of underlying distributions.
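
For readers new to the terminology, the standard definitions behind the abstract can be written out as follows. This is only a sketch in textbook notation, assuming the label is coded as Y in {0,1}; the symbols \eta, \hat\eta, \hat g, R, g^* and N(\varepsilon) and the coding of Y are conventions chosen here and are not taken from the talk.

```latex
% Assumption: binary labels coded as Y \in \{0,1\} (notation chosen for illustration).
% Regression function: conditional probability of the label 1 given the observation.
\eta(x) = \mathbb{E}[\,Y \mid X = x\,] = P(Y = 1 \mid X = x)

% Plug-in classifier: threshold a nonparametric estimator \hat\eta of \eta at 1/2.
\hat g(x) = \mathbf{1}\{\hat\eta(x) \ge 1/2\}

% Risk of a classifier g, and the Bayes classifier g^* that minimizes it.
R(g) = P\bigl(g(X) \ne Y\bigr), \qquad g^*(x) = \mathbf{1}\{\eta(x) \ge 1/2\}

% Label complexity (informally): the number of requested labels N(\varepsilon)
% needed for the excess risk to fall below a target accuracy \varepsilon.
R(\hat g) - R(g^*) \le \varepsilon
```

In the active setting the learner chooses which observations to have labeled, so the question is how small N(\varepsilon) can be made compared with labeling a sample drawn at random.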