Plug-in Approach to Active Learning

Series: Stochastics Seminar
Time: Thursday, March 3, 2011 - 3:05pm for 1 hour (actually 50 minutes)
Location: Skiles 005
Speaker: Stas Minsker – Georgia Tech
Organizer: Yuri Bakhtin

Let (X,Y) be a random couple with unknown distribution P, X being an observation and Y a binary label to be predicted. In practice, the distribution P remains unknown, but the learning algorithm has access to training data, a sample from P. It often happens that the cost of obtaining the training data lies in labeling the observations, while the pool of observations itself is almost unlimited. This suggests measuring the performance of a learning algorithm in terms of its label complexity, the number of labels required to obtain a classifier with the desired accuracy. Active Learning theory explores the possible advantages of this modified framework. We will present a new active learning algorithm based on nonparametric estimators of the regression function and explain the main improvements over previous work. Our investigation provides upper and lower bounds for the performance of the proposed method over a broad class of underlying distributions.
