Statistical Algorithms and a Lower Bound for Detecting a Planted Clique

Series
High-Dimensional Phenomena in Statistics and Machine Learning Seminar
Time
Tuesday, October 2, 2012 - 3:05pm for 1.5 hours (actually 80 minutes)
Location
Skyles 005
Speaker
Santosh Vempala – Georgia Institute of Technology
Organizer
Karim Lounici
We present a framework for proving lower bounds on computational problems over distributions, including optimization and unsupervised learning. The framework is based on defining a restricted class of algorithms, called statistical algorithms, that instead of directly accessing samples from the input distribution can only obtain an estimate of the expectation of any given function on the input distribution. Our definition captures many natural algorithms used in theory and practice, e.g., moment-based methods, local search, linear programming, MCMC and simulated annealing. Our techniques are inspired by the statistical query model of Kearns from learning theory, which addresses the complexity of PAC-learning. For specific well-known problems over distributions, we obtain lower bounds on the complexity of any statistical algorithm. These include an exponential lower bounds for moment maximization and a nearly optimal lower bound for detecting planted clique distributions when the planted clique has size n^{1/2-\delta} for any constant \delta > 0. Variants of the latter problem have been assumed to be hard to prove hardness for other problems and for cryptographic applications. Our lower bounds provide concrete evidence of hardness. This is joint work with V. Feldman, E. Grigorescu, L. Reyzin and Y. Xiao.