Vectors, Sampling and Massive Data

ACO Distinguished Lecture
Tuesday, November 1, 2011 - 16:30
1 hour (actually 50 minutes)
Klaus 1116
Microsoft Research India

There will be a reception in the Atrium of the Klaus building at 4 PM.

Modeling data as high-dimensional (feature) vectors is a staple of Computer Science; its use in ranking web pages is a reminder of its effectiveness. Algorithms from Linear Algebra (LA) provide a crucial toolkit, but for modern problems with massive data these algorithms may take too long. Random sampling to reduce the size of the data suggests itself. I will give a from-first-principles description of the LA connection, then discuss sampling techniques developed over the last decade for vectors, matrices and graphs. Besides saving time, sampling leads to sparsification and compression of data.