Estimating PageRank on Graph Streams

ACO Student Seminar
Wednesday, October 8, 2008 - 13:30
2 hours
Skiles 269
CS/ACO, Georgia Tech
This study focuses on computations on large graphs (e.g., the web-graph) where the edges of the graph are presented as a stream. The objective in the streaming model is to maintain small amount of memory and perform few passes over the data. In the streaming model, we show how to perform several graph computations including estimating the probability distribution after a random walk of certain length l, estimate the mixing time, and the conductance. We can compute the approximate PageRank values in O(nM^{-1/4}) space and O(M^{3/4}) passes (where n is the number of nodes and M is the mixing time of the graph). In comparison, a standard (matrix-vector multiplication) implementation of the PageRank algorithm will take O(n) space and O(M) passes. The main ingredient in all our algorithms is to explicitly perform several random walks of certain length efficiently in the streaming model. I shall define and motivate the streaming model and the notion of PageRank, and describe our results and techniques. Joint work with Sreenivas Gollapudi and Rina Panigrahy from Microsoft Research.