Friday, February 15, 2013 - 14:00
1 hour (actually 50 minutes)
Hosted by the School of Computational Science and Engineering
As the term "big data" appears more and more frequently in our daily life and research activities, it changes our understanding of how large data can be and challenges the use of numerical analysis in statistical calculations. In this talk, I will focus on two basic statistics problems, sampling a multivariate normal distribution and maximum likelihood estimation, and illustrate the scalability issues that dense numerical linear algebra techniques face. This large-scale challenge motivates us to develop scalable methods for the dense matrices commonly seen in statistical analysis. I will present several recent developments in the computation of matrix functions and the solution of linear systems of equations in which the matrices are large-scale and fully dense, but structured. The driving ideas behind these developments are the exploration of these structures and the use of fast matrix-vector multiplications to reduce the quadratic storage cost and cubic computational cost of general dense methods. "Big data" offers a fresh opportunity for numerical analysts to develop algorithms with scalability as a central goal. It also brings a new stream of requests to high performance computing for highly parallel codes to accompany the development of numerical algorithms. Scalable and parallelizable methods are key to convincing statisticians and practitioners to apply powerful statistical theories to large-scale data that they currently feel uncomfortable handling.
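To make the idea of replacing dense factorizations with fast matrix-vector multiplications concrete, here is a minimal sketch, not taken from the talk. It assumes a hypothetical structured matrix A = diag(d) + u uᵀ, which is fully dense yet admits an O(n) product, and solves A x = b with a plain conjugate gradient loop that only ever calls that product. The names `matvec` and `conjugate_gradient` and the chosen structure are illustrative assumptions; the methods presented in the talk may differ substantially.

```python
import math


def matvec(d, u, x):
    """Apply A = diag(d) + u u^T to x in O(n) time and storage.

    The dense n-by-n matrix is never formed (hypothetical structure,
    chosen only for illustration).
    """
    s = sum(ui * xi for ui, xi in zip(u, x))
    return [di * xi + ui * s for di, ui, xi in zip(d, u, x)]


def conjugate_gradient(apply_a, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A using only matvecs."""
    x = [0.0] * len(b)
    r = list(b)              # residual b - A x, with x = 0 initially
    p = list(r)              # search direction
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if math.sqrt(rs_old) < tol:
            break
        ap = apply_a(p)
        alpha = rs_old / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = sum(ri * ri for ri in r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


n = 2000
d = [1.0 + (i % 5) for i in range(n)]   # positive diagonal, so A is SPD
u = [1.0 / math.sqrt(n)] * n            # rank-one term makes A fully dense
b = [1.0] * n

x = conjugate_gradient(lambda v: matvec(d, u, v), b)
residual = math.sqrt(
    sum((bi - yi) ** 2 for bi, yi in zip(b, matvec(d, u, x)))
)
```

Each iteration costs one O(n) product instead of touching n² entries, so storage stays linear in n and no cubic-cost factorization is ever computed. This is the general pattern the abstract alludes to, shown here on a deliberately simple structure.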