Series: Job Candidate Talk

Mean field variational inference is widely used in statistics and machine learning to approximate posterior distributions. Despite its popularity, there exists remarkably little fundamental theoretical justification for it. The success of variational inference lies mainly in its iterative algorithm, which, to the best of our knowledge, has never been investigated for any high-dimensional or complex model. In this talk, we establish computational and statistical guarantees for mean field variational inference. Using the community detection problem as a test case, we show that its iterative algorithm converges linearly to the optimal statistical accuracy within log n iterations. We are optimistic that this analysis can be extended beyond community detection, toward an understanding of mean field methods under a general class of latent variable models. In addition, the techniques we develop can be extended to analyzing expectation-maximization and the Gibbs sampler.
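For concreteness, here is a minimal sketch (not taken from the talk) of the naive mean-field iteration for community detection in a K-block stochastic block model, assuming the within- and between-community connection probabilities p and q are known and the adjacency matrix has a zero diagonal; the parallel batch update below stands in for one-coordinate-at-a-time ascent for brevity.

```python
import numpy as np

def mean_field_sbm(A, K=2, p=0.6, q=0.2, n_iter=50, seed=0):
    """Naive mean-field iteration for SBM community detection.

    A: (n, n) symmetric 0/1 adjacency matrix with zero diagonal.
    p, q: within/between-community edge probabilities (assumed known here).
    Returns the (n, K) matrix of variational membership probabilities."""
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    xi = rng.dirichlet(np.ones(K), size=n)        # initial soft labels
    B = np.full((K, K), q)
    np.fill_diagonal(B, p)                        # block connection matrix
    logB, log1mB = np.log(B), np.log1p(-B)
    for _ in range(n_iter):
        # expected log-likelihood terms involving each node, under q_{-i}
        S = A @ xi @ logB.T + (1.0 - A - np.eye(n)) @ xi @ log1mB.T
        S -= S.max(axis=1, keepdims=True)         # numerically stable softmax
        xi = np.exp(S)
        xi /= xi.sum(axis=1, keepdims=True)
    return xi
```

Each sweep costs O(n²K), so the log n iteration bound in the talk translates directly into a near-quadratic total running time for this sketch.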

Series: Job Candidate Talk

I will discuss a recent line of research that uses properties of real-rooted polynomials to get quantitative estimates in combinatorial linear algebra problems. I will start by discussing the main result that bridges the two areas (the "method of interlacing polynomials") and show some examples of where it has been used successfully (e.g., Ramanujan families and the Kadison–Singer problem). I will then discuss some more recent work that attempts to make the method more accessible by providing generic tools, and that also attempts to explain the accuracy of the method by linking it to random matrix theory and (in particular) free probability. I will end by mentioning some current research initiatives as well as possible future directions.

Series: Job Candidate Talk

Convex-geometric methods, involving random projection operators and coverings, have been successfully used in the study of the largest and smallest singular values, delocalization of eigenvectors, and in establishing the limiting spectral distribution for certain random matrix models. Further applications of these methods in computer science and statistics include restricted invertibility and dimension reduction, as well as approximation of covariance matrices of multidimensional distributions. Conversely, random linear operators play a very important role in geometric functional analysis. In this talk, I will discuss some recent results (by my collaborators and myself) within convex geometry and the theory of random matrices, focusing on invertibility of square non-Hermitian random matrices (with applications to numerical analysis and the study of the limiting spectral distribution of directed d-regular graphs), approximation of covariance matrices (in particular, a strengthening of the Bai–Yin theorem), and some applications of random operators in convex geometry.

Series: Job Candidate Talk

Semiparametric regressions enjoy the flexibility of nonparametric models as well as the interpretability of linear models. These advantages can be further leveraged with recent advances in high-dimensional statistics. This talk begins with a simple partially linear model, $Y_i = X_i^\top \beta^* + g^*(Z_i) + \varepsilon_i$, where the parameter vector of interest, $\beta^*$, is high-dimensional but sufficiently sparse, and $g^*$ is an unknown nuisance function. In spite of its simple form, this high-dimensional partially linear model plays a crucial role in counterfactual studies of heterogeneous treatment effects. In the first half of this talk, I present an inference procedure for any sub-vector (regardless of its dimension) of the high-dimensional $\beta^*$. This method does not require the “beta-min” condition and also works when the vector of covariates, $Z_i$, is high-dimensional, provided that the conditional means $E(X_{ij}|Z_i)$ and $E(Y_i|Z_i)$ exhibit certain sparsity features, e.g., a sparse additive decomposition structure. In the second half of this talk, I discuss the connections between semiparametric modeling and Rubin’s Causal Framework, as well as the applications of various methods (including the one from the first half of this talk and those from my other papers) in counterfactual studies that are enriched by “big data”.
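To make the model concrete, here is a hedged sketch of one standard way to estimate $\beta^*$ in a partially linear model: Robinson-style residual-on-residual regression, with the conditional means estimated by lasso under an assumed approximately sparse linear dependence on $Z_i$. This is a generic illustration, not the inference procedure proposed in the talk, and the function name and tuning choices are mine.

```python
import numpy as np
from sklearn.linear_model import Lasso, LassoCV

def partially_linear_lasso(X, Y, Z, alpha=0.1):
    """Residual-on-residual fit for Y = X beta + g(Z) + eps.

    Illustrative only: E[Y|Z] and E[X_j|Z] are estimated with
    cross-validated lasso on Z, i.e., assuming a sparse linear
    (or well-approximated) dependence on Z."""
    # residualize the outcome: Y - E_hat[Y|Z]
    rY = Y - LassoCV(cv=5).fit(Z, Y).predict(Z)
    # residualize each covariate: X_j - E_hat[X_j|Z]
    rX = np.column_stack([
        X[:, j] - LassoCV(cv=5).fit(Z, X[:, j]).predict(Z)
        for j in range(X.shape[1])
    ])
    # sparse regression of residuals on residuals recovers beta
    return Lasso(alpha=alpha).fit(rX, rY).coef_
```

The residualization removes the nuisance $g^*(Z_i)$ before the sparse linear step, which is why conditions on $E(X_{ij}|Z_i)$ and $E(Y_i|Z_i)$, rather than on $g^*$ alone, appear in the abstract.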

Series: Job Candidate Talk

We study the convergence rate of the least squares estimator (LSE) in a regression model with possibly heavy-tailed errors. Despite its importance in practical applications, theoretical understanding of this problem has been limited. We first show that from a worst-case perspective, the convergence rate of the LSE in a general non-parametric regression model is given by the maximum of the Gaussian regression rate and the noise rate induced by the errors. In the more difficult statistical model where the errors only have a second moment, we further show that the sizes of the 'localized envelopes' of the model give a sharp interpolation for the convergence rate of the LSE between the worst-case rate and the (optimal) parametric rate. These results indicate both certain positive and negative aspects of the LSE as an estimation procedure in a heavy-tailed regression setting. The key technical innovation is a new multiplier inequality that sharply controls the size of the multiplier empirical process associated with the LSE, which also finds applications in shape-restricted and sparse linear regression problems.
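As a small numerical companion (not from the paper), the toy Monte Carlo below estimates the empirical $L_2$ error of one shape-restricted LSE, isotonic regression, under Gaussian errors versus t(2.1) errors, which have little more than a second moment; the setup and settings are illustrative.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

def iso_lse_risk(n, df=np.inf, reps=100, seed=0):
    """Empirical L2(P_n) error of the isotonic LSE when the true monotone
    regression function is f0(x) = x on [0, 1] and the errors are t(df)
    (df = np.inf means Gaussian errors). Purely illustrative."""
    rng = np.random.default_rng(seed)
    risks = []
    for _ in range(reps):
        x = np.sort(rng.uniform(size=n))
        eps = rng.standard_normal(n) if np.isinf(df) else rng.standard_t(df, n)
        fhat = IsotonicRegression().fit_transform(x, x + eps)
        risks.append(np.mean((fhat - x) ** 2))
    return np.mean(risks)

for n in (100, 400, 1600):
    print(n, iso_lse_risk(n), iso_lse_risk(n, df=2.1))  # Gaussian vs heavy-tailed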

Series: Job Candidate Talk

Random effects models are commonly used to measure genetic variance-covariance matrices of quantitative phenotypic traits. The population eigenvalues of these matrices describe the evolutionary response to selection. However, they may be difficult to estimate from limited samples when the number of traits is large. In this talk, I will present several results describing the eigenvalues of classical MANOVA estimators of these matrices, including dispersion of the bulk eigenvalue distribution, bias and aliasing of large "spike" eigenvalues, and distributional limits of eigenvalues at the spectral edges. I will then discuss a new procedure that uses these results to obtain better estimates of the large population eigenvalues when there are many traits, and a Tracy–Widom test for detecting true principal components in these models. The theoretical results extend proof techniques in random matrix theory and free probability, which I will also briefly describe. This is joint work with Iain Johnstone, Yi Sun, Mark Blows, and Emma Hine.
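For readers who want to experiment, here is a toy simulation (my sketch, not the talk's procedure) of the classical MANOVA estimator in a balanced one-way random effects model; comparing its top sample eigenvalues with the single large population eigenvalue planted in Sigma_u gives a feel for the spike bias discussed above.

```python
import numpy as np

def manova_sigma_u(Y):
    """MANOVA estimator of the group-level covariance in the balanced
    one-way random effects model y_ij = u_i + e_ij.
    Y has shape (I, J, p): I groups, J replicates, p traits."""
    I, J, p = Y.shape
    gm = Y.mean(axis=(0, 1))                          # grand mean
    m = Y.mean(axis=1)                                # group means, (I, p)
    d = m - gm
    ms_between = J * d.T @ d / (I - 1)                # between-group mean squares
    r = Y - m[:, None, :]
    ms_within = np.einsum('ijk,ijl->kl', r, r) / (I * (J - 1))
    return (ms_between - ms_within) / J               # unbiased for Sigma_u

# toy example: many traits, one true "spike" eigenvalue in Sigma_u
rng = np.random.default_rng(0)
I, J, p = 200, 2, 100
Sigma_u = np.eye(p) * 0.1
Sigma_u[0, 0] = 5.0                                   # one large population eigenvalue
u = rng.multivariate_normal(np.zeros(p), Sigma_u, size=I)
Y = u[:, None, :] + rng.standard_normal((I, J, p))    # unit residual variance
print(np.sort(np.linalg.eigvalsh(manova_sigma_u(Y)))[-3:])  # top sample eigenvalues
```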

Series: Job Candidate Talk

We consider the problem of estimating pairwise comparison probabilities in a tournament setting after observing every pair of teams play each other once. We assume the true pairwise probability matrix satisfies a stochastic transitivity condition, which is popular in the social sciences. This stochastic transitivity condition generalizes the ubiquitous Bradley–Terry model used in the ranking literature. We propose a computationally efficient estimator for this problem, borrowing ideas from recent work on shape-constrained regression. We show that the worst-case rate of our estimator matches the best known rate for computationally tractable estimators. Additionally, we show our estimator enjoys faster rates of convergence for several sub-parameter spaces of interest, thereby exhibiting automatic adaptivity. We also study the missing data setting where only a fraction of all possible games are observed at random.
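As background for the model being generalized, below is a minimal sketch of fitting the Bradley–Terry special case with Hunter's MM updates; the talk's shape-constrained estimator is different and more general, and the setup here (each pair plays exactly once, every team has at least one win) is only illustrative.

```python
import numpy as np

def bradley_terry(W, n_iter=100):
    """Hunter's MM algorithm for Bradley-Terry scores.

    W[i, j] = number of wins of team i over team j; with each pair playing
    once, W + W.T is all-ones off the diagonal. Assumes every team has at
    least one win, so the scores stay strictly positive."""
    n = W.shape[0]
    N = W + W.T                          # games played between each pair
    theta = np.ones(n)
    for _ in range(n_iter):
        denom = N / (theta[:, None] + theta[None, :])
        np.fill_diagonal(denom, 0.0)
        theta = W.sum(axis=1) / denom.sum(axis=1)   # MM update
        theta /= theta.sum()             # fix the scale of the scores
    # estimated pairwise probabilities P[i, j] = P(i beats j)
    return theta[:, None] / (theta[:, None] + theta[None, :])
```

Under stochastic transitivity the probability matrix need not have this rank-style parametric form, which is why the estimator in the talk instead borrows from shape-constrained regression.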

Series: Job Candidate Talk

In this talk, I will discuss some examples of sparse signal detection problems in the context of binary outcomes. These will be motivated by examples from next-generation sequencing association studies, understanding heterogeneities in large-scale networks, and exploring opinion distributions over networks. Moreover, these examples will serve as templates to explore interesting phase transitions present in such studies. In particular, these phase transitions will be aimed at revealing a difference between studies with possibly dependent binary outcomes and Gaussian outcomes. The theoretical developments will be further complemented with numerical results.

Series: Job Candidate Talk

Biology is becoming increasingly quantitative, with large genomic datasets being curated at a rapid rate. Sound mathematical modeling as well as data science approaches are both needed to take advantage of these newly available datasets. I will describe two projects that span these approaches. The first is a Markov chain model of natural selection acting at two scales, motivated by the virulence-transmission tradeoff from pathogen evolution. This stochastic model, under a natural scaling, converges to a nonlinear deterministic system for which we can analytically derive steady-state behavior. This analysis, along with simulations, leads to general properties of selection at two scales. The second project is a bioinformatics pipeline that identifies gene copy number variants, currently a difficult problem in modern genomics. This quantification of copy number variation in turn generates new mathematical questions that require the type of probabilistic modeling used in the first project.

Series: Job Candidate Talk

High-dimensional data arise in many fields of contemporary science and introduce new challenges in statistical learning due to the well-known curse of dimensionality. Many data sets in image analysis and signal processing are in a high-dimensional space but exhibit a low-dimensional structure. We are interested in building efficient representations of these data for the purposes of compression and inference, and in giving performance guarantees that are cursed only by the intrinsic dimension of the data. Specifically, in the setting where a data set in $R^D$ consists of samples from a probability measure concentrated on or near an unknown $d$-dimensional manifold with $d$ much smaller than $D$, we consider two sets of problems: low-dimensional geometric approximation to the manifold and regression of a function on the manifold. In the first case we construct multiscale low-dimensional empirical approximations to the manifold and give finite-sample performance guarantees. In the second case we exploit these empirical geometric approximations of the manifold to construct multiscale approximations to the function. We prove finite-sample guarantees showing that we attain the same learning rates as if the function were defined on a Euclidean domain of dimension $d$. In both cases our approximations can adapt to the regularity of the manifold or the function even when this varies at different scales or locations. All algorithms have complexity $C n\log (n)$, where $n$ is the number of samples and the constant $C$ is linear in $D$ and exponential in $d$.
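To give a flavor of the geometric approximation step, here is a single-scale toy version (my sketch with illustrative parameters, not the talk's multiscale construction): partition the samples into cells and project each cell onto its top-$d$ principal subspace, yielding a piecewise-linear approximation of the manifold.

```python
import numpy as np
from sklearn.cluster import KMeans

def local_pca_approximation(X, d, n_cells):
    """Piecewise-linear approximation of data near a d-dimensional manifold:
    split the samples into cells, then project each cell onto its top-d
    principal subspace. Single-scale toy version of a multiscale scheme."""
    labels = KMeans(n_clusters=n_cells, n_init=10).fit_predict(X)
    X_hat = np.empty_like(X)
    for c in range(n_cells):
        P = X[labels == c]
        mu = P.mean(axis=0)
        # top-d principal directions of the cell
        _, _, Vt = np.linalg.svd(P - mu, full_matrices=False)
        V = Vt[:d].T
        X_hat[labels == c] = mu + (P - mu) @ V @ V.T
    return X_hat

# example: noisy circle (intrinsic dimension d = 1) embedded in R^10
rng = np.random.default_rng(0)
t = rng.uniform(0, 2 * np.pi, 2000)
X = np.zeros((2000, 10))
X[:, 0], X[:, 1] = np.cos(t), np.sin(t)
X += 0.01 * rng.standard_normal(X.shape)
X_hat = local_pca_approximation(X, d=1, n_cells=20)
print(np.mean(np.linalg.norm(X - X_hat, axis=1)))  # mean approximation error
```

Refining the partition while keeping $d$ fixed shrinks the approximation error at a rate governed by the intrinsic dimension $d$, not the ambient dimension $D$, which is the sense in which the guarantees above escape the curse of dimensionality.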