Seminars and Colloquia by Series

Sampling Approximately Low-Rank Ising Models: MCMC meets Variational Methods

Series
Applied and Computational Mathematics Seminar
Time
Monday, April 18, 2022 - 14:00 for 1 hour (actually 50 minutes)
Location
Hybrid: Skiles 005 and https://gatech.zoom.us/j/96551543941
Speaker
Holden Lee, Duke University

MCMC and variational inference are two competing paradigms for the problem of sampling from a given probability distribution. In this talk, I'll show how they can work together to give the first polynomial-time sampling algorithm for approximately low-rank Ising models. Sampling was previously known when all eigenvalues of the interaction matrix fit in an interval of length 1; however, a single outlier can cause Glauber dynamics to mix torpidly. Our result covers the case when all but O(1) eigenvalues lie in an interval of length 1. To deal with positive eigenvalues, we use a temperature-based heuristic for MCMC called simulated tempering, while to deal with negative eigenvalues, we define a nonconvex variational problem over Ising models, solved using SGD. Our result has applications to sampling Hopfield networks with a fixed number of patterns, Bayesian clustering models with low-dimensional contexts, and antiferromagnetic/ferromagnetic Ising models on expander graphs.
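As background for the two MCMC ingredients mentioned above, here is a minimal numpy sketch of Glauber dynamics on an Ising model combined with a simulated-tempering level swap. It illustrates the generic mechanisms only, not the polynomial-time algorithm of the talk; the interaction matrix, temperature ladder, and swap schedule are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

# illustrative Ising model: random symmetric interaction matrix (not from the talk)
n = 50
J = rng.standard_normal((n, n)) / np.sqrt(n)
J = (J + J.T) / 2
betas = [0.2, 0.5, 1.0]                         # illustrative inverse-temperature ladder

def energy(x):
    return -0.5 * x @ J @ x

def glauber_step(x, beta):
    """One Glauber update: resample a uniformly random spin from its conditional law."""
    i = rng.integers(n)
    field = J[i] @ x - J[i, i] * x[i]           # local field, excluding self-interaction
    p_plus = 1.0 / (1.0 + np.exp(-2.0 * beta * field))
    x[i] = 1.0 if rng.random() < p_plus else -1.0
    return x

x = rng.choice([-1.0, 1.0], size=n)
level = 0
for t in range(20000):
    x = glauber_step(x, betas[level])
    if t % 100 == 0:                            # occasionally propose a temperature swap
        proposal = level + rng.choice([-1, 1])
        if 0 <= proposal < len(betas):
            # Metropolis ratio for changing the inverse temperature at a fixed state
            # (uniform level weights here; real tempering also estimates partition functions)
            accept = np.exp((betas[level] - betas[proposal]) * energy(x))
            if rng.random() < min(1.0, accept):
                level = proposal
print("final inverse temperature:", betas[level])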

Learning Operators with Coupled Attention

Series
Applied and Computational Mathematics Seminar
Time
Monday, April 11, 2022 - 14:00 for 1 hour (actually 50 minutes)
Location
https://gatech.zoom.us/j/96551543941
Speaker
Paris Perdikaris, University of Pennsylvania

Supervised operator learning is an emerging machine learning paradigm with applications to modeling the evolution of spatio-temporal dynamical systems and approximating general black-box relationships between functional data. We propose a novel operator learning method, LOCA (Learning Operators with Coupled Attention), motivated by the recent success of the attention mechanism. In our architecture, the input functions are mapped to a finite set of features which are then averaged with attention weights that depend on the output query locations. By coupling these attention weights together with an integral transform, LOCA is able to explicitly learn correlations in the target output functions, enabling us to approximate nonlinear operators even when the number of output function measurements in the training set is very small. Our formulation is accompanied by rigorous approximation theoretic guarantees on the universal expressiveness of the proposed model. Empirically, we evaluate the performance of LOCA on several operator learning scenarios involving systems governed by ordinary and partial differential equations, as well as a black-box climate prediction problem. Through these scenarios we demonstrate state-of-the-art accuracy, robustness with respect to noisy input data, and a consistently small spread of errors over testing data sets, even for out-of-distribution prediction tasks.
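As a rough, self-contained illustration of the mechanism described above (not the actual LOCA implementation), the sketch below maps an input function sampled at sensor locations to a finite set of feature vectors and averages them with attention weights that depend on the output query location; all weights, shapes, and the toy input are assumptions made for the example, and the integral-transform coupling is omitted.

import numpy as np

rng = np.random.default_rng(1)

m, d, k = 64, 32, 16                          # sensor points, feature width, number of features
W_enc = rng.standard_normal((m, k * d)) * 0.1 # illustrative (untrained) encoder weights
W_att = rng.standard_normal((1, k)) * 0.1     # maps the query location to attention scores
w_out = rng.standard_normal(d) * 0.1          # linear readout

def coupled_attention_forward(u_sensors, y_query):
    """u_sensors: input function sampled at m sensor locations; y_query: output query location."""
    # 1. map the input function to a finite set of k feature vectors in R^d
    v = (u_sensors @ W_enc).reshape(k, d)
    # 2. attention weights that depend on the output query location (softmax over the k features)
    scores = np.atleast_1d(y_query) @ W_att
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # 3. the prediction at y_query is the attention-weighted average of the features, then a
    #    linear readout (LOCA further couples the weights through an integral transform)
    return float(weights @ v @ w_out)

u = np.sin(np.linspace(0.0, np.pi, m))        # a toy input function
print(coupled_attention_forward(u, y_query=0.3))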
 

The Approximation Properties of Convex Hulls, Greedy Algorithms, and Applications to Neural Networks

Series
Applied and Computational Mathematics Seminar
Time
Monday, April 4, 2022 - 14:00 for 1 hour (actually 50 minutes)
Location
Hybrid: Skiles 005 and https://gatech.zoom.us/j/96551543941
Speaker
Jonathan Siegel, Penn State Mathematics Department

Given a collection of functions in a Banach space, typically called a dictionary in machine learning, we study the approximation properties of its convex hull. Specifically, we develop techniques for bounding the metric entropy and n-widths, which are fundamental quantities in approximation theory that control the limits of linear and non-linear approximation. Our results generalize existing methods by taking the smoothness of the dictionary into account, and in particular give sharp estimates for shallow neural networks. Consequences of these results include: the optimal approximation rates which can be attained for shallow neural networks, that shallow neural networks dramatically outperform linear methods of approximation, and indeed that shallow neural networks outperform all continuous methods of approximation on the associated convex hull. Next, we discuss greedy algorithms for constructing approximations by non-linear dictionary expansions. Specifically, we give sharp rates for the orthogonal greedy algorithm for dictionaries with small metric entropy, and for the pure greedy algorithm. Finally, we give numerical examples showing that greedy algorithms can be used to solve PDEs with shallow neural networks.
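For readers who have not seen it, the orthogonal greedy algorithm over a finite dictionary is the familiar orthogonal-matching-pursuit iteration: pick the dictionary element most correlated with the current residual, then re-project onto the span of everything selected so far. The sketch below uses a made-up random dictionary purely for illustration; the talk's results concern convergence rates of such algorithms for dictionaries with small metric entropy.

import numpy as np

def orthogonal_greedy(f, D, n_iter):
    """Approximate f by an n_iter-term expansion over the columns of the dictionary D.

    Each step selects the column most correlated with the current residual, then
    projects f onto the span of all selected columns (a least-squares solve)."""
    selected, residual, coeffs = [], f.copy(), None
    for _ in range(n_iter):
        scores = np.abs(D.T @ residual)
        scores[selected] = -np.inf                      # never reselect a column
        selected.append(int(np.argmax(scores)))
        coeffs, *_ = np.linalg.lstsq(D[:, selected], f, rcond=None)
        residual = f - D[:, selected] @ coeffs
    return selected, coeffs, residual

rng = np.random.default_rng(2)
D = rng.standard_normal((200, 500))
D /= np.linalg.norm(D, axis=0)                          # unit-norm dictionary elements
f = rng.standard_normal(200)
idx, c, r = orthogonal_greedy(f, D, n_iter=20)
print("residual norm after 20 greedy steps:", np.linalg.norm(r))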

How Differential Equation Insights Benefit Deep Learning

Series
Applied and Computational Mathematics Seminar
Time
Monday, March 28, 2022 - 14:00 for 1 hour (actually 50 minutes)
Location
https://gatech.zoom.us/j/96551543941 (note: Zoom, not Bluejeans)
Speaker
Prof. Bao Wang, University of Utah

We will present a new class of continuous-depth deep neural networks that were motivated by the ODE limit of the classical momentum method, named heavy-ball neural ODEs (HBNODEs). HBNODEs enjoy two properties that imply practical advantages over NODEs: (i) The adjoint state of an HBNODE also satisfies an HBNODE, accelerating both forward and backward ODE solvers, thus significantly accelerating learning and improving the utility of the trained models. (ii) The spectrum of HBNODEs is well structured, enabling effective learning of long-term dependencies from complex sequential data.
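For orientation, the heavy-ball ODE underlying HBNODEs can be written as a first-order system in a position/momentum pair; the sketch below integrates such a system with forward Euler for a toy vector field. The damping value, step count, and the stand-in for the learned network are assumptions for illustration; an actual HBNODE uses an adaptive solver and a trained network.

import numpy as np

def heavy_ball_ode_euler(f, h0, t0, t1, gamma=0.5, n_steps=200):
    """Integrate the heavy-ball ODE  h'' + gamma * h' = f(h, t), rewritten as the
    first-order system  h' = m,  m' = -gamma * m + f(h, t),  with forward Euler.
    (Illustrative only; gamma, the step count, and f are placeholder choices.)"""
    h = np.asarray(h0, dtype=float)
    m = np.zeros_like(h)
    dt, t = (t1 - t0) / n_steps, t0
    for _ in range(n_steps):
        h, m, t = h + dt * m, m + dt * (-gamma * m + f(h, t)), t + dt
    return h

# toy vector field standing in for a learned network f_theta(h, t)
f = lambda h, t: -h
print(heavy_ball_ode_euler(f, h0=[1.0, -2.0], t0=0.0, t1=5.0))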

Second, we will extend HBNODEs to graph learning by leveraging diffusion on graphs, resulting in new algorithms for deep graph learning. The new algorithms are more accurate than existing deep graph learning algorithms, more scalable to deep architectures, and also suitable for learning in low labeling-rate regimes. Moreover, we will present a fast multipole method-based efficient attention mechanism for modeling graph node interactions.

Third, if time permits, we will discuss proximal algorithms for accelerating the learning of continuous-depth neural networks.

Low-dimensional Modeling for Deep Learning

Series
Applied and Computational Mathematics Seminar
Time
Monday, March 14, 2022 - 14:00 for 1 hour (actually 50 minutes)
Location
https://gatech.zoom.us/j/96551543941
Speaker
Zhihui Zhu, University of Denver

In the past decade, the revival of deep neural networks has led to dramatic success in numerous applications ranging from computer vision to natural language processing to scientific discovery and beyond. Nevertheless, the practice of deep networks has been shrouded with mystery as our theoretical understanding of the success of deep learning remains elusive.

In this talk, we will exploit low-dimensional modeling to help understand and improve deep learning performance. We will first provide a geometric analysis for understanding neural collapse, an intriguing empirical phenomenon that persists across different neural network architectures and a variety of standard datasets. We will utilize our understanding of neural collapse to improve training efficiency. We will then exploit principled methods for dealing with sparsity and sparse corruptions to address the challenges of overfitting for modern deep networks in the presence of training data corruptions. We will introduce a principled approach for robustly training deep networks with noisy labels and robustly recovering natural images by deep image prior.
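As one concrete handle on "neural collapse" (the empirical phenomenon mentioned above, in which last-layer features of each class concentrate around their class mean), the sketch below computes a standard within-class versus between-class variability ratio on synthetic features; the synthetic data and this particular diagnostic are illustrative choices, not the analysis from the talk.

import numpy as np

def within_between_ratio(features, labels):
    """A simple neural-collapse diagnostic: the within-class scatter of last-layer
    features relative to the between-class scatter. Values near zero indicate
    that features have (nearly) collapsed onto their class means."""
    global_mean = features.mean(axis=0)
    sw = sb = 0.0
    for c in np.unique(labels):
        fc = features[labels == c]
        mu_c = fc.mean(axis=0)
        sw += ((fc - mu_c) ** 2).sum()
        sb += len(fc) * ((mu_c - global_mean) ** 2).sum()
    return sw / sb

rng = np.random.default_rng(3)
labels = rng.integers(0, 10, size=1000)
class_means = 5.0 * rng.standard_normal((10, 64))                    # synthetic class means
feats = class_means[labels] + 0.1 * rng.standard_normal((1000, 64))  # nearly collapsed features
print(within_between_ratio(feats, labels))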

Symmetry-preserving machine learning for computer vision, scientific computing, and distribution learning

Series
Applied and Computational Mathematics Seminar
Time
Monday, March 7, 2022 - 14:00 for 1 hour (actually 50 minutes)
Location
https://gatech.zoom.us/j/96551543941 (note: Zoom, not Bluejeans)
Speaker
Prof. Wei Zhu, UMass Amherst

Please Note: the talk will be hosted on Zoom, no longer on Bluejeans.

Symmetry is ubiquitous in machine learning and scientific computing. Robustly incorporating symmetry priors into the learning process has been shown to achieve significant model improvements for various learning tasks, especially in the small data regime.

In the first part of the talk, I will explain a principled framework of deformation-robust symmetry-preserving machine learning. The key idea is the spectral regularization of the (group) convolutional filters, which ensures that symmetry is robustly preserved in the model even if the symmetry transformation is “contaminated” by nuisance data deformation.
 
In the second part of the talk, I will demonstrate how to incorporate additional structural information (such as group symmetry) into generative adversarial networks (GANs) for data-efficient distribution learning. This is accomplished by developing new variational representations for divergences between probability measures with embedded structures. We study, both theoretically and empirically, the effect of structural priors in the two GAN players. The resulting structure-preserving GAN is able to achieve significantly improved sample fidelity and diversity—almost an order of magnitude measured in Fréchet Inception Distance—especially in the limited data regime. 
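A simple way to see what "embedding a group symmetry into the GAN players" can mean in practice is to symmetrize the discriminator by averaging it over the group, which makes its output exactly invariant; the sketch below does this for the four-fold rotation group acting on images. This is only an elementary illustration, not the variational construction developed in the talk, and the toy discriminator is a made-up stand-in.

import numpy as np

def c4_symmetrized(discriminator, image):
    """Average a discriminator over the C4 rotation group, making its output
    invariant to 90-degree rotations of the input image."""
    scores = [discriminator(np.rot90(image, k, axes=(0, 1))) for k in range(4)]
    return float(np.mean(scores))

# toy stand-in for a learned discriminator network (deliberately not rotation-invariant)
toy_disc = lambda img: float(np.tanh(img.mean() + img[0, 0]))

rng = np.random.default_rng(4)
x = rng.standard_normal((32, 32))
# the symmetrized scores agree (up to rounding), even though toy_disc itself is not invariant
print(c4_symmetrized(toy_disc, x), c4_symmetrized(toy_disc, np.rot90(x)))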
 

Neural Networks with Inputs Based on Domain of Dependence and A Converging Sequence for Solving Conservation Laws

Series
Applied and Computational Mathematics Seminar
Time
Monday, February 28, 2022 - 14:00 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/457724603/4379
Speaker
Haoxiang Huang, Georgia Tech

Recent research on solving partial differential equations with deep neural networks (DNNs) has demonstrated that spatiotemporal function approximators defined by auto-differentiation are effective for approximating nonlinear problems. However, it remains a challenge to resolve discontinuities in nonlinear conservation laws using forward methods with DNNs without beginning with part of the solution. In this study, we incorporate first-order numerical schemes into DNNs to set up the loss function approximator instead of auto-differentiation from traditional deep learning frameworks such as TensorFlow, thereby improving the effectiveness of capturing discontinuities in Riemann problems. We introduce a novel neural network method. A local low-cost solution is first used as the input of a neural network to predict the high-fidelity solution at a space-time location. The challenge lies in the fact that there is no way to distinguish a smeared discontinuity from a steep smooth solution in the input, thus resulting in “multiple predictions” of the neural network. To overcome the difficulty, two solutions of the conservation laws from a converging sequence, computed from low-cost numerical schemes and in a local domain of dependence of the space-time location, serve as the input. Despite smeared input solutions, the output provides sharp approximations to solutions containing shocks and contact surfaces, and the method is efficient to use once trained. It works not only for discontinuities, but also for smooth areas of the solution, implying broader applications for other differential equations.
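To make the idea of building the loss from a first-order numerical scheme (rather than from auto-differentiation) concrete, here is a minimal sketch for Burgers' equation using a Lax-Friedrichs update: the network's prediction at the next time level is penalized for disagreeing with the scheme applied to its prediction at the current level. The specific equation, scheme, grid, and function names are assumptions for illustration, not the setup used in the talk.

import numpy as np

def lax_friedrichs_step(u, dx, dt):
    """One first-order Lax-Friedrichs step for Burgers' equation u_t + (u^2/2)_x = 0
    with periodic boundary conditions."""
    f = 0.5 * u ** 2
    u_plus, u_minus = np.roll(u, -1), np.roll(u, 1)
    f_plus, f_minus = np.roll(f, -1), np.roll(f, 1)
    return 0.5 * (u_plus + u_minus) - dt / (2.0 * dx) * (f_plus - f_minus)

def scheme_based_loss(u_pred_now, u_pred_next, dx, dt):
    """Loss term built from the numerical scheme instead of auto-differentiation:
    the prediction at t + dt should agree with the scheme applied to the prediction at t."""
    return float(np.mean((u_pred_next - lax_friedrichs_step(u_pred_now, dx, dt)) ** 2))

nx, dx, dt = 200, 1.0 / 200, 0.002                    # CFL-stable illustrative grid
x = np.linspace(0.0, 1.0, nx, endpoint=False)
u0 = np.where(x < 0.5, 1.0, 0.0)                      # Riemann-type initial data
u1 = lax_friedrichs_step(u0, dx, dt)                  # pretend this is the network's prediction
print(scheme_based_loss(u0, u1, dx, dt))              # exactly zero for this "perfect" prediction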

Coarse-Graining of Stochastic Systems

Series
Applied and Computational Mathematics Seminar
Time
Monday, February 7, 2022 - 14:00 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/457724603/4379
Speaker
Prof. Xingjie "Helen" Li, UNC Charlotte


Efficient simulation of SDEs is essential in many applications, particularly for ergodic systems that demand efficient simulation of both short-time dynamics and large-time statistics. To achieve this efficiency, dimension reduction is often required in both space and time. In this talk, I will discuss our recent work on both spatial and temporal reductions.

For spatial dimension reduction, the Mori-Zwanzig formalism is applied to derive equations for the evolution of linear observables of the Langevin dynamics in both the overdamped and general cases.

For temporal dimension reduction, we introduce a framework to construct inference-based schemes adaptive to large time-steps (ISALT) from data, achieving a reduction in time by several orders of magnitude.

This is joint work with Dr. Thomas Hudson from the University of Warwick, UK; Dr. Fei Lu from Johns Hopkins University; and Dr. Xiaofeng Felix Ye from SUNY at Albany.
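To give a flavor of what an inference-based large-time-step scheme looks like, the caricature below fits, by least squares, a single correction coefficient to a forward-Euler step so that a large step best matches data subsampled from a finely resolved overdamped Langevin trajectory. The drift, noise level, subsampling gap, and one-parameter form are illustrative assumptions; the actual ISALT framework is considerably more general.

import numpy as np

rng = np.random.default_rng(5)
drift = lambda x: x - x ** 3                 # illustrative double-well overdamped Langevin drift
sigma, dt_fine, gap = 0.5, 1e-3, 100         # fine step and subsampling gap (large step = gap * dt_fine)

# generate a finely resolved trajectory with Euler-Maruyama
n_fine = 100_000
x = np.empty(n_fine)
x[0] = 0.0
for i in range(n_fine - 1):
    x[i + 1] = x[i] + dt_fine * drift(x[i]) + sigma * np.sqrt(dt_fine) * rng.standard_normal()

# subsample at the large time step and infer a scheme  X_{n+1} = X_n + c * Dt * drift(X_n) + noise
X = x[::gap]
Dt = gap * dt_fine
A = Dt * drift(X[:-1])
c = float(A @ (X[1:] - X[:-1]) / (A @ A))    # least-squares coefficient of the inferred scheme
print("inferred coefficient c:", c, "(plain forward Euler would use c = 1)")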

How to Break the Curse of Dimensionality

Series
Applied and Computational Mathematics Seminar
Time
Monday, January 31, 2022 - 14:00 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/457724603/4379
Speaker
Ming-Jun Lai, University of Georgia

We first review the problem of the curse of dimensionality when approximating multi-dimensional functions. Several approximation results from Barron, Petrushev, Bach, and others will be explained.

Then we present two approaches to break the curse of dimensionality: one is based on a probabilistic approach explained in Barron (1993), and the other is based on a deterministic approach using the Kolmogorov superposition theorem. As the Kolmogorov superposition theorem has been used to explain the approximation power of neural network computation, I will use it to explain why deep learning algorithms work for image classification.
In addition, I will introduce neural network approximation based on higher-order ReLU functions to explain the powerful approximation of multivariate functions using deep learning algorithms with multiple layers.
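For reference, the Kolmogorov superposition theorem mentioned above states that every continuous function on the n-dimensional cube can be written using only univariate continuous functions and addition:

f(x_1, \dots, x_n) \;=\; \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right),

where the inner functions \phi_{q,p} can be chosen once and for all, independently of f, and only the outer functions \Phi_q depend on f. This two-layer "inner sum, then outer function" structure is what connects the theorem to neural network approximation.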

Non-Parametric Estimation of Manifolds from Noisy Data

Series
Applied and Computational Mathematics Seminar
Time
Monday, December 6, 2021 - 14:00 for 1 hour (actually 50 minutes)
Location
https://bluejeans.com/457724603/4379
Speaker
Yariv Aizenbud, Yale University

A common task in many data-driven applications is to find a low dimensional manifold that describes the data accurately. Estimating a manifold from noisy samples has proven to be a challenging task. Indeed, even after decades of research, there is no (computationally tractable) algorithm that accurately estimates a manifold from noisy samples with a constant level of noise.
 
In this talk, we will present a method that estimates a manifold and its tangent in the ambient space. Moreover, we establish rigorous convergence rates, which are essentially as good as existing convergence rates for function estimation.
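A standard building block in this area, included here only for orientation (it is not the estimator from the talk), is local PCA: near a reference point, the top principal directions of the neighboring noisy samples approximate the tangent space, and the local mean gives a denoised point near the manifold. The noisy-circle data, neighborhood radius, and intrinsic dimension in the sketch below are illustrative assumptions.

import numpy as np

def local_pca_tangent(points, x0, radius, dim):
    """Estimate a point on the manifold and its tangent space near x0 from noisy samples:
    take the samples within `radius` of x0 and return their mean together with the top
    `dim` principal directions (an orthonormal basis approximating the tangent space)."""
    nbrs = points[np.linalg.norm(points - x0, axis=1) < radius]
    center = nbrs.mean(axis=0)
    _, _, vt = np.linalg.svd(nbrs - center, full_matrices=False)
    return center, vt[:dim]

rng = np.random.default_rng(6)
theta = rng.uniform(0.0, 2.0 * np.pi, 2000)
noisy_circle = np.c_[np.cos(theta), np.sin(theta)] + 0.05 * rng.standard_normal((2000, 2))
center, tangent = local_pca_tangent(noisy_circle, x0=np.array([1.0, 0.0]), radius=0.3, dim=1)
print("estimated point near the manifold:", center)   # roughly (1, 0)
print("estimated tangent direction:", tangent[0])     # roughly (0, +/-1)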
