Past Events

Adversarial training and the generalized Wasserstein barycenter problem

Data Science Seminar

Matt Jacobs (Purdue University)

Abstract

Adversarial training is a framework widely used by practitioners to enforce robustness of machine learning models. During the training process, the learner is pitted against an adversary who has the power to alter the input data. As a result, the learner is forced to build a model that is robust to data perturbations. Despite the importance and relative conceptual simplicity of adversarial training, there are many aspects that are still not well-understood (e.g. regularization effects, geometric/analytic interpretations, tradeoff between accuracy and robustness, etc...), particularly in the case of multiclass classification.

In this talk, I will show that in the non-parametric setting, the adversarial training problem is equivalent to a generalized version of the Wasserstein barycenter problem. The connection between these problems allows us to completely characterize the optimal adversarial strategy and to bring in tools from optimal transport to analyze and compute optimal classifiers. This also has implications for the parametric setting, as the value of the generalized barycenter problem gives a universal upper bound on the robustness/accuracy tradeoff inherent to adversarial training.

Joint work with Nicolas Garcia Trillos and Jakwang Kim

Overparametrization in machine learning: insights from linear models

Data Science Seminar

Andrea Montanari (Stanford University)

Abstract

Deep learning models are often trained in a regime that is forbidden by classical statistical learning theory. The model complexity can be larger than the sample size and the train error does not concentrate around the test error. In fact, the model complexity can be so large that the network interpolates noisy training data. Despite this, it behaves well on fresh test data, a phenomenon that has been dubbed `benign overfitting.'

I will review recent progress towards a precise quantitative understanding of this phenomenon in linear models and kernel regression. In particular, I will present a recent characterization of ridge regression in Hilbert spaces which provides a unified understanding on several earlier results.

[Based on joint work with Chen Cheng]

Meta-Analysis of Randomized Experiments: Applications to Heavy-Tailed Response Data

Industrial Problems Seminar

Dominique Perrault-Joncas (Amazon)

Abstract

A central obstacle in the objective assessment of treatment effect (TE) estimators in randomized control trials (RCTs) is the lack of ground truth (or validation set) to test their performance. In this paper, we propose a novel cross-validation-like methodology to address this challenge. The key insight of our procedure is that the noisy (but unbiased) difference-of-means estimate can be used as a ground truth “label" on a portion of the RCT, to test the performance of an estimator trained on the other portion. We combine this insight with an aggregation scheme, which borrows statistical strength across a large collection of RCTs, to present an end-to-end methodology for judging an estimator’s ability to recover the underlying treatment effect as well as produce an optimal treatment "roll out" policy. We evaluate our methodology across 699 RCTs implemented in the Amazon supply chain. In this heavy-tailed setting, our methodology suggests that procedures that aggressively downweight or truncate large values, while introducing bias, lower the variance enough to ensure that the treatment effect is more accurately estimated.

Lecture: Yuxin Chen

Data Science Seminar

Yuxin Chen (University of Pennsylvania)

Registration is required to access the Zoom webinar.

Title: Taming Nonconvexity in Tensor Completion: Fast Convergence and Uncertainty Quantification

Abstract: Recent years have witnessed a flurry of activity in solving statistical estimation and learning problems via nonconvex optimization. While conventional wisdom often takes a dim view of nonconvex optimization algorithms due to their susceptibility to spurious local minima, simple first-order optimization methods have been remarkably successful in practice. The theoretical footings, however, had been largely lacking until recently.

This talk explores the effectiveness of nonconvex optimization for noisy tensor completion --- the problem of reconstructing a low-CP-rank tensor from highly incomplete and randomly corrupted observations of its entries. While randomly initialized gradient descent suffers from a high-volatility issue in the sample-starved regime, we propose a two-stage nonconvex algorithm that is guaranteed to succeed, enabling linear convergence, minimal sample complexity and minimax statistical accuracy all at once. In addition, we characterize the distribution of this nonconvex estimator down to fine scales, which in turn allows one to construct entrywise confidence intervals for both the unseen tensor entries and the unknown tensor factors. Our findings reflect the important role of statistical models in enabling efficient and guaranteed nonconvex statistical learning.

Lecture: Roy Lederman

Data Science Seminar

Roy Lederman (Yale University)

Registration is required to access the Zoom webinar.

The Geometry of Molecular Conformations in Cryo-EM

Cryo-Electron Microscopy (cryo-EM) is an imaging technology that is revolutionizing structural biology. Cryo-electron microscopes produce many very noisy two-dimensional projection images of individual frozen molecules; unlike related methods, such as computed tomography (CT), the viewing direction of each particle image is unknown. The unknown directions and extreme noise make the determination of the structure of molecules challenging. While other methods for structure determination, such as x-ray crystallography and NMR, measure ensembles of molecules, cryo-electron microscopes produce images of individual particles. Therefore, cryo-EM could potentially be used to study mixtures of conformations of molecules. We will discuss a range of recent methods for analyzing the geometry of molecular conformations using cryo-EM data and some new issues that arise.

Math & Money: Career Paths in Financial Services

Industrial Problems Seminar

Margaret Holen (Princeton University)

Registration is required to access the Zoom webinar.

Abstract

We face financial choices every day. From buying a morning coffee, to an online shopping errand in the afternoon, we are asked “Cash, Credit or Debit?” and “Pay Now or Later?” We occasionally face bigger decisions, like whether to take out a car loan, or to open a retirement account, or to take out a life insurance policy.

The finance industry offers mathematicians a rich array of career opportunities. Many of those include working with new technologies, complex data sets, and novel algorithms. Whether or not you enter the industry, we all play roles as consumers and as citizens influencing regulations.
This talk will share an overview of the finance sector, core mathematical ideas important in it, and my career path through it. My goal is to inspire you make the most of your backgrounds to shape your financial futures and the future of this industry.

Lecture: Tamir Bendory

Data Science Seminar

Tamir Bendory (Tel Aviv University)

Registration is required to access the Zoom webinar.

Title: Multi-reference alignment: Representation theory perspective, sparsity, and projection-based algorithms

Abstract: Multi-reference alignment (MRA) is the problem of recovering a signal from its multiple noisy copies, each acted upon by a random group element. MRA is mainly motivated by single-particle cryo-electron microscopy (cryo-EM): a leading technology to reconstruct biological molecular structures. In this talk, I will analyze the second moment of the MRA and cryo-EM models. First, I will show that in both models the second moment determines the signal up to a set of unitary matrices, whose dimension is governed by the decomposition of the space of signals into irreducible representations of the group. Second, I will present sparsity conditions under which a signal can be recovered from the second moment, implying that the sample complexity is proportional to the square of the variance of the noise. If time permits, I will introduce a new computational framework for cryo-EM that combines a sparse representation of the molecule with projection-based techniques used for phase retrieval in X-ray crystallography.

Identifying and achieving career goals

Industrial Problems Seminar

Brittany Baker (The Hartford)

Registration is required to access the Zoom webinar.

Abstract

So you’re doing the grad school thing. What next? Do you stay in academia or take a different path? How do you identify what you’d be good at and what you want to do? Once you identify elements of your ideal career, how do you figure out what to do next? I will discuss my career path, the constants, challenges, and adjustments I’ve made along the way. I will also describe my current job as a Data Science in the Auto Claims department for The Hartford Insurance Group and highlight some skills that have helped me achieve my goals.

Lecture: Nadav Dym

Data Science Seminar

Nadav Dym (Technion-Israel Institute of Technology)

Registration is required to access the Zoom webinar.

Title:

Efficient Invariant Embeddings for Universal Equivariant Learning

Abstract:

In many machine learning tasks, the goal is to learn an unknown function which has some known group symmetries. Equivariant machine learning algorithms exploit this by devising architectures (=function spaces) which have these symmetries by construction. Examples include convolutional neural networks which respect the translation symmetry of images, and neural networks for graphs or sets which respect their permutation symmetries. More examples will be discussed…

A common theoretical requirement of an equivariant architecture is that it will be universal- meaning that it can approximate any continuous equivariant function. This question typically boils down to another theoretical question: assume that we have a group G acting on a set V, can we find a mapping f:V→R^m such that f is G invariant, and on the other hand f separates and two points in V which are not related by a G-symmetry? Such a mapping is essentially an injective embedding of the quotient space V/G into R^m, which can then be used to prove universality. We will review results showing that under very general assumptions such a mapping f exists, and the embedding dimension m can be taken to be 2dim(V)+1. We will show that in some cases (e.g., graphs) computing such an f can be very expensive, and will discuss our methodology for efficient computation of such f in other cases (e.g., sets). This methodology is a generalization of the algebraic geometry argument used for the well known proof of phase retrieval injectivity.

Based on work with Steven J. Gortler

An Overview of Open Problems in Autonomous Systems

Industrial Problems Seminar

Natalia Alexandrov (NASA Langley Research Center)

Registration is required to access the Zoom webinar.

Abstract

The motivating applications for this talk are the coming complex environments, such as urban air mobility, multi-agent multi-modal navigation through cluttered, GPS-denied environments, such as wooded areas or disaster areas, including buildings, and autonomous operations in future space missions.

Although humans will continue as active system participants, increasing system complexity will demand growing degree of machine autonomy. We can make good use of the developments in autonomous cars. However, the airspace environment is much less forgiving and presents special problems. It’s safety critical, time critical, and depends on certification. Question is: when can we trust an autonomous system in such environments? In this talk, I will give examples of open problems in the design and operation of autonomous systems and suggest where mathematical attention would be in order.