Past Events

Lecture: Tamir Bendory

Data Science Seminar

Tamir Bendory (Tel Aviv University)

Registration is required to access the Zoom webinar.

Title: Multi-reference alignment: Representation theory perspective, sparsity, and projection-based algorithms

Abstract: Multi-reference alignment (MRA) is the problem of recovering a signal from its multiple noisy copies, each acted upon by a random group element. MRA is mainly motivated by single-particle cryo-electron microscopy (cryo-EM): a leading technology to reconstruct biological molecular structures. In this talk, I will analyze the second moment of the MRA and cryo-EM models. First, I will show that in both models the second moment determines the signal up to a set of unitary matrices, whose dimension is governed by the decomposition of the space of signals into irreducible representations of the group. Second, I will present sparsity conditions under which a signal can be recovered from the second moment, implying that the sample complexity is proportional to the square of the variance of the noise. If time permits, I will introduce a new computational framework for cryo-EM that combines a sparse representation of the molecule with projection-based techniques used for phase retrieval in X-ray crystallography.
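As a rough illustration of the second-moment idea in the abstract above, the following sketch simulates the simplest special case of MRA: a one-dimensional signal acted on by uniformly random cyclic shifts. This choice of group, the function names `simulate_mra` and `estimate_power_spectrum`, and all parameter values are assumptions made for illustration, not the speaker's code. In this special case the second moment reduces to the power spectrum, which is invariant under shifts and fixes the signal only up to its Fourier phases.

```python
import numpy as np

def simulate_mra(signal, n_obs, sigma, rng):
    """Draw n_obs noisy copies of `signal`, each cyclically shifted at random."""
    L = signal.shape[0]
    shifts = rng.integers(0, L, size=n_obs)
    # Row i of idx selects np.roll(signal, shifts[i]).
    idx = (np.arange(L)[None, :] - shifts[:, None]) % L
    return signal[idx] + sigma * rng.standard_normal((n_obs, L))

def estimate_power_spectrum(obs, sigma):
    """Average |FFT|^2 over observations, then subtract the noise bias L * sigma^2."""
    L = obs.shape[1]
    return np.mean(np.abs(np.fft.fft(obs, axis=1)) ** 2, axis=0) - L * sigma**2

rng = np.random.default_rng(0)
x = rng.standard_normal(16)                      # hypothetical ground-truth signal
obs = simulate_mra(x, n_obs=20_000, sigma=1.0, rng=rng)

ps_true = np.abs(np.fft.fft(x)) ** 2             # shift-invariant second-order statistic
ps_est = estimate_power_spectrum(obs, sigma=1.0)
rel_err = np.linalg.norm(ps_est - ps_true) / np.linalg.norm(ps_true)
print(f"relative error of estimated power spectrum: {rel_err:.4f}")
```

The estimate never aligns the observations. Its accuracy depends only on how many copies are averaged relative to the noise level.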

 

Identifying and achieving career goals

Industrial Problems Seminar

Brittany Baker (The Hartford)

Registration is required to access the Zoom webinar.

Abstract

So you’re doing the grad school thing. What next? Do you stay in academia or take a different path? How do you identify what you’d be good at and what you want to do? Once you identify elements of your ideal career, how do you figure out what to do next? I will discuss my career path, the constants, challenges, and adjustments I’ve made along the way. I will also describe my current job as a Data Scientist in the Auto Claims department for The Hartford Insurance Group and highlight some skills that have helped me achieve my goals.
 

Lecture: Nadav Dym

Data Science Seminar

Nadav Dym (Technion-Israel Institute of Technology)

Registration is required to access the Zoom webinar.

Title:
Efficient Invariant Embeddings for Universal Equivariant Learning
 
Abstract:
In many machine learning tasks, the goal is to learn an unknown function which has some known group symmetries. Equivariant machine learning algorithms exploit this by devising architectures (=function spaces) which have these symmetries by construction. Examples include convolutional neural networks which respect the translation symmetry of images, and neural networks for graphs or sets which respect their permutation symmetries. More examples will be discussed…

A common theoretical requirement of an equivariant architecture is that it be universal, meaning that it can approximate any continuous equivariant function. This question typically boils down to another theoretical question: assume that we have a group G acting on a set V. Can we find a mapping f:V→R^m such that f is G-invariant and, on the other hand, f separates any two points in V which are not related by a G-symmetry? Such a mapping is essentially an injective embedding of the quotient space V/G into R^m, which can then be used to prove universality. We will review results showing that under very general assumptions such a mapping f exists, and the embedding dimension m can be taken to be 2dim(V)+1. We will show that in some cases (e.g., graphs) computing such an f can be very expensive, and will discuss our methodology for efficient computation of such f in other cases (e.g., sets). This methodology is a generalization of the algebraic geometry argument used for the well-known proof of phase retrieval injectivity.

Based on work with Steven J. Gortler
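To make the notion of an invariant, separating map f concrete, here is a minimal sketch for the easiest case: V = R^n with the permutation group acting by reordering coordinates. Sorting is a classical example of such a map. It is used here only for illustration and is not claimed to be the construction from the talk. The name `sort_embedding` is hypothetical.

```python
import numpy as np

def sort_embedding(x):
    """Map R^n -> R^n by sorting: invariant to permutations, injective on multisets."""
    return np.sort(np.asarray(x, dtype=float))

rng = np.random.default_rng(0)
x = rng.standard_normal(5)

x_perm = rng.permutation(x)   # same orbit: a reordering of x
y = x.copy()
y[0] += 1.0                   # different orbit: the sum of entries changes

same_orbit_agree = np.array_equal(sort_embedding(x), sort_embedding(x_perm))
different_orbit_agree = np.array_equal(sort_embedding(x), sort_embedding(y))
print(same_orbit_agree, different_orbit_agree)
```

Two inputs have the same sorted vector exactly when they are permutations of each other, so the map identifies the quotient V/G with a subset of R^n.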
 
 

An Overview of Open Problems in Autonomous Systems

Industrial Problems Seminar

Natalia Alexandrov (NASA Langley Research Center)

Registration is required to access the Zoom webinar.

Abstract

The motivating applications for this talk are the coming complex environments, such as urban air mobility, multi-agent multi-modal navigation through cluttered, GPS-denied environments, such as wooded areas or disaster areas, including buildings, and autonomous operations in future space missions.

Although humans will continue as active system participants, increasing system complexity will demand a growing degree of machine autonomy. We can make good use of the developments in autonomous cars. However, the airspace environment is much less forgiving and presents special problems. It’s safety critical, time critical, and depends on certification. The question is: when can we trust an autonomous system in such environments? In this talk, I will give examples of open problems in the design and operation of autonomous systems and suggest where mathematical attention would be in order.
 

Lecture: March Boedihardjo

Data Science Seminar

March Boedihardjo (ETH Zürich)

Registration is required to access the Zoom webinar.

Title: Spectral norm of random matrices
 
Abstract: Tropp's matrix concentration inequalities give sharp estimates for the spectral norm of many random matrices that arise in applications. However, even in the case when all the entries are i.i.d. standard Gaussian, the estimates are sharp only up to a log-dimension factor, and not up to a constant. I will present an estimate for sums of independent random matrices that is sharp in many cases including the case when all the entries are i.i.d. standard Gaussian. Joint work with Afonso Bandeira and Ramon van Handel.
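The gap described in the abstract above can be seen numerically. The sketch below draws an n-by-n matrix with i.i.d. standard Gaussian entries and compares its spectral norm, which is known to concentrate near 2√n, with a bound of the form √(2 v log(d1 + d2)) with v = n and d1 = d2 = n. That form is assumed here as a representative matrix concentration bound and serves only to illustrate the logarithmic factor. The function name and the choice n = 400 are illustrative.

```python
import numpy as np

def gaussian_spectral_norm(n, rng):
    """Spectral norm (largest singular value) of an n x n i.i.d. N(0, 1) matrix."""
    g = rng.standard_normal((n, n))
    return np.linalg.norm(g, ord=2)

rng = np.random.default_rng(0)
n = 400

# Normalize by sqrt(n): the ratio should be close to 2 for large n.
ratio = gaussian_spectral_norm(n, rng) / np.sqrt(n)

# Assumed representative bound with v = n and d1 + d2 = 2n, also divided by sqrt(n).
log_factor_bound = np.sqrt(2 * np.log(2 * n))

print(f"||G|| / sqrt(n)      = {ratio:.3f}")
print(f"log-type bound / sqrt(n) = {log_factor_bound:.3f}")
```

The normalized norm stays near 2 as n grows, while the logarithmic bound keeps increasing with the dimension.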
 
 

Lecture: Luke Jacobsen and Jeff Lande

Industrial Problems Seminar

Luke Jacobsen (Medtronic), Jeff Lande (Medtronic)

Title: Quantitative Careers in the Medical Device Industry

Abstract: We will give an overview of quantitative careers in the medical device industry, focusing on the role of the biostatistician in the Cardiac Rhythm Management (CRM) space. We will describe some CRM products and provide examples of work within clinical studies to demonstrate the safety and efficacy of these products, including the use of alternative data sources to help address relevant clinical questions.
 
 

Lecture: Mauro Maggioni

Data Science Seminar

Mauro Maggioni (Johns Hopkins University)

You may attend the talk either in person in Walter 402 or register via Zoom. Registration is required to access the Zoom webinar.

Title: Two estimation problems for dynamical systems: linear systems on graphs, and interacting particle systems

Abstract: We are interested in problems where certain key parameters of a dynamical system need to be estimated from observations of trajectories of the dynamical systems. In this talk I will discuss two problems of this type.

The first one is the following: suppose we have a linear dynamical system on a graph, represented by a matrix A. For example, A may be a random walk on the graph. Suppose we observe some entries of A, some entries of A^2, …, some entries of A^T, for some time T, and wish to estimate A. We are interested in the regime when the number of entries observed at each time is small relative to the total number of entries of A. When T=1 and A is low-rank, this is a matrix completion problem. When T>1, the problem is interesting also in the case when A is not low rank, as one may hope that sampling at multiple times can compensate for the small number of entries observed at each time. We develop conditions that ensure that this estimation problem is well-posed, introduce a procedure for estimating A by reducing the problem to the matrix completion of a low-rank structured block-Hankel matrix, obtain results that capture at least some of the trade-offs between sampling in space and time, and finally show that this estimator can be constructed by a fast algorithm that provably locally converges quadratically to A. We verify this numerically on a variety of examples. This is joint work with C. Kuemmerle and S. Tang.

The second problem is when the dynamical system is nonlinear, and models a set of interacting agents. These systems are ubiquitous in science, from modeling of particles in Physics to prey-predator and colony models in Biology, to opinion dynamics in social sciences. Oftentimes the laws of interactions between the agents are quite simple, for example they depend only on pairwise interactions, and only on pairwise distance in each interaction. We consider the following inference problem for a system of interacting particles or agents: given only observed trajectories of the agents in the system, can we learn what the laws of interactions are? We would like to do this without assuming any particular form for the interaction laws, i.e. they might be “any” function of pairwise distances. We discuss when this problem is well-posed, we construct estimators for the interaction kernels with provably good statistical and computational properties, and discuss extensions to second-order systems, more general interaction kernels, and stochastic systems. We measure empirically the performance of our techniques on various examples, which include extensions to agent systems with different types of agents, second-order systems, families of systems with parametric interaction kernels, and settings where the interaction kernels depend on unknown variables. We also conduct numerical experiments to test the large time behavior of these systems, especially in the cases where they exhibit emergent behavior. This is joint work with F. Lu, J. Feng, P. Martin, J. Miller, S. Tang and M. Zhong.
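For readers unfamiliar with such systems, the sketch below simulates one common first-order model in which each agent moves according to an average of pairwise terms weighted by a kernel of the pairwise distance. The specific equation, the kernel `phi`, the forward Euler scheme, and all parameter values are assumptions chosen for illustration. The abstract above does not state the model explicitly, and the sketch covers only the forward simulation that generates trajectories, not the inference of the kernel from them.

```python
import numpy as np

def velocities(x, phi):
    """Assumed dynamics: dx_i/dt = (1/N) * sum_j phi(|x_j - x_i|) * (x_j - x_i)."""
    diff = x[None, :, :] - x[:, None, :]       # diff[i, j] = x_j - x_i
    dist = np.linalg.norm(diff, axis=2)        # pairwise distances |x_j - x_i|
    return (phi(dist)[:, :, None] * diff).mean(axis=1)

def simulate(x0, phi, dt, n_steps):
    """Forward Euler integration; returns an array of shape (n_steps + 1, N, d)."""
    traj = [x0]
    for _ in range(n_steps):
        traj.append(traj[-1] + dt * velocities(traj[-1], phi))
    return np.stack(traj)

def spread(x):
    """Root-sum-square distance of the agents from their center of mass."""
    return np.linalg.norm(x - x.mean(axis=0))

def phi(r):
    # Hypothetical attractive interaction kernel, a function of distance only.
    return 1.0 / (1.0 + r**2)

rng = np.random.default_rng(0)
traj = simulate(rng.standard_normal((6, 2)), phi, dt=0.05, n_steps=200)
print(f"spread at start: {spread(traj[0]):.3f}, at end: {spread(traj[-1]):.3f}")
```

Because the pairwise terms are antisymmetric, the center of mass stays fixed, and with this attractive kernel the agents contract toward it, a simple instance of emergent consensus.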

Optimal shrinkage of singular values under noise with separable covariance & its application to fetal ECG analysis

Data Science Seminar

Pei-Chun Su (Duke University)

You may attend the talk either in person in Walter 402 or register via Zoom. Registration is required to access the Zoom webinar.

Abstract

High-dimensional noisy datasets are commonly encountered in many scientific fields, and a critical step in data analysis is denoising. Under the white noise assumption, optimal shrinkage has been well-developed and widely applied to many problems. However, in practice, noise is usually colored and dependent, and the algorithm needs modification. We introduce a novel fully data-driven optimal shrinkage algorithm when the noise satisfies the separable covariance structure. The novelty involves a precise rank estimation and an accurate imputation strategy. In addition to showing theoretical support under the random matrix framework, we show the performance of our algorithm in simulated datasets and apply the algorithm to extract fetal electrocardiogram from the benchmark trans-abdominal maternal electrocardiogram, which is a special single-channel blind source separation challenge.
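As background for the white-noise case mentioned in the abstract above, the following sketch applies a singular value shrinker of the kind used for Frobenius loss when the noise entries are i.i.d. with standard deviation 1/√n. It is a baseline under that assumption only and is not the colored-noise algorithm of the talk. The function names, matrix sizes, and signal strengths are illustrative.

```python
import numpy as np

def shrink_frobenius(s, beta):
    """White-noise shrinker for Frobenius loss, assuming noise std 1/sqrt(n), beta = m/n."""
    s = np.asarray(s, dtype=float)
    out = np.zeros_like(s)
    keep = s > 1.0 + np.sqrt(beta)              # values above the noise bulk edge
    out[keep] = np.sqrt((s[keep] ** 2 - beta - 1.0) ** 2 - 4.0 * beta) / s[keep]
    return out

def denoise(Y):
    """Shrink the singular values of an m x n matrix Y (assumes m <= n)."""
    m, n = Y.shape
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return (U * shrink_frobenius(s, m / n)) @ Vt

rng = np.random.default_rng(0)
m, n = 100, 200

# Hypothetical rank-2 signal with singular values 3 and 2.
U0, _ = np.linalg.qr(rng.standard_normal((m, 2)))
V0, _ = np.linalg.qr(rng.standard_normal((n, 2)))
X = (U0 * np.array([3.0, 2.0])) @ V0.T

Y = X + rng.standard_normal((m, n)) / np.sqrt(n)   # white noise, std 1/sqrt(n)

err_raw = np.linalg.norm(Y - X)
err_shrunk = np.linalg.norm(denoise(Y) - X)
print(f"Frobenius error before: {err_raw:.2f}, after shrinkage: {err_shrunk:.2f}")
```

Singular values inside the noise bulk are set to zero and the remaining ones are pulled toward zero, which is why the reconstruction error drops well below that of the raw observation.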

Data Science to Software Engineering and Back Again

Industrial Problems Seminar

Cora Brown (Bridge Financial Technology)

You may attend the talk either in person in Walter 402 or register via Zoom. Registration is required to access the Zoom webinar.

Abstract

In this talk I will discuss my early career as a Data Scientist and Software Engineer. The skills necessary for these two types of roles overlap and complement each other. Drawing on my experiences in both fields, I will share some of the skills I’ve found valuable in each position and why I’ve chosen to follow this path. I will focus on the ways in which developing solid software skills has made me a better Data Scientist. Finally, I will describe some of the specific problems I’ve worked on as a Data Scientist and Software Engineer and how a background in mathematics can aid in solving these problems.

 

Equivariant machine learning

Data Science Seminar

Soledad Villar (Johns Hopkins University)

Abstract

In this talk we will give an overview of the enormous progress in the last few years, by several research groups, in designing machine learning methods that respect the fundamental symmetries and coordinate freedoms of physical law. Some of these frameworks make use of irreducible representations, some make use of high-order tensor objects, and some apply symmetry-enforcing constraints. Different physical laws obey different combinations of fundamental symmetries, but a large fraction (possibly all) of classical physics is equivariant to translation, rotation, reflection (parity), boost (relativity), units scalings, and permutations. We show that it is simple to parameterize universally approximating polynomial functions that are equivariant under these symmetries, or under the Euclidean, Lorentz, and Poincare groups, at any dimensionality d. The key observation is that nonlinear O(d)-equivariant (and related-group-equivariant) functions can be universally expressed in terms of a lightweight collection of (dimensionless) scalars -- scalar products and scalar contractions of the scalar, vector, and tensor inputs. We complement our theory with numerical examples that show that the scalar-based method is simple, efficient, and scalable, and mention ongoing work on cosmology simulations.
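The key observation in the abstract above can be checked numerically in a small case. The sketch below builds a vector-valued function of several input vectors whose coefficients depend only on scalar products of the inputs, then verifies that it commutes with a random orthogonal transformation. The particular coefficient function `coeff_fn` and the names used are hypothetical. Any function of the Gram matrix would serve the same purpose.

```python
import numpy as np

def scalar_based_map(vectors, coeff_fn):
    """f(v_1, ..., v_n) = sum_i c_i(G) * v_i, with G the Gram matrix of scalar products."""
    gram = vectors @ vectors.T       # (n, n) matrix of O(d)-invariant scalars
    coeffs = coeff_fn(gram)          # (n,) coefficients, functions of scalars only
    return coeffs @ vectors          # (d,) output vector

def coeff_fn(gram):
    # Hypothetical nonlinear function of the invariant scalars.
    return np.tanh(gram.sum(axis=1)) + np.diag(gram)

rng = np.random.default_rng(0)
d, n = 3, 4
V = rng.standard_normal((n, d))                    # rows are the input vectors v_i
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))   # random orthogonal matrix in O(d)

lhs = scalar_based_map(V @ Q.T, coeff_fn)          # f(Q v_1, ..., Q v_n)
rhs = Q @ scalar_based_map(V, coeff_fn)            # Q f(v_1, ..., v_n)
print(np.allclose(lhs, rhs))
```

Equivariance holds by construction because the Gram matrix is unchanged when every input is transformed by the same orthogonal matrix, so no symmetry constraint has to be imposed during training.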