Past Events

Normalization effects and mean field theory for deep neural networks

Data Science Seminar

Konstantinos Spiliopoulos (Boston University)

Abstract

We study the effect of normalization on the layers of deep neural networks. A given layer $i$ with $N_{i}$ hidden units is allowed to be normalized by $1/N_{i}^{\gamma_{i}}$ with $\gamma_{i}\in[1/2,1]$, and we study the effect of the choice of the $\gamma_{i}$ on the statistical behavior of the neural network's output (such as its variance) as well as on the test accuracy on the MNIST and CIFAR10 data sets. We find that, in terms of the variance of the network's output and of test accuracy, the best choice is to set the $\gamma_{i}$'s equal to one, which is the mean-field scaling. We also find that this is particularly true for the outer layer, in that the network's behavior is more sensitive to the scaling of the outer layer than to that of the inner layers. The mathematical analysis relies on an asymptotic expansion of the neural network's output and a corresponding mean-field analysis. An important practical consequence of the analysis is that it provides a systematic and mathematically informed way to choose the learning-rate hyperparameters. Such a choice guarantees that the neural network behaves in a statistically robust way as the $N_i$'s grow to infinity.
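As a rough illustration of the scaling described above (not code from the paper), the following numpy sketch normalizes each layer $i$ by $1/N_{i}^{\gamma_{i}}$ and compares the output of a randomly initialized network under the $1/\sqrt{N}$ and mean-field scalings; the architecture, widths, and tanh activation are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(h, W, b, gamma):
    """Hidden layer with N_i units whose output is normalized by 1 / N_i**gamma_i."""
    N_i = W.shape[0]
    return np.tanh(W @ h + b) / N_i**gamma

# Illustrative two-hidden-layer network at initialization.
# gamma = 1 is the mean-field scaling, gamma = 1/2 the usual 1/sqrt(N) scaling.
d, N1, N2 = 10, 1000, 1000
x = rng.standard_normal(d)
W1, b1 = rng.standard_normal((N1, d)), rng.standard_normal(N1)
W2, b2 = rng.standard_normal((N2, N1)), rng.standard_normal(N2)
c = rng.standard_normal(N2)

for g1, g2 in [(0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]:
    h1 = layer(x, W1, b1, g1)
    h2 = layer(h1, W2, b2, g2)
    print((g1, g2), float(c @ h2))   # output magnitude depends strongly on the gammas
```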

Language and graph foundational models: Distillation and pretraining

Industrial Problems Seminar

Vasileios Ioannidis (Amazon Search AI)

Please note the 10:10am start time.

Abstract

Graph neural networks (GNNs) learn from complex graph data and have been remarkably successful in various applications and across industries. This presentation first introduces GNNs via the message passing framework and dives into popular GNN variants. Next, it explores the fusion of textual data with heterogeneous graph structures to improve semantic and behavioral representations. It introduces the Language Model GNN (LM-GNN), a framework that efficiently combines large language models and GNNs through fine-tuning. LM-GNN supports various tasks, such as node classification and link prediction, and its effectiveness is demonstrated on them. Another aspect addressed is the challenge of effective node representation learning in textual graphs. The Graph-Aware Distillation (Grad) framework is proposed, which encodes graph structures into a Language Model (LM) to enable fast and scalable inference. Grad jointly optimizes a GNN and a graphless student model, resulting in superior performance in node classification tasks. Finally, the presentation discusses pre-training text and graph models on large, heterogeneous graphs with textual data using the Graph-Aware Language Model Pre-Training (GALM) framework. It highlights the framework's effectiveness through experiments on real datasets.
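The message passing framework referred to above boils down to "aggregate neighbor features, then update each node." The numpy sketch below shows one minimal mean-aggregation layer; it is a generic illustration under assumed shapes and names, not the LM-GNN, Grad, or GALM implementations.

```python
import numpy as np

def message_passing_layer(H, A, W_self, W_neigh):
    """One mean-aggregation message-passing step.

    H        : (num_nodes, d_in) node features
    A        : (num_nodes, num_nodes) adjacency matrix (0/1)
    W_self   : (d_in, d_out) transform of each node's own features
    W_neigh  : (d_in, d_out) transform of the aggregated neighbor messages
    """
    deg = np.maximum(A.sum(axis=1, keepdims=True), 1)            # avoid division by zero
    neigh_mean = (A @ H) / deg                                   # aggregate: mean over neighbors
    return np.maximum(H @ W_self + neigh_mean @ W_neigh, 0.0)    # update + ReLU

# Toy example: 4 nodes on a path graph, 3-dimensional features.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H = rng.standard_normal((4, 3))
H1 = message_passing_layer(H, A, rng.standard_normal((3, 8)), rng.standard_normal((3, 8)))
print(H1.shape)   # (4, 8)
```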

Autoencoders for time series anomaly detection

Industrial Problems Seminar 

Parker Williams (Rivian Automotive)

Abstract

Autoencoders are a type of neural network designed to learn efficient encodings of data, typically for purposes of unsupervised data compression. I will outline a process for leveraging autoencoders for unsupervised anomaly detection, which has become an essential tool in edge-based system health monitoring. I will begin with a naive implementation and motivate an autoencoder variation from an anomaly detection perspective. We will then go through a few examples and implementation challenges encountered in the wild. We will end with broader observations on when this methodology can be effective and lessons learned from an organizational and software engineering perspective.
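For readers unfamiliar with the basic recipe, a naive version of autoencoder-based anomaly detection is: train the autoencoder on normal data, then flag points whose reconstruction error is unusually large. The sketch below uses a tiny linear autoencoder trained by plain gradient descent on synthetic data; the model, learning rate, and threshold are illustrative assumptions, not the speaker's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_autoencoder(X, k=2, lr=0.05, epochs=2000):
    """Tiny linear autoencoder trained by gradient descent on 'normal' data."""
    n, d = X.shape
    We = 0.1 * rng.standard_normal((d, k))   # encoder
    Wd = 0.1 * rng.standard_normal((k, d))   # decoder
    for _ in range(epochs):
        Z = X @ We
        err = Z @ Wd - X
        # gradients of (1/2) * mean squared reconstruction error
        We -= lr * (X.T @ (err @ Wd.T)) / n
        Wd -= lr * (Z.T @ err) / n
    return We, Wd

def anomaly_scores(X, We, Wd):
    return np.mean((X - X @ We @ Wd) ** 2, axis=1)

# Normal data lives near a 2-D subspace of R^5; the appended point does not.
X_train = rng.standard_normal((500, 2)) @ rng.standard_normal((2, 5))
We, Wd = train_autoencoder(X_train)
X_test = np.vstack([X_train[:3], 5 * rng.standard_normal((1, 5))])
threshold = np.quantile(anomaly_scores(X_train, We, Wd), 0.99)
print(anomaly_scores(X_test, We, Wd), threshold)   # the last score should exceed the threshold
```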

Data Driven Modeling of Unknown Systems with Deep Neural Networks

Data Science Seminar

Dongbin Xiu (The Ohio State University)

Abstract

We present a framework for predictive modeling of unknown systems from measurement data. The method is designed to discover/approximate the unknown evolution operator, i.e., the flow map, behind the data. A deep neural network (DNN) is employed to construct such an approximation. Once an accurate DNN model for the evolution operator is constructed, it serves as a predictive model for the unknown system and enables us to conduct system analysis. We demonstrate that the flow map learning (FML) approach is applicable to modeling a wide class of problems, including dynamical systems, systems with missing variables and hidden parameters, as well as partial differential equations (PDEs).
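A minimal sketch of the flow-map-learning idea, under assumptions of my own (a toy damped oscillator as the "unknown" system, scikit-learn's MLPRegressor standing in for the DNN, a residual one-step map): build pairs $(x_n, x_{n+1})$ from trajectory data, fit the one-step map, then iterate the learned map as a predictive model.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def true_step(x, dt=0.05):
    # "unknown" dynamics: dx1/dt = x2, dx2/dt = -x1 - 0.1*x2 (forward Euler step)
    return np.array([x[0] + dt * x[1], x[1] + dt * (-x[0] - 0.1 * x[1])])

rng = np.random.default_rng(0)

# Build (x_n, x_{n+1}) training pairs from short bursts of trajectory data.
X, Y = [], []
for _ in range(200):
    x = rng.uniform(-2, 2, size=2)
    for _ in range(10):
        x_next = true_step(x)
        X.append(x); Y.append(x_next)
        x = x_next
X, Y = np.array(X), np.array(Y)

# The network learns the residual x_{n+1} - x_n, i.e. a ResNet-like one-step map.
net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=3000, random_state=0)
net.fit(X, Y - X)

# Recursive prediction from a new initial condition.
x_pred = x_true = np.array([1.5, 0.0])
for _ in range(100):
    x_pred = x_pred + net.predict(x_pred[None, :])[0]
    x_true = true_step(x_true)
print(np.linalg.norm(x_pred - x_true))   # error of the learned predictive model
```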

The Impact of Linear Constraints in Mean-Variance Optimization

Industrial Problems Seminar 

Christopher Bemis (X Cubed Capital Management)

Abstract

We study the effect linear constraints have on risk in the context of mean-variance optimization (MVO). Jagannathan and Ma (2003) establish an equivalence between certain constrained and unconstrained MVO problems via a modification of the covariance matrix. We extend their results to arbitrary linear constraints and provide alternative interpretations of how constraints affect the input parameters of the problems at hand and of why ex-post performance improves in the constrained setting. In addition, we present a signal-modification strategy similar in approach to that of Black-Litterman.
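To fix notation, the constrained MVO problem in question is a quadratic program with linear constraints. The sketch below solves the equality-constrained case via its KKT system and compares the unconstrained and fully-invested solutions; the covariance, expected returns, and constraint are purely illustrative, and the Jagannathan-Ma covariance modification itself is not reproduced here.

```python
import numpy as np

def mvo_with_linear_constraints(Sigma, mu, A, b, lam=1.0):
    """Solve min_w (1/2) w' Sigma w - lam * mu' w  subject to  A w = b."""
    n, m = len(mu), len(b)
    # KKT system for the equality-constrained quadratic program:
    # [ Sigma  A' ] [ w  ]   [ lam * mu ]
    # [ A      0  ] [ nu ] = [ b        ]
    K = np.block([[Sigma, A.T], [A, np.zeros((m, m))]])
    rhs = np.concatenate([lam * mu, b])
    return np.linalg.solve(K, rhs)[:n]          # portfolio weights w

rng = np.random.default_rng(0)
n = 5
G = rng.standard_normal((n, n))
Sigma = G @ G.T / n + 0.1 * np.eye(n)           # a positive-definite covariance
mu = rng.normal(0.05, 0.02, size=n)             # expected returns

# Unconstrained solution vs. fully-invested solution (weights sum to one).
w_unc = np.linalg.solve(Sigma, mu)
w_con = mvo_with_linear_constraints(Sigma, mu, A=np.ones((1, n)), b=np.array([1.0]))
print(w_unc.round(3), w_con.round(3), w_con.sum())
```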

Trading off accuracy for reduced computation in scientific computing

Data Science Seminar

Alex Gittens (Rensselaer Polytechnic Institute)

Abstract

Classical linear algebraic algorithms guarantee high accuracy in exchange for high computational cost. These costs can be infeasible in modern applications, so over the last two decades, randomized algorithms have been developed that allow a user-specified trade-off between accuracy and computational efficiency when dealing with massive data sets. The intuition is that when dealing with an excess of structured data (e.g., a large matrix which has low numerical rank), one can toss away a large portion of this data, thereby reducing the computational load, without introducing much additional error into the computation. In this talk we look at the design and performance analysis of several numerical linear algebra and machine learning algorithms based upon this principle, including linear solvers, approximate kernel machines, and low-rank tensor decompositions.
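A standard instance of this principle is the randomized range-finder: sketch the matrix with a random test matrix, orthonormalize, and do the expensive factorization only on the small sketched problem. The numpy sketch below is a generic Halko-Martinsson-Tropp-style example with illustrative sizes, not code from the talk.

```python
import numpy as np

def randomized_low_rank(A, k, oversample=10, rng=None):
    """Randomized range-finder followed by an SVD of the small sketched matrix."""
    rng = np.random.default_rng() if rng is None else rng
    m, n = A.shape
    Omega = rng.standard_normal((n, k + oversample))   # random test matrix
    Q, _ = np.linalg.qr(A @ Omega)                     # approximate range of A
    B = Q.T @ A                                        # small (k + oversample) x n matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ Ub)[:, :k], s[:k], Vt[:k]              # rank-k approximation factors

# A matrix with rapidly decaying spectrum: most of it can be tossed away
# without introducing much additional error.
rng = np.random.default_rng(0)
m, n, r = 2000, 500, 20
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n)) + 1e-3 * rng.standard_normal((m, n))
U, s, Vt = randomized_low_rank(A, k=r, rng=rng)
err = np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A)
print(f"relative error of rank-{r} randomized approximation: {err:.2e}")
```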

Computational mean-field games: from conventional methods to deep generative models

Data Science Seminar

Jiajia Yu (Duke University)

Abstract

Mean-field games study the behavior of a large number of rational agents in a non-cooperative game and have wide applications in various fields. However, solving a mean-field game numerically is not easy because of its complicated structure.

In the first part of my talk, I will present an efficient and flexible algorithm for dynamic mean-field games. The algorithm is based on an accelerated proximal gradient method. It consists of an easy-to-implement gradient descent step and a projection step equivalent to solving an elliptic equation. We also extend the setting of mean-field games and the algorithm to manifolds. In the second part of my talk, I will bridge mean-field games with a deep generative model called normalizing flows. The connection gives a computational approach for high-dimensional mean-field games and improves the training of the generative model.

The first part is based on joint works with Rongjie Lai (Purdue), Wuchen Li (UofSC) and Stanley Osher (UCLA). The second part is based on a joint work with Han Huang (RPI), Rongjie Lai (Purdue) and Jie Chen (IBM).
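For orientation, the accelerated proximal gradient scheme mentioned in the first part alternates a gradient step on the smooth term with a proximal/projection step. The sketch below is a generic FISTA-style loop on a LASSO toy problem; in the mean-field game setting the proximal step would instead be the projection obtained by solving an elliptic equation, so this is only a structural illustration.

```python
import numpy as np

def fista(grad_f, prox_g, x0, step, iters=200):
    """Accelerated proximal gradient: gradient step on f, prox step on g."""
    x, y, t = x0.copy(), x0.copy(), 1.0
    for _ in range(iters):
        x_new = prox_g(y - step * grad_f(y), step)    # gradient step + prox/projection
        t_new = (1 + np.sqrt(1 + 4 * t**2)) / 2
        y = x_new + (t - 1) / t_new * (x_new - x)     # Nesterov-type acceleration
        x, t = x_new, t_new
    return x

# Toy instance: min_x 0.5 * ||Ax - b||^2 + lam * ||x||_1
rng = np.random.default_rng(0)
A, b, lam = rng.standard_normal((50, 100)), rng.standard_normal(50), 0.1
grad_f = lambda x: A.T @ (A @ x - b)
prox_g = lambda v, s: np.sign(v) * np.maximum(np.abs(v) - lam * s, 0.0)   # soft threshold
L = np.linalg.norm(A, 2) ** 2                         # Lipschitz constant of grad_f
x_hat = fista(grad_f, prox_g, np.zeros(100), step=1.0 / L)
print(np.count_nonzero(x_hat), "nonzero coefficients")
```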

Distinct spatiotemporal tumor-immune ecologies define therapeutic response in NSCLC patients

Industrial Problems Seminar 

Sandhya Prabhakaran (Moffitt Cancer Centre)

Abstract

The talk will be geared towards a general audience. The goal of this talk is to explain the importance of data and the many ways data can be analyzed to benefit patient care. In this talk, I will focus on non-small cell lung cancer (NSCLC), the patient data we obtained, the computational approaches used, and the potential biomarkers we identified in this process.

How much can one learn a PDE from its solution?

Data Science Seminar

Yimin Zhong (Auburn University)

Abstract

In this work we study a few basic questions for PDE learning from observed solution data. Using various types of PDEs, we show 1) how the approximate dimension (richness) of the data space spanned by all snapshots along a solution trajectory depends on the differential operator and the initial data, and 2) the identifiability of a differential operator from solution data on local patches. We then propose a consistent and sparse local regression method (CaSLR) for general PDE identification. Our method is data driven and requires a minimal amount of local measurements in space and time from a single solution trajectory, enforcing global consistency and sparsity.
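To make the identification step concrete, the sketch below recovers the heat equation from a single simulated solution trajectory by regressing the time derivative onto a small library of candidate terms and hard-thresholding for sparsity. It is a generic sparse-regression illustration in the spirit of, but not identical to, CaSLR; the candidate library, finite-difference stencils, and threshold are assumptions.

```python
import numpy as np

# Simulate the "unknown" PDE u_t = 0.5 * u_xx on a periodic grid (forward Euler).
nx, nt, nu = 128, 400, 0.5
dx, dt = 2 * np.pi / nx, 1e-3
x = np.arange(nx) * dx
u = np.sin(x) + 0.5 * np.cos(2 * x)
snapshots = [u.copy()]
for _ in range(nt):
    u_xx = (np.roll(u, -1) - 2 * u + np.roll(u, 1)) / dx**2
    u = u + dt * nu * u_xx
    snapshots.append(u.copy())
U = np.array(snapshots)                               # (nt+1, nx) solution trajectory

# Finite-difference features on interior snapshots.
Ut   = (U[2:] - U[:-2]) / (2 * dt)
Ux   = (np.roll(U, -1, axis=1) - np.roll(U, 1, axis=1))[1:-1] / (2 * dx)
Uxx  = (np.roll(U, -1, axis=1) - 2 * U + np.roll(U, 1, axis=1))[1:-1] / dx**2
Umid = U[1:-1]

# Candidate library: u_t ~ c1*u + c2*u_x + c3*u_xx + c4*u*u_x
Theta = np.stack([Umid, Ux, Uxx, Umid * Ux], axis=-1).reshape(-1, 4)
ut = Ut.reshape(-1)
coef, *_ = np.linalg.lstsq(Theta, ut, rcond=None)
coef[np.abs(coef) < 1e-2] = 0.0                       # simple hard threshold for sparsity
print(dict(zip(["u", "u_x", "u_xx", "u*u_x"], coef.round(3))))   # expect u_xx coefficient near 0.5
```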

Navigating Interdisciplinary Research as a Mathematician

Industrial Problems Seminar

Julie Mitchell (Oak Ridge National Laboratory)

Abstract

Being effective in industrial and team science settings requires the ability to work across disciplines. In this talk, I will reflect on how to be successful working across disciplines and what types of opportunities exist for mathematicians working at national laboratories. I will also reflect on past projects I’ve pursued, which include high-performance computing and machine learning approaches to the understanding of macromolecular structure and binding.