Past Events

Math-to-Industry Boot Camp VIII

Organizers:

The Math-to-Industry Boot Camp is an intense six-week session designed to provide graduate students with training and experience that is valuable for employment outside of academia. The program is targeted at Ph.D. students in pure and applied mathematics. The boot camp consists of courses in the basics of programming, data analysis, and mathematical modeling. Students work in teams on projects and are provided with training in resume and interview preparation as well as teamwork.

There are two group projects during the session: a small-scale project designed to introduce the concept of solving open-ended problems and working in teams, and a "capstone project" posed by industrial scientists. Recent industrial sponsors included Cargill, Securian Financial, and CH Robinson. Weekly seminars by speakers from many industry sectors provide the students with opportunities to learn about a variety of possible future careers.

Capstone Projects

Evaluating the real-world safety and robustness of deep learning models

Charles Godfrey, Pacific Northwest National Laboratory
Henry Kvinge, Pacific Northwest National Laboratory

Team: Kean Fallon, Iowa State University; Aidan Lorenz, Vanderbilt University; Jessie Loucks-Tavitas, University of Washington; Sandra Annie Tsiorintsoa, Clemson University; Benjamin Warren, Texas A&M University

Abstract: Deep learning has shown remarkable capabilities in a range of important tasks, but at the same time has also been shown to be brittle in many ways, especially in real-world deployed environments. This can range from a language model that can be triggered to insult users to a vision model that doesn’t recognize machine parts in certain lighting conditions. Understanding how a model will behave at deployment is a serious problem in real-world AI and requires a mixture of mathematical and out-of-the-box thinking. In this capstone project, participants will be asked to evaluate models delivered to a client by third-party vendors from the perspective of overall robustness. This will include (i) evaluating how well a model performs outside of its test set and (ii) identifying particular failure modes of the model that should be avoided. The final project deliverable will include a short report recommending for or against proceeding with use of the model in the client’s application.
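A minimal sketch of evaluation (i) might look like the following, with an entirely hypothetical stand-in model and synthetic data (nothing here comes from the actual capstone): the model is scored on its clean test set and then re-scored under growing input noise, a crude proxy for conditions outside the vendor's own test distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a vendor-delivered model: classify points
# by the sign of their first coordinate.
def model_predict(x):
    return (x[:, 0] > 0).astype(int)

# Toy in-distribution test set: two well-separated clusters.
x_test = np.concatenate([rng.normal(-2, 0.5, (100, 2)),
                         rng.normal(+2, 0.5, (100, 2))])
y_test = np.concatenate([np.zeros(100, dtype=int), np.ones(100, dtype=int)])

def accuracy(x, y):
    return float((model_predict(x) == y).mean())

clean_acc = accuracy(x_test, y_test)

# Robustness probe: re-evaluate under increasing input noise, mimicking
# a distribution shift away from the vendor's test conditions.
results = {}
for sigma in (0.5, 1.0, 2.0):
    noisy = x_test + rng.normal(0, sigma, x_test.shape)
    results[sigma] = accuracy(noisy, y_test)
    print(f"sigma={sigma}: accuracy={results[sigma]:.3f}")
```

The gap between clean and perturbed accuracy is one simple ingredient a robustness report could summarize.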


Interpolating the Implied Volatility Surface

Chris Bemis, X Cubed Capital Management

Team: Qinying Chen, University of Delaware; Nellie Garcia, University of Minnesota; Emily Gullerud, University of Minnesota; Shaoyu Huang, University of Pittsburgh; Pascal Kingsley (PK) Kataboh, University of Delaware; Matthew Williams, Colorado State University

Abstract: Financial markets price volatility in underlying securities primarily through what are called options.  These options are defined by reference to their payoffs and the date at which they expire, along with other features such as prevailing rates, the underlying security price, and so on.  The result is that markets reference a surface of implied volatilities based on market prices. 
 
In this project, we will fit such surfaces in financially meaningful ways, focusing especially on precluding arbitrage opportunities in the resulting interpolation. These methods are critical in creating assessments of constant-expiry volatility time series, among many other applications. They also sometimes suffer from a lack of stability in parameter estimation as new surfaces are fit.
 
We will use real (and noisy) data with the goal of efficiently creating stable volatility surface interpolations and time series of constant expiry volatility.
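As a flavor of the kind of check involved, the sketch below evaluates the raw SVI parameterization of total implied variance on two expiry slices and tests the calendar-spread no-arbitrage condition; all parameter values are illustrative, not fitted to market data, and this is not the project's method.

```python
import numpy as np

# Raw SVI parameterization of total implied variance as a function of
# log-moneyness k: w(k) = a + b*(rho*(k - m) + sqrt((k - m)**2 + s**2)).
def svi_total_variance(k, a, b, rho, m, s):
    return a + b * (rho * (k - m) + np.sqrt((k - m) ** 2 + s ** 2))

k = np.linspace(-0.5, 0.5, 101)

# Two expiry slices (3 months and 1 year), each with illustrative parameters.
w_3m = svi_total_variance(k, a=0.01, b=0.10, rho=-0.4, m=0.0, s=0.1)
w_1y = svi_total_variance(k, a=0.03, b=0.15, rho=-0.4, m=0.0, s=0.2)

# A simple calendar-spread no-arbitrage check: total implied variance
# must be non-decreasing in expiry at every log-moneyness.
calendar_ok = bool(np.all(w_1y >= w_3m))

# Implied vols recovered from total variance: sigma = sqrt(w / T).
vol_3m = np.sqrt(w_3m / 0.25)
vol_1y = np.sqrt(w_1y / 1.00)
print("calendar arbitrage-free:", calendar_ok)
```

A fitted surface would additionally need butterfly (strike-direction) no-arbitrage checks, which the project considers alongside the calendar condition.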


Multimodal Search in eCommerce

Christopher Miller, eBay

Team: Tanuj Gupta, Texas A&M University; Meagan Kenney, University of Minnesota; Pavel Kovalev, Indiana University; Chiara Mattamira, University of Tennessee; Jeremy Shahan, Louisiana State University; Hannah Solomon, Texas A&M University

Abstract: In classical search, most or all user interfaces are text-based. Users submit queries made up of strings and may apply filters (also text-based) to limit the result set. When a user does not like their results, they can “requery” with slightly different terms to produce better results. This process continues until the user is satisfied or gives up.

In visual search, users submit images for their queries. The query images might come from the internet, other eCommerce sites, or from the users’ own library. They expect to see results that look similar to their query image. If the user does not like their results, now they’re stuck: they cannot simply tweak an image the way they can tweak a text-based query.

This is the problem we will resolve with a two-phase multimodal search. The first phase is regular visual search. In the second phase, users can add text to their query to augment their search results. For example, a user submits a photo of a yellow dress they have at home, but adds the text “green dress” to get results that are green but otherwise similar to the dress they already have. This enables users to iteratively improve their search results just like they would in classical search.
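A toy sketch of the two-phase idea, with made-up 4-dimensional embeddings standing in for a real multimodal encoder (such as CLIP) and a three-item catalog; everything here is hypothetical, including the weighting choice.

```python
import numpy as np

# Hypothetical embeddings in a shared image/text space; the four
# dimensions stand in for (dress-ness, yellow, green, other).
catalog = {
    "yellow dress": np.array([1.0, 1.0, 0.0, 0.0]),
    "green dress":  np.array([1.0, 0.0, 1.0, 0.0]),
    "green shirt":  np.array([0.0, 0.0, 1.0, 1.0]),
}

def normalize(v):
    return v / np.linalg.norm(v)

def search(query_vec, k=1):
    # Rank catalog items by cosine similarity to the query embedding.
    scores = {name: float(normalize(query_vec) @ normalize(v))
              for name, v in catalog.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Phase 1: pure visual search with the photo of the yellow dress.
image_query = np.array([1.0, 1.0, 0.0, 0.0])
phase1 = search(image_query)

# Phase 2: augment the image query with the text "green dress",
# weighting the text cue more heavily so it can steer the results.
text_query = np.array([1.0, 0.0, 1.0, 0.0])
combined = normalize(image_query) + 2.0 * normalize(text_query)
phase2 = search(combined)
print(phase1, phase2)
```

Phase 1 retrieves the item most similar to the photo alone, while phase 2 retrieves a green item that is otherwise similar to the query image.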


An Excess Demand Model of Home Price Appreciation

Christopher Jones, US Bank
Matt Mansell, US Bank
Leo Digiosia, US Bank

Team: Ismail Aboumal, California Institute of Technology; Daniela Beckelhymer, University of Minnesota; Jarrad Botchway, Missouri University of Science and Technology; Jordan Pellett, University of Tennessee; Marshall Smith, University of Minnesota

Abstract: National home prices in the U.S. are tracked by one of a few indices, the Case-Shiller and FHFA home price indices being the most popular. Home price appreciation is an important metric tracked by commercial banks. Because bank originators have the ability to hold mortgages on their balance sheets, refinance activity such as cash-out and rate/term refinances contributes to the interest rate and macroeconomic risk of these assets. Traditionally, home prices are forecast with econometric regressions built on multiple correlated variables. However, these models often leave decision-makers at a bank wanting in terms of interpretability and insight into the mortgage market. In this project, we will explore home prices from the point of view of a differential equation so that we can obtain forecasted values of home price appreciation on a variety of time scales. We will explore the conceptual soundness of a model of excess demand and quantify uncertainty around parameter estimation and shape optimization. We will create a story around this model to explain past events and potential future scenarios.
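To illustrate the excess-demand viewpoint in miniature (with entirely made-up linear demand and supply curves, not the project's model), a simple Euler simulation of dP/dt = alpha * (demand(P) - supply(P)):

```python
# Illustrative excess-demand price dynamics: prices appreciate while
# demand exceeds supply and settle where the two balance. All
# parameter values below are invented for illustration.
def demand(p):
    return 100.0 - 0.4 * p   # buyers thin out as prices rise

def supply(p):
    return 20.0 + 0.4 * p    # listings grow as prices rise

alpha, dt = 0.05, 0.1
p = 60.0                     # initial price index level
for _ in range(2000):
    p += dt * alpha * (demand(p) - supply(p))

# Analytic equilibrium solves demand(P) = supply(P): 80 = 0.8 P.
equilibrium = 100.0
print(round(p, 2), equilibrium)
```

The simulated path converges to the equilibrium where excess demand vanishes; the project's model asks what richer, calibrated dynamics of this type imply for home price appreciation forecasts.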

Workshop on Random Structures in Optimizations and Related Applications

Applications due April 30.

Scope:

  • This summer program aims to promote study and research activity on random optimization in complex systems among Minnesota's local undergraduate students.
  • The workshop will cover a wide range of subjects and tools in probability theory and mathematical physics, especially addressing their applications in machine learning, data science, and image processing.
  • During the 10-day program, students are expected to attend two daily lecture sessions and a group problem session. Additional professional development sessions will discuss graduate school and careers in related fields.
  • Upon completion, students will receive a certificate issued by the School of Mathematics at the University of Minnesota.

Who can apply

Undergraduate students from Minnesota's local colleges and universities. 

Prerequisites

Introductory Probability, Linear Algebra, and Basic Properties of Differential Equations

Schedule

Week 1: June 5-9

Time | Instructor | Topic
9:00-10:15am | Wei-Kuo Chen | Statistical Physics and Random Optimizations
10:45am-12:00pm | Arnab Sen | Clustering and Community Detection
1:30-3:30pm | Ratul Biswas | Discussion and Problem Session

Week 2: June 12-16

Time | Instructor | Topic
9:00-10:15am | Rishabh Dudeja | Universality in High-Dimensional Optimization and Statistics Detection
10:45am-12:00pm | Wonjun Lee | Introduction to Computational Optimal Transport
1:30-3:30pm | Heejune Kim | Discussion and Problem Session

Application

Application materials:

  1. A brief CV
  2. A short recommendation letter from a professor
  3. Personal statement describing scientific interests and course preparations for this workshop

When filling in the Application Form, please only select either "Local expenses (hotel and meals)" or "Not requesting funding."


Financial support

The participants will receive either a fixed per diem or a meal plan to cover food. Support is available for students in need of on-campus lodging during the program.

Organizer

Wei-Kuo Chen (University of Minnesota)


This program is financially supported by the National Science Foundation and the Institute for Mathematics and its Applications.

Developing Online Learning Experiments Using Doenet (2023)

Organizers

In this five-day workshop, participants will learn how to create and implement online learning experiments using the Distributed Open Education Network (Doenet, doenet.org). Doenet is designed to help faculty critically evaluate how different content choices influence student learning in their classrooms. Doenet enables instructors to quickly test hypotheses regarding the relative effectiveness of alternative approaches by providing tools to assign different variations of an activity and analyze the resulting data.

Following brief introductions and demos of features of the Doenet platform, participants will work in small groups to develop learning experiments that can be used in the college classroom, assisted by the developers of Doenet. The expectation is that participants will leave the workshop with a learning experiment that they can use in their classroom the following year.

The workshop will run from 9 AM on Monday, May 22 through 4 PM on Friday, May 26. All organized activities will occur between 9 AM and 4 PM each day.

The workshop is open to faculty at all levels teaching STEM courses.

To apply, please submit the following documents through the Program Application link at the top of the page:

  1. A personal statement briefly (200 words or less) stating what you hope to contribute to the discussion on learning experiments and what you hope to gain from this workshop. Include courses you teach for which you'd like to develop learning experiments. Priority will be given to those able to run learning experiments in their courses in the following year.
  2. A brief CV or resume. (A list of publications is not necessary.)

This workshop is fully funded by the National Science Foundation. All accepted participants who request funding for travel and/or local expenses will receive support. There is no registration fee.

Participants who perform learning experiments on Doenet during the following academic year will be eligible to receive a small stipend to support their work.

Deadline for full consideration: April 17, 2023.

Supported by NSF grant DUE 1915363.

ScreeNOT: Optimal Singular Value Thresholding and Principal Component Selection in Correlated Noise

Data Science Seminar

Elad Romanov (Stanford University)

Abstract

Principal Component Analysis (PCA) is a fundamental and ubiquitous tool in statistics and data analysis.

The bare-bones idea is this. Given a data set of n points y_1, ..., y_n, form their sample covariance S. Eigenvectors corresponding to large eigenvalues--namely, directions along which the variation within the data set is large--are usually thought of as "important" or "signal-bearing"; in contrast, weak directions are often interpreted as "noise" and discarded in subsequent steps of the data analysis pipeline. Principal component (PC) selection is an important methodological question: how large should an eigenvalue be to be considered "informative"?
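A quick numerical illustration of the setup, on synthetic data with one planted signal direction (all values chosen for illustration only): the top eigenvalue of the sample covariance stands well above the noise bulk, and deciding where to cut is exactly the PC-selection question.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: n points in d dimensions, isotropic noise plus one
# strong planted "signal" direction along the first coordinate.
n, d = 500, 20
signal_dir = np.zeros(d)
signal_dir[0] = 1.0
y = rng.normal(size=(n, d)) + 3.0 * rng.normal(size=(n, 1)) * signal_dir

# Sample covariance and its spectrum, sorted in decreasing order.
S = np.cov(y, rowvar=False)
eigvals = np.sort(np.linalg.eigvalsh(S))[::-1]

# One eigenvalue separates cleanly from the bulk of noise eigenvalues.
print(eigvals[:3].round(2))
```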

Our main deliverable is ScreeNOT: a novel, mathematically-grounded procedure for PC selection. It is intended as a fully algorithmic replacement for the heuristic and somewhat vaguely-defined procedures that practitioners often use--for example the popular "scree test".

Towards tackling PC selection systematically, we model the data matrix as a low-rank signal plus noise matrix Y = X + Z; accordingly, PC selection is cast as an estimation problem for the unknown low-rank signal matrix X, with the class of permissible estimators being singular value thresholding rules. We consider a formulation of the problem under the spiked model. This asymptotic setting captures some important qualitative features observed across numerous real-world data sets: most of the singular values of Y are arranged neatly in a "bulk", with very few large outlying singular values exceeding the bulk edge. We propose an adaptive algorithm that, given a data matrix, finds the optimal truncation threshold in a data-driven manner under essentially arbitrary noise conditions: we only require that Z has a compactly supported limiting spectral distribution--which may be a priori unknown. Under the spiked model, our algorithm is shown to have rather strong oracle optimality properties: not only does it attain the best error asymptotically, but it also achieves (w.h.p.) the best error--compared to all alternative thresholds--at finite n.
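The generic singular-value-thresholding estimator in the Y = X + Z formulation can be sketched as follows. Note the fixed threshold below is a simple illustrative choice for this particular noise normalization, not the data-driven threshold that ScreeNOT computes.

```python
import numpy as np

rng = np.random.default_rng(1)

# Rank-1 signal plus noise: Y = X + Z, with noise scaled so that the
# bulk of its singular values ends near 2.
n = 200
u = rng.normal(size=(n, 1))
v = rng.normal(size=(n, 1))
X = 5.0 * (u / np.linalg.norm(u)) @ (v / np.linalg.norm(v)).T
Z = rng.normal(size=(n, n)) / np.sqrt(n)
Y = X + Z

# Hard singular value thresholding: keep only components whose
# singular values exceed the (here, hand-picked) threshold.
U, s, Vt = np.linalg.svd(Y, full_matrices=False)
threshold = 2.5  # above the noise bulk edge (~2 in this scaling)
k = int((s > threshold).sum())
X_hat = (U[:, :k] * s[:k]) @ Vt[:k]

err_trunc = np.linalg.norm(X_hat - X) / np.linalg.norm(X)
err_full = np.linalg.norm(Y - X) / np.linalg.norm(X)
print(k, round(err_trunc, 3), round(err_full, 3))
```

Thresholding recovers the planted rank and a far better estimate of X than the raw data matrix; choosing that threshold optimally, adaptively, and under unknown correlated noise is the contribution of ScreeNOT.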

This is joint work with Matan Gavish (Hebrew University of Jerusalem) and David Donoho (Stanford).

Some Elementary Economics (& Physics) of the Electricity Grid

Industrial Problems Seminar

Sriharsha (Harsha) Veeramachaneni (WindLogics)

Abstract

This simplified, tutorial-style talk will delve into the fundamentals of how electricity is transacted in the North American grid: how electricity prices are determined; how the physics of electricity flow affects prices, and some counterintuitive consequences thereof; and how all this relates to the business of profitably operating a power plant.

Squishy Mathematical Reasoning in a Robotics Start-up

Industrial Problems Seminar

Michelle Snider (Service Robotics & Technologies)

Abstract

Service Robotics & Technologies (SRT Labs) brings legacy infrastructure, smart sensors, and collaborative robotics into a unified data management ecosystem in order to monitor, analyze and automate systems.  Applications range from smart laboratories to smart buildings to smart cities. The smart technology space provides a wealth of interesting projects which may not immediately sound like math problems but whose solutions often greatly benefit from a mathematical perspective. In this talk, I will discuss some different projects where my team applied mathematical approaches to find realistically implementable solutions, interspersed with career lessons learned along the way.

A Varied and Winding Math Career in Industry

Industrial Problems Seminar

Laura Lurati (Edward Jones)

Abstract

In this talk, I'll share my personal career path as an applied mathematician both from the perspective of the various industries I've worked in (aerospace, finance, real estate, and software engineering) and my own transition from an individual contributor to management. I'll give an overview of the types of problems I worked on in each of these fields and the common skills that have helped me throughout my career. Finally, I will share some of my work as a builder of high-performing teams, the rewards of management, and what I look for in candidates when hiring new teammates.  As a key message, I hope to share that a career in applied mathematics can take very interesting turns if you are open to new possibilities and continual learning. 
 

Learning in Stochastic Games

Data Science Seminar

Muhammed Omer Sayin (Bilkent University)

Abstract

Reinforcement learning (RL) has been the backbone of many frontier artificial intelligence (AI) applications, such as game playing and autonomous driving, by addressing how intelligent and autonomous systems should engage with an unknown dynamic environment. The progress and interest in AI are now transforming social systems with human decision-makers, such as (consumer/financial) markets and road traffic, into socio-technical systems with AI-powered decision-makers. However, self-interested AI can undermine the social systems designed and regulated for humans. We are delving into the uncharted territory of AI-AI and AI-human interactions. The new grand challenge is to predict and control the implications of AI selfishness in AI-X interactions with systematic guarantees. Hence, there is now a critical need to study self-interested AI dynamics in complex and dynamic environments through the lens of game theory.

In this talk, I will present the recent steps we have taken toward the foundation of how self-interested AI would and should interact with others by bridging the gap between game theory and practice in AI-X interactions. I will specifically focus on stochastic games to model the interactions in complex and dynamic environments since they are commonly used in multi-agent reinforcement learning. I will present new learning dynamics converging almost surely to equilibrium in important classes of stochastic games. The results can also be generalized to the cases where (i) agents do not know the model of the environment, (ii) do not observe opponent actions, (iii) can adopt different learning rates, and (iv) can be selective about which equilibrium they will reach for efficiency. The key idea is to use the power of approximation thanks to the robustness of learning dynamics to perturbations. I will conclude my talk with several remarks on possible future research directions for the framework presented.
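As a toy stand-in for learning dynamics in games (not the talk's algorithms, and a one-shot matrix game rather than a stochastic game), fictitious play in matching pennies: each player best-responds to the opponent's empirical action frequencies, and those frequencies converge to the mixed Nash equilibrium (1/2, 1/2).

```python
import numpy as np

# Row player's payoff matrix for matching pennies (zero-sum:
# the column player receives the negation).
A = np.array([[1.0, -1.0], [-1.0, 1.0]])

counts = [np.ones(2), np.ones(2)]  # smoothed empirical action counts
for _ in range(20000):
    freq = [c / c.sum() for c in counts]
    a_row = int(np.argmax(A @ freq[1]))       # row best-responds to column
    a_col = int(np.argmax(-(A.T) @ freq[0]))  # column best-responds to row
    counts[0][a_row] += 1
    counts[1][a_col] += 1

emp = [c / c.sum() for c in counts]
print(np.round(emp[0], 2), np.round(emp[1], 2))
```

Actual play cycles, but the empirical frequencies settle near the equilibrium; the talk's learning dynamics extend this kind of guarantee to stochastic games with unknown environments.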


Continuous-time probabilistic generative models for dynamic networks

Data Science Seminar

Kevin Xu (Case Western Reserve University)

Abstract

Networks are ubiquitous in science, serving as a natural representation for many complex physical, biological, and social systems. Probabilistic generative models for networks provide plausible mechanisms by which network data are generated to reveal insights about the underlying complex system. Such complex systems are often time-varying, which has led to the development of dynamic network representations to enable modeling, analysis, and prediction of temporal dynamics.

In this talk, I introduce a class of continuous-time probabilistic generative models for dynamic networks that augment statistical models for network structure with multivariate Hawkes processes to model temporal dynamics. The class of models allows an analyst to trade off flexibility and scalability of a model depending on the application setting. I focus on two specific models on opposite ends of the tradeoff: the community Hawkes independent pairs (CHIP) model that scales up to millions of nodes, and the multivariate Community Hawkes (MULCH) model that is flexible enough to replicate a variety of observed structures in real network data, including temporal motifs. I demonstrate how these models can be used for analysis, prediction, and simulation on several real network data sets, including a network of militarized disputes between countries over time.
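A univariate Hawkes process with exponential kernel -- the self-exciting temporal building block underlying these models -- can be simulated with Ogata's thinning algorithm. The sketch below uses illustrative parameters; in the dynamic-network models, each node pair (or block pair) carries its own such process.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hawkes intensity with exponential kernel:
#   lambda(t) = mu + sum_{t_i <= t} alpha * beta * exp(-beta * (t - t_i)).
mu, alpha, beta, horizon = 0.5, 0.5, 2.0, 100.0

def intensity(t, events):
    past = np.asarray(events)
    if past.size == 0:
        return mu
    return mu + (alpha * beta * np.exp(-beta * (t - past))).sum()

# Ogata's thinning: propose candidate times from a dominating rate,
# then accept each with probability lambda(candidate) / bound. Between
# events the intensity only decays, so the current value is a valid bound.
events, t = [], 0.0
while t < horizon:
    lam_bar = intensity(t, events)
    t += rng.exponential(1.0 / lam_bar)
    if t < horizon and rng.uniform() < intensity(t, events) / lam_bar:
        events.append(t)

# With branching ratio alpha = 0.5, the stationary event rate is
# mu / (1 - alpha) = 1 event per unit time, so roughly 100 events.
print(len(events))
```

Fitting such processes to observed event timestamps (rather than simulating them) is the estimation problem the CHIP and MULCH models solve at network scale.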