Past Events

Math-to-Industry Boot Camp VI

Advisory: Application deadline is March 7, 2021

2021 Summer Boot Camp poster

Organizers:

Thomas Hoft, University of St. Thomas
Daniel Spirn, University of Minnesota, Twin Cities

The Math-to-Industry Boot Camp is an intense six-week session designed to provide graduate students with training and experience that is valuable for employment outside of academia. The program is targeted at Ph.D. students in pure and applied mathematics. The boot camp consists of courses in the basics of programming, data analysis, and mathematical modeling. Students work in teams on projects and are provided with training in resume and interview preparation as well as teamwork.

There are two group projects during the session: a small-scale project designed to introduce the concept of solving open-ended problems and working in teams, and a "capstone project" that is posed by industrial scientists. Recent industrial sponsors included D-Wave Systems, ExxonMobil, Los Alamos National Laboratory, the Milwaukee Brewers, and Starbucks.

Weekly seminars by speakers from many industry sectors provide the students with opportunities to learn about a variety of possible future careers.

Eligibility

Applicants must be current graduate students in a Ph.D. program at a U.S. institution during the period of the boot camp.

Logistics

The program will take place online. Students will receive an $800 stipend.

Applications

To apply, please supply the following materials through the link at the top of the page:

Statement of reason for participation, career goals, and relevant experience
Unofficial transcript and evidence of good standing and full-time status
Letter of support from advisor, director of graduate studies, or department chair

Selection will be based on background and statement of interest, as well as geographic and institutional diversity. Women and minorities are especially encouraged to apply. Selected participants will be contacted in April.

Participants

Name	Department	Affiliation
Douglas Armstrong	Department of Data Science	Securian Financial
Yuchen Cao	Department of Mathematics	University of Central Florida
Samara Chamoun	Department of Mathematics	Michigan State University
Ana Chavez Caliz	Department of Mathematics	Pennsylvania State University
Alexander Estes	Institute for Mathematics and its Applications	University of Minnesota, Twin Cities
Raymond Friend Jr	Department of Mathematics	Pennsylvania State University
Ghodsieh Ghanbari	Department of Mathematics and Statistics	Mississippi State University
Marc Haerkoenen	School of Mathematics	Georgia Institute of Technology
Tony Haines	Department of Computational and Applied Mathematics	Old Dominion University
Natalie Heer		CH Robinson
Thomas Hoft	Department of Mathematics	University of St. Thomas
Alicia Johnson	Department of Mathematics, Statistics, and Computer Science	Macalester College
Malick Kebe	Department of Mathematics	Howard University (Washington, DC, US)
Juergen Kritschgau	Department of Mathematics	Iowa State University
Marshall Lagani	Department of Data Science	Securian Financial
Kevin Leder	Department of Industrial and Systems Engineering	University of Minnesota, Twin Cities
Ivan Marin		Cargill, Inc.
Francisco Martinez Figueroa	Department of Mathematics	The Ohio State University
Avishek Mukherjee	Department of Mathematical Sciences	University of Delaware (Newark, DE, US)
Muharrem Otus	Department of Mathematics	University of Pittsburgh
Smita Praharaj	Department of Mathematics	University of Missouri
Tanmay Raj		Cargill, Inc.
Abba Ramadan	Department of Applied Mathematics	University of Kansas
Samanwita Samal	Department of Mathematics	Indiana University
Natalie Sheils		UnitedHealth Group
Blerta Shtylla		Pfizer
David Shuman	Department of Mathematics, Statistics and Computer Science	Macalester College
Lauren Snider	Department of Mathematics	Texas A & M University
Daniel Spirn	University of Minnesota	University of Minnesota, Twin Cities
Elizabeth Sprangel	Department of Mathematics	Iowa State University
Kaisa Taipale	Contractual Pricing Group	CH Robinson
Sijie Tang	Department of Mathematics	University of Wyoming
Cameron Thieme	Department of Mathematics	University of Minnesota, Twin Cities
Shuxian Xu	Department of Mathematics	University of Pittsburgh
Lei Yang	Department of Mathematics	Northeastern University
Grace Zhang	School of Mathematics	University of Minnesota, Twin Cities
Miao Zhang	Department of Mathematics	Louisiana State University
Jennifer Zhu	Department of Mathematics	Texas A & M University
Ahmed Zytoon	Department of Mathematics	University of Pittsburgh

Projects and teams

Team 1 — Cargill: Hydrologic Energy Generation Optimization

Mentor Ivan Marin, Cargill Corporation
Mentor Tanmay Raj, Cargill Corporation
Ana Chavez Caliz, Pennsylvania State University
Francisco Martinez Figueroa, Ohio State University
Juergen Kritschgau, Iowa State University
Avishek Mukherjee, University of Delaware
Smita Praharaj, University of Missouri
Cameron Thieme, University of Minnesota
Jennifer Zhu, Texas A & M University

The increased penetration of variable renewable energy (VRE) and the phase-out of nuclear and other conventional electricity generation sources will require additional flexibility in the power grid and a narrower gap between generation and demand, and they raise the question of how these changes will influence energy pricing in the short and long term. Clean water is essential for hydropower, which is the main source of electrical power generation in Brazil. Because water resources are limited and precipitation is variable, there is a need to investigate the optimal management of these resources in order to meet power grid demand and to predict power generation capacity, given historical rain patterns, reservoir water levels, and energy demands.

Team 2 — Securian Financial: Predicting Group Life Client Mortality During a Pandemic

Mentor Douglas Armstrong, Securian Financial
Yuchen Cao, University of Central Florida
Samara Chamoun, Michigan State University
Marc Haerkoenen, Georgia Institute of Technology
Abba Ramadan, University of Kansas
Lei Yang, Northeastern University
Shuxian Xu, University of Pittsburgh

During a pandemic, the ability to predict client risk becomes paramount to managing risk effectively. The impact of a pandemic may differ depending on the demographics and regional considerations of each client. This adds complexity to the analysis and forecasting of the future risk a client may pose. In this project, students will enrich a simulated client dataset with publicly available data before developing a machine-learning-based approach to predict adverse risk of multiple clients.

Team 3 — CH Robinson: Impact of Weather and Agricultural Events on Truckload Cost Per Mile

Mentor Kaisa Taipale, CH Robinson
Raymond Friend Jr, Pennsylvania State University
Ghodsieh Ghanbari, Mississippi State University
Tony Haines, Old Dominion University
Malick Kebe, Howard University
Elizabeth Sprangel, Iowa State University
Grace Zhang, University of Minnesota

Fresh fruits and vegetables are an important group of commodities in the US, commonly transported by truck from fields in predominantly southern growing regions across the US (for instance, from California to the Northeast). While irrigation dampens the effect of rainfall on crop yields, temperature and rainfall are still important factors in the timing of fresh fruit and vegetable harvest and thus transport. This work will examine the magnitude of the impact of vegetable harvest timing on transportation costs, using external inputs like temperature and rainfall as well as variables intrinsic to the truckload market. Challenges include combining the geographic characteristics of the time series involved: univariate time series methods provide some benefit, but stronger results come from exploiting geography and freight characteristics. Bayesian models and causal impact analysis are natural tools for this application.

Team 4 — CH Robinson: CH Robinson Volume Simulation

Mentor Natalie Heer, CH Robinson
Mentor Bethany Stai, CH Robinson
Mentor Michael Chmutov, CH Robinson
Mentor Kaisa Taipale, CH Robinson
Muharrem Otus, University of Pittsburgh
Samanwita Samal, Indiana University
Lauren Snider, Texas A & M University
Sijie Tang, University of Wyoming
Miao Zhang, Louisiana State University
Ahmed Zytoon, University of Pittsburgh

In economics, there is classically an inverse relationship between the price of an item and the quantity of the item that customers will choose to purchase. If prices increase, customers will purchase fewer items, and if prices decrease, customers will purchase more. If companies can predict the volume change associated with a change in price, they can optimize their pricing strategy for overall profitability, max(Unit Price * Volume). The goal of this project is to help CHR be smarter in optimizing our business strategy.
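
As a toy illustration of the max(Unit Price * Volume) objective, the sketch below assumes a linear demand curve, volume = a - b * price. The demand form, the coefficients a and b, and the function names are all hypothetical and chosen only for illustration; they are not CH Robinson's model or data.

```python
def volume(price, a=1000.0, b=4.0):
    """Hypothetical linear demand: volume falls as price rises (never below 0)."""
    return max(a - b * price, 0.0)


def revenue(price, a=1000.0, b=4.0):
    """Objective from the project description: Unit Price * Volume."""
    return price * volume(price, a, b)


def best_price(prices, a=1000.0, b=4.0):
    """Pick the candidate price with the largest Unit Price * Volume."""
    return max(prices, key=lambda p: revenue(p, a, b))


# Candidate prices on a grid from 0.0 to 250.0 in steps of 0.5 (assumed range).
candidate_prices = [p * 0.5 for p in range(0, 501)]
p_star = best_price(candidate_prices)
print(p_star, revenue(p_star))
```

For a linear demand curve the optimum can also be found analytically at a / (2b); the grid search is used here only because it carries over unchanged to demand curves estimated from data.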

Winter Math-to-Industry Boot Camp

Advisory: Application deadline is Friday, December 4, 2020

2021 Winter Virtual Boot Camp poster

Organizers:

Jasmine Foo, University of Minnesota, Twin Cities
Thomas Hoft, University of St. Thomas
Daniel Spirn, University of Minnesota, Twin Cities

The Winter Math-to-Industry Boot Camp is an intensive, two-week program that provides graduate students with training and experience that is valuable for employment outside of academia. The program is targeted at Ph.D. students in mathematics and statistics. The winter camp consists of pre-camp coursework in the basics of programming, data analysis, and optimization.

During the program, students work in small teams under the guidance of an industry mentor using a variety of streaming technologies. The mentor and camp staff will help guide the students in the modeling process, analysis, and computational work associated with a real-world industrial problem. Additional time will be spent on developing professional and networking skills, meeting industry scientists, and participating in a career fair.

Each team will be expected to make a final presentation and submit a written report at the end of the workshop.

Recent industrial sponsors included Cargill, D-Wave Systems, the Mayo Clinic, Securian Financial, and World Wide Technology.

Eligibility

Applicants must be current graduate students in a mathematical sciences Ph.D. program at a U.S. institution during the period of the boot camp.

Logistics

The program will take place online. Students will receive a $500 stipend.

Applications

To apply, please supply the following materials through the link at the top of the page:

Statement of reason for participation, career goals, and relevant experience
Unofficial transcript and evidence of good standing and full-time status
Letter of support from advisor, director of graduate studies, or department chair

Participants

Name	Department	Affiliation
Daniel Alhassan	Department of Mathematics and Statistics	Missouri University of Science and Technology
Mohamed Imad Bakhira	Department of Mathematics	The University of Iowa
Yiqing Cai		Gro Intelligence
Frankie Chan	Department of Mathematics	Purdue University
Jorge Cisneros Paz	Department of Applied Mathematics	University of Washington
Paula Dassbach		Medtronic
Jerry Dogbey-Gakpetor	Statistics	North Dakota State University
Henry Fender	Department of Data Science	ITM TwentyFirst LLC
Shihang Feng	Applied Mathematics and Plasma Physics	Los Alamos National Laboratory
Jasmine Foo	School of Mathematics	University of Minnesota, Twin Cities
Jonathan Hill		ITM TwentyFirst LLC
Thomas Hoft	Department of Mathematics	University of St. Thomas
Salomea Jankovic	Department of Mathematics	University of Minnesota, Twin Cities
Henry Kvinge		Pacific Northwest National Laboratory
Axel La Salle	School of Mathematical and Statistical Sciences	Arizona State University
Youzuo Lin	Earth and Environmental Sciences Division	Los Alamos National Laboratory
Sander Mack-Crane	Department of Mathematics	University of California, Berkeley
Maia Powell	Department of Applied Mathematics	University of California, Merced
Lee Przybylski	Mathematics	Iowa State University
Priyanka Rao	Department of Mathematics & Statistics	Washington State University
Majerle Reeves	Department of Applied Mathematics	University of California, Merced
Daniel Spirn	University of Minnesota	University of Minnesota, Twin Cities
Anna Srapionyan		Merrill Lynch
Wencel Valega Mackenzie	Department of Mathematics	University of Tennessee
Christine Vaughan	Department of Mathematics and Mechanical Engineering	Iowa State University
Elise Walker	Department of Mathematics	Texas A & M University
Max Wimberley	Department of Mathematics	University of California, Berkeley
Harrison Wong	Department of Mathematics	Purdue University
Cancan Zhang	Department of Mathematics	Northeastern University

Projects and teams

Project 1: Record Linkage: Synthesizing Expert Systems and Machine Learning

Mentor Jonathan Hill, ITM TwentyFirst LLC
Mentor Henry Fender, ITM TwentyFirst LLC
Jorge Cisneros Paz, University of Washington
Jerry Dogbey-Gakpetor, North Dakota State University
Majerle Reeves, University of California, Merced
Elise Walker, Texas A & M University
Max Wimberley, University of California, Berkeley
Harrison Wong, Purdue University

Record linkage is a common big data process where shared records in two large datasets are linked based on common fields. Longevity Holdings designed an expert system to automate record linkage between client data and a corpus of death records. This system produces scores that sort record pairs into matches and non-matches. Currently, high and low scores separate cleanly, but mid-tier scores must be manually reviewed. This led us to ask: Can machine learning improve an expert system in record linkage and reduce the size of this review set?

We are working with a variant of the Expectation Maximization (EM) algorithm following the Fellegi-Sunter approach to record linkage. We implemented this algorithm but have not found an optimal configuration for our data. The algorithm is general so we can manipulate many aspects of the input. Our priority is to determine whether there is a configuration that can improve the expert system.
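
To make the scoring idea concrete, here is a minimal sketch of Fellegi-Sunter match weights. It is not the mentors' system: the field names, the m and u probabilities, and the two thresholds are hypothetical values picked for illustration. In practice an EM procedure would estimate m and u from the data rather than fixing them by hand.

```python
import math

# Hypothetical per-field probabilities (EM would normally estimate these):
#   m = P(field agrees | the two records are a true match)
#   u = P(field agrees | the two records are not a match)
M_U = {
    "last_name": (0.95, 0.01),
    "birth_year": (0.90, 0.05),
    "state": (0.80, 0.20),
}


def match_weight(agreements, m_u=M_U):
    """Sum of log2 likelihood ratios over the compared fields."""
    total = 0.0
    for field, agrees in agreements.items():
        m, u = m_u[field]
        if agrees:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def classify(weight, lower=0.0, upper=8.0):
    """Two assumed thresholds split pairs into match, non-match, or manual review."""
    if weight >= upper:
        return "match"
    if weight <= lower:
        return "non-match"
    return "review"


pair = {"last_name": True, "birth_year": False, "state": False}
print(classify(match_weight(pair)))
```

Pairs whose weight falls between the two thresholds form the manual review set, so shrinking that band without adding errors is one way to frame the question posed above.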

EM is not the only viable approach to this problem. There is a wide range of existing methods that can be applied to record linkage. Our priority is to figure out the pros and cons of each, while trying to exceed EM and expert system performance.

On this project, you will work with real-world data and learn to organize as a team. You will deliver a whitepaper summarizing your process and results. We are most interested in your clear thinking and structured approach to this problem. We will divide into two groups, each focusing on one of the priorities above. Both groups will receive two validated sets of record pairs, one deriving from obituaries and the other from state and federal records. Our toolset will include Python, pandas, and scikit-learn.

Project 2: Data-Driven Computational Seismic Inversion

Mentor Youzuo Lin, Los Alamos National Laboratory
Mentor Shihang Feng, Los Alamos National Laboratory
Frankie Chan, Purdue University
Salomea Jankovic, University of Minnesota, Twin Cities
Sander Mack-Crane, University of California, Berkeley
Priyanka Rao, Washington State University
Christine Vaughan, Iowa State University
Cancan Zhang, Northeastern University

Computational seismic inversion turns geophysical data into actionable information. The technique has been widely used in geophysical exploration to characterize the subsurface structure. Such a clear and accurate map of the subsurface is crucial for determining the location and size of reservoirs and mineral features.

Seismic inversion usually presents itself as an inverse problem. However, solving those inverse problems has been notoriously challenging due to their ill-posed and computationally expensive nature. On the other hand, with advances in machine learning and computing, and the availability of more and better data, there has been notable progress in solving such problems. In our recent work [1, 2], we developed end-to-end data-driven subsurface imaging techniques and produced encouraging results when test data and training data share similar statistical characteristics. The high accuracy of the predictive model is built on the assumption that the training dataset captures the distribution of the target dataset. Therefore, it is critical to obtain a sufficient amount of high-quality training data.

In this project, students will work with LANL scientists to study the impact of the training data on the resulting predictive model. In particular, students will explore and develop different techniques to generate high-quality synthetic data that could be used to enhance the training data quality. Through the project, students will have the opportunity to learn deep learning and its applications in computational imaging and the fundamentals of ill-posed inverse problems.

References:

[1] Yue Wu and Youzuo Lin, “InversionNet: An Efficient and Accurate Data-driven Full Waveform Inversion,” IEEE Transactions on Computational Imaging, 6(1):419-433, 2019.

[2] Zhongping Zhang and Youzuo Lin, “Data-driven Seismic Waveform Inversion: A Study on the Robustness and Generalization,” IEEE Transactions on Geoscience and Remote Sensing, 58(10):6900-6913, 2020.

Project 3: The Impact of Climate Change on Crop Yield

Mentor Yiqing Cai, Gro Intelligence
Daniel Alhassan, Missouri University of Science and Technology
Mohamed Imad Bakhira, The University of Iowa
Axel La Salle, Arizona State University
Maia Powell, University of California, Merced
Lee Przybylski, Iowa State University
Wencel Valega Mackenzie, University of Tennessee

Gro is a data platform with comprehensive data sources related to food and agriculture. With data from Gro, stakeholders can make quicker and better decisions. In this project, the students will use data from Gro to quantify the impact of climate change on crop yield, and create visualizations to demonstrate their findings. For example, they can use long-term climate data from Gro to predict corn yield in Minnesota 100 years from now. Based on the results, they might be able to conclude that Minnesota will no longer be suitable for growing corn in 100 years, or that the areas suitable for corn will shift from the south to the north within Minnesota. Furthermore, they can scale the analysis to the whole globe, and create cool visualizations to show the results.

Data will be provided through the Gro API (Python client). For data discovery and visualizations, the students can interact with the Gro web app directly. Once they decide what data to pull from Gro, they can export a code snippet and use the API client to download the data. Data pulled from Gro are in the format of time series, which are called data series. A data series is made up of data points, each with a start and end timestamp. Different data series can come from different sources and have different frequencies. For example, projected monthly precipitation and air temperature from the GFDL B1 model are available across the whole world, all the way to the year 2100.
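
The data series structure described above can be pictured with a small sketch. The class and field names below are assumptions made for illustration, not the schema returned by the Gro API client, and the values are made up. The example shows one way to bring a monthly series to an annual frequency so it can be compared with a series from another source.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DataPoint:
    """One observation covering the period from start to end (hypothetical layout)."""
    start: date
    end: date
    value: float


def annual_mean(series):
    """Average the points of a series by the calendar year of their start date."""
    by_year = defaultdict(list)
    for point in series:
        by_year[point.start.year].append(point.value)
    return {year: sum(vals) / len(vals) for year, vals in sorted(by_year.items())}


# Made-up monthly values for a single year, standing in for a projected series.
monthly = [
    DataPoint(date(2099, m, 1), date(2099, m, 28), float(m))
    for m in range(1, 13)
]
print(annual_mean(monthly))
```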

The deliverables of this project are twofold: a Jupyter notebook (hosted on infrastructure provided by Gro) and a visual presentation of the results. The deliverable can even be a combination of the two. The Jupyter notebook should be executable end-to-end, from fetching the data from the Gro API to exporting predictions as files or as visualizations.

Concluding Remarks

David Goldberg (Purdue University), Phil Kutzko (The University of Iowa), Oscar Vega (California State University)

Plenary Conversation II

Donald Cole (University of Mississippi), David Goldberg (Purdue University), Fabrice Ulysse (University of Notre Dame), Oscar Vega (California State University)

Fields of Success - Stories from Math Alliance Alumni

Julia Anderson-Lee (The Boeing Company), Alexander Diaz-Lopez (Villanova University), April Harry (Rover.com), Anarina Murillo (Brown University), Roberto Soto (California State University), Oscar Vega (California State University)

Report of the Math Alliance Leadership

David Goldberg (Purdue University), Phil Kutzko (The University of Iowa), Kyndra Middleton (Howard University)

Plenary Conversation I

Ranthony Edmonds (The Ohio State University), Phil Kutzko (The University of Iowa), Victoria Uribe (Arizona State University)

Math-to-Industry Boot Camp V

Advisory: Application deadline is February 28, 2020

Poster

Organizers:

The Math-to-Industry Boot Camp is an intense six-week session designed to provide graduate students with training and experience that is valuable for employment outside of academia. The program is targeted at Ph.D. students in pure and applied mathematics. The boot camp consists of courses in the basics of programming, data analysis, and mathematical modeling. Students work in teams on projects and are provided with training in resume and interview preparation as well as teamwork.

There are two group projects during the session: a small-scale project designed to introduce the concept of solving open-ended problems and working in teams, and a "capstone project" that is posed by industrial scientists. Last year's industrial sponsors included Cargill, D-Wave Systems, ExxonMobil, Gro Intelligence, ITM TwentyFirst LLC, and World Wide Technology.

Weekly seminars by speakers from many industry sectors provide the students with opportunities to learn about a variety of possible future careers.

Eligibility

Applicants must be current graduate students in a Ph.D. program at a U.S. institution during the period of the boot camp.

Logistics

The program will take place at the IMA on the campus of the University of Minnesota. Students will be housed in a residence hall on campus and will receive a per diem and a travel budget, as well as an $800 stipend.

Applications

To apply, please supply the following materials through the link at the top of the page:

Statement of reason for participation, career goals, and relevant experience
Unofficial transcript and evidence of good standing and full-time status
Letter of support from advisor, director of graduate studies, or department chair

Participants

Name	Department	Affiliation
Nawaf Alansari	Department of Mathematics	The Pennsylvania State University
Gabrielle Angeloro	Department of Mathematics	Iowa State University
Skye Binegar	School of Mathematics	Georgia Institute of Technology
Nicole Bridgland		World Wide Technology
Cameron Cook	Department of Mathematics	University of Tennessee
Ryan Coopergard	Department of Mathematics	University of Minnesota, Twin Cities
Erica de la Canal	Department of Mathematics	The University of Texas at Austin
Kari Eifler	Department of Mathematics	Texas A & M University
Nazar Emirov	Department of Mathematics	University of Central Florida
Alexander Estes	Institute for Mathematics and its Applications	University of Minnesota, Twin Cities
Adeyemi Fagbade	Department of Mathematics and Statistics	University of Wyoming
Jasmine Foo	School of Mathematics	University of Minnesota, Twin Cities
Priyanga Ganesan	Department of Mathematics	Texas A & M University
Alketa Henderson		University of North Carolina, Greensboro
Thomas Hoft	Department of Mathematics	University of St. Thomas
Ruihao Huang	OCP/Division of Pharmacometrics	FDA
Yu-Li Huang	Health Care Systems Engineering	Mayo Clinic
Alicia Johnson	Department of Mathematics, Statistics, and Computer Science	Macalester College
Marshall Lagani		Securian Financial
Kevin Leder	Department of Industrial and Systems Engineering	University of Minnesota, Twin Cities
Chang Li	Department of Mathematics	University of Central Florida
Sarah Miracle	Department of Computer and Information Sciences	University of St. Thomas
Liban Mohamed	Department of Mathematics	University of Wisconsin, Madison
Dhir Patel	Department of Mathematics	The Ohio State University
Hansen Pei	Department of Mathematical Sciences	University of Delaware (Newark, DE, US)
John Portin	Department of Mathematics	University of Kansas
Nilay Shah	Kern Center for the Science of Health Care Delivery	Mayo Clinic
David Shuman	Department of Mathematics, Statistics and Computer Science	Macalester College
Daniel Spirn	University of Minnesota	University of Minnesota, Twin Cities
Yanru Su	Department of Applied and Computational Mathematics	University of Kansas
Radmir Sultamuratov	Department of Mathematics	Wayne State University
Jidong Wang	Department of Mathematics	University of Oregon
Katherine Weber	Department of Mathematics	University of Minnesota, Twin Cities
Zhimin Wu	School of Mathematical and Statistical Sciences	Arizona State University

Projects and teams

Project 1: Modeling equity-linked insurance benefits

Mentor Marshall Lagani, Securian Financial
Gabrielle Angeloro, Iowa State University
Adeyemi Fagbade, University of Wyoming
Priyanga Ganesan, Texas A & M University
Chang Li, University of Central Florida
Liban Mohamed, University of Wisconsin, Madison
Radmir Sultamuratov, Wayne State University
Jidong Wang, University of Oregon

It has become commonplace for insurance companies to offer products that link benefit guarantees to stock market indices, such as the S&P 500. Modeling the risks inherent in such a product requires a strong understanding of mathematical finance as well as significant computational resources. Derivative instruments, primarily futures, options, and swaps, can be used to hedge the liability, providing an effective mitigation of product risks.

Participants will learn about variable annuities, a common equity-linked product, as well as some of the common derivative instruments used to hedge the risks in these products. We will explore some of the techniques used to model the liabilities they generate and develop methods to create proxy models, allowing us to monitor risks and rebalance hedge positions intraday as the markets move in between model runs. This project assumes little to no background in mathematical finance and should be of interest to participants who are interested in computational statistics, quantitative finance, and Python.

Project 2: Optimizing warehouse operations

Mentor Nicole Bridgland, World Wide Technology
Cameron Cook, University of Tennessee
Erica de la Canal, The University of Texas at Austin
Kari Eifler, Texas A & M University
Nazar Emirov, University of Central Florida
Hansen Pei, University of Delaware (Newark, DE, US)
John Portin, University of Kansas
Katherine Weber, University of Minnesota, Twin Cities

Supply chain operations motivate many data science and optimization problems. From a demand and pricing perspective, one might ask: how much of item X do we anticipate selling? How much do we expect to pay for it, depending on when we buy it? From a storage and operations perspective, one might ask how we best store it in warehouses to get it to where it's going. Do we have enough warehouse space for all the stuff we will need to store in the near future? What are the error bars on that space usage estimate? There are plenty of questions from a purely operational perspective as well. For example, in a busy warehouse, forklift traffic can cause significant slowdowns. A forklift at one load or drop-off location may block access to several locations in the warehouse. Forklifts waiting to enter one row could block the major paths through the warehouse. This project is directed at optimizing internal warehouse transit operations, through any of storage location choices, job scheduling, or pathing choices.

Project 3: Bone marrow transplant process modeling and optimization

Mentor Yu-Li Huang, Mayo Clinic
Nawaf Alansari, The Pennsylvania State University
Skye Binegar, Georgia Institute of Technology
Ryan Coopergard, University of Minnesota, Twin Cities
Alketa Henderson, University of North Carolina, Greensboro
Dhir Patel, The Ohio State University
Yanru Su, University of Kansas
Zhimin Wu, Arizona State University

Bone Marrow Transplant (BMT) is an effective treatment for many hematological malignancies. This modality has become integral to the management of many patients, resulting in a dramatic increase in the volume of patients undergoing the procedure. The volume of patients coming for transplant (about 500 patients undergo this highly complex procedure annually at Mayo Clinic Rochester) has progressively increased over the past decade, leading to many innovative solutions to adapt to this challenge. Over the past two decades, the infrastructure has been developed to allow a majority of patients to undergo many components of the procedure as an outpatient visit, despite the highly complex nature of the patients and the associated risk of complications. Ultimately, we have reached maximum safe capacity with our current workflow. This has posed major stresses on many areas, including patient scheduling, stem cell collection, outpatient visits, the human cellular therapy laboratory, the hospital-based outpatient facility, and the inpatient facility. The BMT practice has recently implemented a predictive model for stem cell collections. This model is expected to increase capacity by 20% with the same resources. The practice also adopted a pre-scheduling concept to plan the entire patient transplant itinerary, from stem cell collections and pre-chemo visits to chemo treatment and stem cell infusion. There are uncertainties in all three stages due to patient conditions, resource constraints, and process complexity. This short-term project will focus on modeling and optimizing the stochastic nature of these three stages, which could potentially provide recommendations for scheduling policy and resource planning.

Math-to-Industry Boot Camp IV

Advisory: Extended application deadline is March 22, 2019

Poster

Organizers:

Benjamin Brubaker, University of Minnesota, Twin Cities
Fadil Santosa, University of Minnesota, Twin Cities
Daniel Spirn, University of Minnesota, Twin Cities

There are two group projects during the session: a small-scale project designed to introduce the concept of solving open-ended problems and working in teams, and a "capstone project" that is posed by industrial scientists. Last year's industrial sponsors included 3M, D-Wave Systems, Milwaukee Brewers, National Security Technologies, Schlumberger-Doll Research, and Whitebox Advisors.

Weekly seminars by speakers from many industry sectors provide the students with opportunities to learn about a variety of possible future careers.

Eligibility

Applicants must be current graduate students in a Ph.D. program at a U.S. institution during the period of the boot camp.

Logistics

Applications

To apply, please supply the following materials through the link at the top of the page:

Statement of reason for participation, career goals, and relevant experience
Unofficial transcript and evidence of good standing and full-time status
Letter of support from advisor, director of graduate studies, or department chair

Participants

Name	Department	Affiliation
Jesse Berwald		D-Wave Systems
Nicole Bridgland		World Wide Technology
Benjamin Brubaker	School of Mathematics	University of Minnesota, Twin Cities
Yiqing Cai		Gro Intelligence
Sarah Chehade	Department of Mathematics	University of Houston
Brendan Cook		University of Minnesota, Twin Cities
William Cooper	Department of Mechanical Engineering	University of Minnesota, Twin Cities
Steven Dabelow	Department of Applied and Computational Mathematics and Statistics	University of Notre Dame
Davood Damircheli	Department of Mathematics and Statistics	Mississippi State University
Dilek Erkmen	Department of Mathematical Sciences	Michigan Technological University
Jonathan Hahn		World Wide Technology
Jordyn Harriger	Department of Mathematics	Indiana University
Brad Hildebrand		Cargill, Inc.
Jonathan Hill		ITM TwentyFirst LLC
Thomas Hoft	Department of Mathematics	University of St. Thomas
SeongHee Jeong		Louisiana State University
Michael Johnson	Strategic Marketing and Portfolio Division	Cargill, Inc.
Kiwon Lee	Department of Mathematics	The Ohio State University
Xing Ling	Department of Mathematical Sciences	Michigan Technological University
Sijing Liu	Department of Mathematics	Louisiana State University
Kevin Marshall	Department of Mathematics	University of Kansas
Kristina Martin	Department of Supervision, Regulation, and Credit	Federal Reserve Bank of Minneapolis
Vikenty Mikheev	Department of Mathematics	Kansas State University
Sarah Milstein		University of Minnesota, Twin Cities
Sarah Miracle	Department of Computer and Information Sciences	University of St. Thomas
Bibekananda Mishra	Department of Mathematics	University of Kansas
Whitney Moore	Career Center for Science and Engineering	University of Minnesota, Twin Cities
Anthony Nguyen	Department of Mathematics	University of California, Davis
Damilola Olabode	Department of Mathematics and Statistics	Washington State University
Negar Orangi-Fard	Department of Mathematics	Kansas State University
Samantha Pinella	Department of Mathematics	University of Michigan
Michelle Pinharry	School of Mathematics	University of Minnesota, Twin Cities
Puttipong Pongtanapaisan	Department of Mathematics	The University of Iowa
Matthew (Jake) Roberts	Department of Mathematical Sciences	Michigan Technological University
Jose Pedro Rodriguez Ayllon	Department of Mathematics	University of Houston
Nandita Sahajpal	Department of Mathematics	University of Kentucky
Fadil Santosa	School of Mathematics	University of Minnesota, Twin Cities
Samantha Schumacher	Department of Data Science & Analysis	Target Corporation
Olabanji Shonibare		Starkey Hearing Technologies
David Shuman	Department of Mathematics, Statistics and Computer Science	Macalester College
Matthew Sikkink Johnson	Department of Mathematics	University of Minnesota, Twin Cities
Daniel Spirn	University of Minnesota	University of Minnesota, Twin Cities
Rebeccah Stay		Cargill, Inc.
Ben Strasser	Department of Mathematics	University of Minnesota, Twin Cities
Rahim Taghikhani	School of Mathematics and Statistics	Arizona State University
Zeinab Takbiri	Department of Engineering R&D and Data Science	Cargill, Inc.
Tianyu Tao	Department of Mathematics	University of Minnesota, Twin Cities
Jing Wang		Thrivent Financial
Nathan Willis	Department of Mathematics	The University of Utah
Guanglin Xu	Institute for Mathematics and its Applications	University of Minnesota, Twin Cities
Yanhua Yuan		ExxonMobil
Christina Zhao		University of Minnesota, Twin Cities
Li Zhu	Department of Mathematical Sciences	University of Nevada

Projects and teams

Project 1: Rail car supply forecasting

Mentor Zeinab Takbiri, Cargill, Inc.
Sijing Liu, Louisiana State University
Damilola Olabode, Washington State University
Puttipong Pongtanapaisan, The University of Iowa
Nathan Willis, The University of Utah

Cargill is a major grain trader in the US. We utilize over 100,000 rail cars per year to ship grains to our domestic and export customers. Cargill uses railroad-supplied cars to move many of these shipments. The railroads require us to take on an obligation to run their cars for a year. We are looking for help in developing a supply and demand model that can determine how many cars Cargill should take on in a given year, as well as a forecast of the overall market’s need for railroad-owned equipment.

Project 2: Accuracy of a simple freeze-out model as a description of the QPU distribution for C4 RAN1 problems

Mentor Jesse Berwald, D-Wave Systems
Sarah Chehade, University of Houston
Davood Damircheli, Mississippi State University
Kevin Marshall, University of Kansas
Li Zhu, University of Nevada

A quantum processing unit (QPU) is a programmable chip that leverages superposition and entanglement, fundamental quantum mechanical properties, to solve problems. The D-Wave quantum annealing computer currently operates with a 2048-qubit QPU. Calibrating such a chip in the presence of thermal, quantum mechanical, and design-specific noise is a critical component to producing a working quantum computer.

D-Wave Systems has developed many internal calibration tests to infer anomalies observed in the QPU. Error correction on many levels is used to mitigate these anomalies wherever possible (though thermal and quantum fluctuations will always be present). The variety of tests often requires different models and statistical methods. This project looks at a test of a specific configuration of randomly coupled qubits (C4 RAN1). Students will implement and fit a model based on observations from the QPU. A significant part of the pipeline will include a visualization component to enable easy, and deeper, analysis of anomalies if they are present.

Project 3: Improving Mine Dispatching

Mentor Nicole Bridgland, World Wide Technology
Mentor Jonathan Hahn, World Wide Technology
Steven Dabelow, University of Notre Dame
Jordyn Harriger, Indiana University
SeongHee Jeong, Louisiana State University
Kiwon Lee, The Ohio State University

Mines have lots of moving parts, and timing of delivery between them is crucial. Time that mining equipment spends idle represents lost production opportunity. Time trucks spend idle, while not as obviously problematic, represents at least wasted fuel if not lost production opportunity elsewhere in the mine. Given a system of several shovels and crushers, and trucks moving material between them, how can you best decide where to send empty/loaded trucks as they become available? When equipment experiences delays, when should you reroute trucks vs simply wait it out, and how should you reroute them? The goal of this project will be to develop tools to help human dispatchers make these decisions, possibly in the form of machine-generated recommendations.

Project 4: Analogous year detection

Mentor Yiqing Cai, Gro Intelligence
Xing Ling, Michigan Technological University
Ben Strasser, University of Minnesota, Twin Cities
Rahim Taghikhani, Arizona State University
Tianyu Tao, University of Minnesota, Twin Cities

Gro is a data platform with comprehensive data sources related to food and agriculture. With data from Gro, stakeholders can make quicker and better decisions, which in most cases are time sensitive. In this project, the students will use data from Gro to identify analogous events. For example, people can compare and find a year with similar precipitation and soil moisture patterns to draw inferences about second and third order effects such as flooding or decreased crop planted area. This type of analysis can help quantify the impact of an event, and remedy the negative impact if it is severe and not avoidable.

Data will be provided through the Gro API. Data pulled from Gro are in the format of time series, which are called data series. Different data series can come from different sources and have different frequencies. For example, there is daily precipitation data from TRMM, and NDVI (a type of vegetation index) at a frequency of 8 days from GIMMS MODIS.

Goals: The deliverables of this project will be in the form of an executable model. Given a data series (or a set of data series), and a selected time period, find analogous periods in history that are most similar to this selected period. Given the project goal, it all boils down to defining similarity between a pair of data series, or concatenated data series.
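One simple way to define similarity between a selected period and historical periods is sketched below. This is only an illustration under stated assumptions, not the project's actual method: it uses a single synthetic series in place of real Gro data, compares windows by z-normalized Euclidean distance, and the function names `znorm` and `analogous_periods` are hypothetical.

```python
import numpy as np

def znorm(x):
    """Scale a window to zero mean and unit variance (constant windows map to zeros)."""
    x = np.asarray(x, dtype=float)
    std = x.std()
    return (x - x.mean()) / std if std > 0 else np.zeros_like(x)

def analogous_periods(series, start, length, top_k=3):
    """Rank historical windows by z-normalized Euclidean distance to the query window.

    Returns (start_index, distance) pairs, nearest first, skipping any window
    that overlaps the selected period.
    """
    series = np.asarray(series, dtype=float)
    query = znorm(series[start:start + length])
    scores = []
    for s in range(len(series) - length + 1):
        if abs(s - start) < length:  # window overlaps the selected period
            continue
        dist = np.linalg.norm(znorm(series[s:s + length]) - query)
        scores.append((s, float(dist)))
    return sorted(scores, key=lambda pair: pair[1])[:top_k]

# Synthetic stand-in for a data series (an assumption): a 12-step seasonal
# cycle plus a slow linear trend.
t = np.arange(60)
series = np.sin(2 * np.pi * t / 12) + 0.01 * t

# Find the three historical 12-step windows most similar to the last one.
matches = analogous_periods(series, start=48, length=12, top_k=3)
```

Z-normalizing each window makes the comparison about the shape of the pattern rather than its level or scale. Series with different frequencies, such as daily and 8-day data, would first need to be resampled to a common grid, which this sketch does not attempt.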

Project 5: Deblending simultaneous-source seismic signals

Mentor Yanhua Yuan, ExxonMobil
Dilek Erkmen, Michigan Technological University
Anthony Nguyen, University of California, Davis
Samantha Pinella, University of Michigan
Jose Pedro Rodriguez Ayllon, University of Houston
Nandita Sahajpal, University of Kentucky

Acquisition of seismic data in a marine environment is a costly process. Traditionally, in marine seismic surveys, a boat tows a line of receivers while moving slowly. To obtain signals at the receivers, a wave source, typically an air gun, generates a pulse with frequencies in the tens of Hz, which penetrates the earth and reflects back from the different layers of the earth. Recently, an innovation in this space was introduced that has been shown to yield substantial savings and to allow for wider distances between the source and the receivers. In the new method, more than one seismic source or air gun is fired with short or zero delays between them, so that the signals generated by the sources overlap at some or all receivers. The collected signals at the receivers are therefore blended together in simultaneous-source acquisition, and a “deblending” process is usually needed to separate signals from the individual sources before any further analysis. To make decoding easier, multiple sources are usually fired at random times and/or with differently coded signatures. Based on the incoherence assumption, the deblending problem can be explored in different ways, including as a signal processing problem, an inversion problem, or a data analytics problem. In this project, we will try these methods and look for a robust deblending algorithm to reconstruct individual source signals from encoded data.

Project 6: Accuracy and precision of Time-to-Event Models with Flexible Dimensionality

Mentor Jonathan Hill, ITM TwentyFirst LLC
Brendan Cook, University of Minnesota, Twin Cities
Vikenty Mikheev, Kansas State University
Bibekananda Mishra, University of Kansas
Negar Orangi-Fard, Kansas State University
Matthew (Jake) Roberts, Michigan Technological University

Medical underwriting is expensive and time-consuming, involving trained underwriters who manually review medical history and long delays waiting for documentation. For these reasons, researchers in life insurance and related industries are fervently searching for methods to estimate mortality risk faster and at lower cost.

One proposed solution is to use a smaller set of medical features than what is typically collected in underwriting. These features could be collected through a questionnaire and used to generate a rapid estimate of mortality risk. This solution could have additional value in cases of full underwriting where some medical data is missing. A key objective will be quantifying the increase in uncertainty, or decrease in precision, as a consequence of using a smaller feature set.

During this week-long project, you will take a crash course in survival analysis, explore models for time-to-event data (including traditional and machine learning approaches), determine appropriate metrics, engineer features, and compete to create the best possible model of mortality risk. If time allows, there may be an opportunity to develop novel modelling techniques.

We will be using a unique world-class dataset on senior life outcomes provided by ITM TwentyFirst, a Minneapolis-based life settlements servicing company.

Math-to-Industry Boot Camp III

Advisory: Application deadline is February 28, 2018

Organizers:

Benjamin Brubaker, University of Minnesota, Twin Cities
Fadil Santosa, University of Minnesota, Twin Cities
Daniel Spirn, University of Minnesota, Twin Cities

Weekly seminars by industrial scientists provide the students with opportunities to learn about a variety of possible future careers.

Eligibility

Applicants must be current graduate students in a Ph.D. program at a U.S. institution during the period of the boot camp.

Logistics

Applications

To apply, please supply the following materials through the link at the top of the page:

Statement of reason for participation, career goals, and relevant experience
Unofficial transcript and evidence of good standing and full-time status
Letter of support from advisor, director of graduate studies, or department chair

Participants

Name	Department	Affiliation
Muhammad Afridi		3M
Nicholas Asendorf		3M
Christopher Bemis		Whitebox Advisors
Nitsan Ben-Gal	Software, Electronics and Mechanical Systems Laboratory	3M
Jesse Berwald		D-Wave Systems
Ariel Bowman	Department of Mathematics	University of Texas at Arlington
Chris Browne	Center for Applied Mathematics	Cornell University
Benjamin Brubaker	School of Mathematics	University of Minnesota, Twin Cities
Kate Brubaker	Department of Mathematics	Purdue University
Irfan Bulu	Department of Math and Modeling	Schlumberger-Doll Research
Shawn Burkett	Mathematics	University of Colorado
Olivia Cannon	Department of Mathematics	University of Minnesota, Twin Cities
Jared Catenacci	Diagnostic Research and Material Studies	National Security Technologies, LLC
Chirasree Chatterjee	Department of Mathematics and Statistics	Saint Louis University
Hua Chen	Department of Mathematical Sciences	University of Delaware
Aaron Cohen	Department of Mathematics	Indiana University
Paula Dassbach		Medtronic
Mingchang Ding	Department of Mathematical Sciences	University of Delaware
Jasmine Foo	School of Mathematics	University of Minnesota, Twin Cities
Zhen Gao	Department of Mathematics	Vanderbilt University
Maria Gommel	Department of Mathematics	The University of Iowa
Hayley Guy	School of Mathematics	North Carolina State University
Qie He	Department of Industrial and Systems Engineering	University of Minnesota, Twin Cities
Thomas Hoft	Department of Mathematics	University of St. Thomas
Ruihao Huang	Department of Mathematical Sciences	Michigan Technological University
Jeffrey Humpherys		UnitedHealth Group
Laura Iosip	Department of Mathematics	University of Maryland
Melanie Jensen	Department of Mathematics	Tulane University
Alicia Johnson		Macalester College
Ekaterina Kryuchkova	Center for Applied Mathematics	Cornell University
Kevin Leder	Department of Industrial and Systems Engineering	University of Minnesota, Twin Cities
Philku Lee	Department of Mathematics and Statistics	Mississippi State University
SangJoon Lee	Department of Mathematics	University of Connecticut
Hengguang Li	Department of Mathematics	Wayne State University
Aaron Luttman	Diagnostic Research and Material Studies	National Security Technologies, LLC
Christopher Miller	School of Mathematics	University of California, Berkeley
Cristian Minoccheri	Department of Mathematics	State University of New York, Stony Brook (SUNY)
Sarah Miracle	Department of Computer and Information Sciences	University of St. Thomas
Shannon Negaard-Paper		University of Minnesota, Twin Cities
Elpiniki Nikolopoulou	Department of Applied Mathematics and Statistics	Arizona State University
Michelle Pinharry	School of Mathematics	University of Minnesota, Twin Cities
Iurii Posukhovskyi	Department of Mathematics	University of Kansas
Mrinal Raghupathi	USAA Asset Management Company	USAA Asset Management Company
Michael Ramsey	Department of Applied Mathematics	University of Colorado
Eric Roberts	Department of Applied Mathematics	University of California, Merced
Tanushree Roy	School of Mathematics	University of Central Florida
Keith Rush	Department of Strategy and Analytics	Milwaukee Brewers
Fadil Santosa	School of Mathematics	University of Minnesota, Twin Cities
Chang Shu	Department of Applied Mathematics	University of California, Davis
Dallas Smith	School of Mathematics	Brigham Young University
Alberto Speranzon	Aerospace	Honeywell
Daniel Spirn	University of Minnesota	University of Minnesota, Twin Cities
Binh Tang	Department of Statistical Science	Cornell University
Elizabeth Wicks	School of Mathematics	University of Washington
Shiqiang Xia		University of Minnesota, Twin Cities
Di Ye		Zhennovate
Yufei Yu	Department of Mathematics	University of Kansas
Sheng Zhang	Department of Mathematics	Purdue University

Projects and teams

Team 1: Mathematical Models for Adaptive Multi-modal Sensing

Mentor Aaron Luttman, National Security Technologies, LLC
Mentor Jared Catenacci, National Security Technologies, LLC
Ariel Bowman, University of Texas at Arlington
Shawn Burkett, University of Colorado
Hayley Guy, North Carolina State University
Laura Iosip, University of Maryland
Yufei Yu, University of Kansas
Sheng Zhang, Purdue University

Scientific experiments are a natural source of data – which usually means diagnostic systems fielded to collect information within the experiments themselves – but there has been a recent trend towards collecting data around big science experiments to understand if we can detect and characterize the behaviors associated with the experiments. The question is whether it is possible to determine what experiments are being conducted by analyzing human patterns, so-called “patterns of life,” around and in the experimental facilities. In order to measure patterns of life, we analyze many different types of data, from power grid load profiles to internet activity to sound and pressure signals from cars.

There are two primary challenges that must be addressed:

Mathematical Models for Adaptive Sensing – When should a sensor system turn on its sensors and transmit its data, given that these two activities take a lot of power?

Physics-based Multi-modal Feature Selection and Detection – How can one incorporate physics models for sensing into machine learning approaches to data analysis?

Real multi-sensor data will be provided for testing and validation.

Team 2: Quantum Computation and QUBO Slicing

Mentor Jesse Berwald, D-Wave Systems
Olivia Cannon, University of Minnesota, Twin Cities
Tanushree Roy, University of Central Florida
Chang Shu, University of California, Davis
Dallas Smith, Brigham Young University
Elizabeth Wicks, University of Washington

Background

Quantum annealing computers have begun to enter the business and academic worlds. Over the past five years they have been used for a wide variety of (prototypical) applications, with evidence of differentiated performance in some cases.

A first step in utilizing these computers is to reformulate the problem in an energy minimization framework. This is typically cast as a Hamiltonian, or alternatively as a quadratic unconstrained binary optimization (QUBO), which can be represented as a matrix. These formulations are translated to the physical qubits on the quantum processing unit (QPU) through a process termed “embedding”. Embedding a given problem onto the QPU is an active area of research in itself and is handled through a number of different heuristics, one of which is described below.
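To make the QUBO formulation concrete, the following minimal sketch evaluates the energy x^T Q x of a binary vector x for a small matrix Q and finds the minimizer by exhaustive search. It illustrates only the problem format, not embedding or quantum annealing; the matrix values and the helper names `qubo_energy` and `brute_force_qubo` are made up for illustration.

```python
import itertools
import numpy as np

def qubo_energy(Q, x):
    """Energy of a binary assignment x under the QUBO matrix Q: x^T Q x."""
    x = np.asarray(x)
    return float(x @ Q @ x)

def brute_force_qubo(Q):
    """Exhaustively search all 2^n binary vectors (only feasible for tiny n)."""
    n = Q.shape[0]
    best_x, best_e = None, float("inf")
    for bits in itertools.product((0, 1), repeat=n):
        e = qubo_energy(Q, bits)
        if e < best_e:
            best_x, best_e = bits, e
    return best_x, best_e

# Illustrative upper-triangular QUBO: the diagonal rewards setting a variable
# to 1, and the off-diagonal entries penalize setting neighbors to 1 together.
Q = np.array([[-1.0, 2.0, 0.0],
              [0.0, -1.0, 2.0],
              [0.0, 0.0, -1.0]])

best_x, best_e = brute_force_qubo(Q)
print(best_x, best_e)  # (1, 0, 1) -2.0
```

Exhaustive search scales as 2^n, which is why larger problems are handed to heuristic or hardware solvers.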

Problem statement

In this project we will investigate one proposed solution to the embedding problem:

The goal is to make the most efficient use of the qubit hardware by developing a parameterized transformation from the space spanned by physical qubits, “qubit space”, to the space spanned by problem variables, the “problem search space”. Our goal will be to define a linear transformation from qubit space to problem search space that allows for a more efficient use of available hardware.

Since the problem space is (in general) much larger than the qubit space, a fixed parameterization will succeed in mapping the qubit space into a proper subspace of the problem space. We term these subspaces “slices”. This reduced problem can then be solved with an optimal use of the available hardware. Using different parameterizations, we can define a series of linear transformations onto orthogonal subspaces of the problem space.

There are many parameterizations to choose from, each of which raises a number of research questions. We will prioritize our investigation roughly as follows:

Given a QUBO matrix defining the problem search space, is there an algorithm that produces the most efficient set of transformations (parameterizations) from qubit space to problem space?
Is there a greedy algorithm that is best in practice, i.e., one that chooses a slice that maximizes the use of the chip and then chooses successively smaller slices to query the entire search space?
What is the role of sparsity in the choice of transformations?
The QPU itself has a unique architecture. How does this architecture affect the choice of transformations?

References

Traffic flow optimization using a quantum annealer: https://arxiv.org/pdf/1708.01625.pdf
A NASA Perspective on Quantum Computing: Opportunities and Challenges: https://arxiv.org/pdf/1704.04836.pdf

Team 3: Time Series Analysis of Gas Mixture Data

Mentor Nicholas Asendorf, 3M
Kate Brubaker, Purdue University
Ruihao Huang, Michigan Technological University
Philku Lee, Mississippi State University
Elpiniki Nikolopoulou, Arizona State University
Michelle Pinharry, University of Minnesota, Twin Cities

Motivation

Sensor networks are ubiquitous in today’s Internet of Things, capable of collecting high-frequency data in a cost-efficient way. This results in mountains of time-series data that hopefully contain signals of interest buried in noise. As the number of deployed sensors grows, so does the dimensionality of the observed data, further increasing the complexity of the problem. 3M is interested in such large-scale time series analyses because many of our datasets can be framed in this way: manufacturing, sales, and chemical experiments, to name a few.

Dataset

This publicly available dataset contains time series readings from chemical sensors over a period of 12 hours. The inputs to these sensors are known concentrations of various gases. The dataset contains timestamped measurements from 16 gas sensors and the input concentrations of the gases. This is a labeled time series dataset. There are two different gas mixture measurement files, one for Ethylene and CO, and one for Ethylene and Methane. At 3M, we may have similar types of experimental data (perhaps using different sensors) where we would like to determine the interactions between materials or understand fundamental properties of materials. Being able to intelligently and efficiently mine these rich datasets for insights about material characteristics is critical.

The Challenge

Some interesting problems to consider:

Develop an algorithm to estimate the concentration of each gas given sensor measurements. You might approach this problem using classical machine learning, splitting data into training, validation, and testing, while treating time series measurements as independent points.
Develop algorithms to estimate the concentrations of each gas using time series based methods like windowing, tsfresh, or RNNs. In this approach, we don’t want to treat each measurement as independent. How do these algorithms compare to classical machine learning techniques?
Can you use the fact that we have 4 replicates of each sensor at each time point to improve your algorithms? Can you use any clever data fusion techniques or outlier detection strategies?
What can you tell about the importance or accuracy of the 4 types of sensors used?
What happens when we purposely introduce missing data? Can we use the replicates of each sensor to overcome this? How robust are your algorithms to missing data?
Since each dataset has measurements for Ethylene, can we use both datasets to develop a more robust estimation scheme for that gas?
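The first problem above, estimating concentrations while treating measurements as independent points, can be sketched as a baseline with ordinary least squares. This is a rough illustration rather than a recommended solution: the data below are synthetic, generated under the assumption of a linear sensor response, and every variable name is hypothetical. Real sensor responses may be nonlinear and drift over time.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the real dataset (an assumption): 16 sensors whose
# readings depend linearly on two gas concentrations, plus measurement noise.
n_samples, n_sensors, n_gases = 500, 16, 2
concentrations = rng.uniform(0.0, 10.0, size=(n_samples, n_gases))
response = rng.normal(size=(n_gases, n_sensors))
noise = 0.05 * rng.normal(size=(n_samples, n_sensors))
readings = concentrations @ response + noise

# Chronological split: the first 80% for training, the rest for testing.
split = int(0.8 * n_samples)
X_train, X_test = readings[:split], readings[split:]
y_train, y_test = concentrations[:split], concentrations[split:]

def add_bias(X):
    """Append a column of ones so the linear model can learn an offset."""
    return np.hstack([X, np.ones((X.shape[0], 1))])

# Fit one linear map from the 16 sensor readings to both concentrations.
weights, *_ = np.linalg.lstsq(add_bias(X_train), y_train, rcond=None)
y_pred = add_bias(X_test) @ weights

# Root-mean-square error per gas on the held-out data.
rmse = np.sqrt(np.mean((y_pred - y_test) ** 2, axis=0))
```

A baseline like this gives a reference error against which the windowed or recurrent time series methods in the second problem can be compared.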

Team 4: Structured Variational Auto Encoders

Mentor Irfan Bulu, Schlumberger-Doll Research
Hua Chen, University of Delaware
Aaron Cohen, Indiana University
Mingchang Ding, University of Delaware
Melanie Jensen, Tulane University
Christopher Miller, University of California, Berkeley
Michael Ramsey, University of Colorado

Generative models such as Variational Auto Encoders (VAEs) and Generative Adversarial Networks (GANs) have been very successful in unsupervised learning settings. In a VAE setting, we would like to learn a set of latent variables that explain our data. Although this has been very successful as a generative model, the interpretation of latent variables is still a challenge. Ideally, what we would like to do is unsupervised learning through which we identify a number of classes (not specified in advance). Once a set of classes has been identified, we can then label each class once instead of having to label the entire data set. Imagine you have a sample of handwritten digits without labels. If we can structure a VAE in a way that it can identify 10 classes, we can then label these classes as the relevant digits. This would be very helpful as most of our data is unlabeled or poorly labeled.

Concepts that may be helpful to know: neural network, generative models, graphical models, stochastic variational inference.

Team 5: Tailored Discovery in Stock Portfolios

Mentor Christopher Bemis, Whitebox Advisors
Chirasree Chatterjee, Saint Louis University
Zhen Gao, Vanderbilt University
Cristian Minoccheri, State University of New York, Stony Brook (SUNY)
Shannon Negaard-Paper, University of Minnesota, Twin Cities
Shiqiang Xia, University of Minnesota, Twin Cities

Modern portfolio theory has provided tools to identify systemic and idiosyncratic risks via models like Markowitz' Mean-Variance Optimization. In addition, a taxonomy of equities has emerged through feature identification, with one of the earliest and most impactful being Fama and French's three-factor model.

In this project, we will leverage technical and fundamental data like return series and earnings information along with well understood equity features like exposure to so-called size, value, and market portfolios to develop tools for suggesting supplements (e.g., technology stocks when looking at Apple) and complements (e.g., energy stocks when looking at Delta Airlines) for individual equities and portfolios. These tools may be used in tailored discovery and research by analysts looking to either construct a portfolio based on a theme or to diversify. The work will ideally evolve from point estimates using simple norms in a predetermined feature space to applying machine learning techniques.

Data will be supplied from Quandl, and the preferred language for development will be Python.
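The starting point mentioned above, point estimates using simple norms in a predetermined feature space, might look like the sketch below. It is an assumption-laden illustration: the tickers and factor exposures are invented placeholders rather than Quandl data, and the function `nearest_equities` is hypothetical. It ranks equities by Euclidean distance after standardizing each feature, which is one possible way to suggest similar names; finding complements would need a different criterion that is not attempted here.

```python
import numpy as np

# Placeholder tickers and made-up exposures to size, value, and market factors.
tickers = ["STOCK_A", "STOCK_B", "STOCK_C", "STOCK_D"]
features = np.array([
    [1.2, -0.5, 1.1],
    [1.1, -0.4, 1.0],
    [-0.8, 0.9, 0.7],
    [0.1, 0.2, 1.4],
])

def nearest_equities(features, idx, k=1):
    """Indices of the k equities closest to equity idx in standardized feature space."""
    z = (features - features.mean(axis=0)) / features.std(axis=0)
    dist = np.linalg.norm(z - z[idx], axis=1)
    dist[idx] = np.inf  # exclude the equity itself
    return np.argsort(dist)[:k]

closest = nearest_equities(features, idx=0, k=1)
print(tickers[closest[0]])  # STOCK_B
```

Standardizing each column keeps one factor with a large numeric range from dominating the distance.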

Team 6: Sequence-to-sequence modeling for the business of baseball

Mentor Keith Rush, Milwaukee Brewers
Maria Gommel, The University of Iowa
Ekaterina Kryuchkova, Cornell University
SangJoon Lee, University of Connecticut
Iurii Posukhovskyi, University of Kansas
Eric Roberts, University of California, Merced

Each fan has a unique relationship to his or her favorite sports teams, and each has a different ideal every time they step into the stadium. When a team makes a big free-agent signing in February, the fan who follows the competition closely will be ecstatic; the fan who primarily enjoys the communal aspects will only see this effect in the buzz generated in his or her social circles. In order to cherish their fans to the utmost, teams must have a global view of their business and be able to structure data from all sources and across all levels of granularity, creating one universe into which all inputs feed and from which all outputs flow.

This project is fundamentally a first step in that direction. The problem we are focusing on is roughly the following: conditioned on a vector representing a fan's history with the Club and the attributes of a particular game, how well can we ingest information in time and map it forward one time step? For this purpose, we will test the standard recurrent and convolutional network architectures, experiment with variants, and discuss the reasons for applying each and their limitations. Data will be provided by the Brewers, and the development will take place in Python, utilizing cloud infrastructure for the computing power.