Web: http://www.ima.umn.edu | Email: ima-staff@ima.umn.edu | Telephone: (612) 624-6066 | Fax: (612) 626-7370

### March 2007

2006-2007 Program

### Applications of Algebraic Geometry

See http://www.ima.umn.edu/2006-2007 for a full description of the 2006-2007 program on Applications of Algebraic Geometry.

News and Notes

New Directions Research Professors

The IMA is pleased to announce that the New Directions Research Professors for the 2007-2008 thematic program, Mathematics of Molecular and Cellular Biology, will be Tiefeng Jiang, Department of Statistics, University of Minnesota; Debra Knisley, Department of Mathematics, East Tennessee State University; and Zhijun Wu, Department of Mathematics, University of Iowa. New Directions Research Professorships provide an extraordinary opportunity for established mathematicians, typically mid-career faculty at US universities, to spend the academic year at the IMA. The visiting Professors will enjoy an excellent research environment and stimulating scientific program, learning about and contributing to exciting new developments in mathematical biology and a broad range of applications.

New publication on mixed integer programming

A special issue of the journal Discrete Optimization based on the IMA workshop Mixed-Integer Programming, July 25-29, 2005, has just appeared.

Summer programs

This summer the IMA will offer four exciting programs. One is intended for mid-career mathematicians interested in exploring a new research direction, and two are exclusively for graduate students.

The 2007 IMA Summer Program Classical and Quantum Approaches in Molecular Modeling, July 23-August 3, 2007, is devoted to computational molecular modeling both via classical approaches (emphasized in week 1) and quantum approaches (emphasized in week 2). Both weeks will include talks at the introductory level. The lectures during the first week will be given by Giovanni Ciccotti, Ben Leimkuhler, Robert Skeel, and Mark Tuckerman, while the lectures during the second week will be given by Eric Cancès and Youssef Saad.

The New Directions Short Course Compressive Sampling and Frontiers in Signal Processing, June 4-15, 2007, will be taught by Emmanuel Candes, Ron DeVore, and Rich Baraniuk. This rapidly growing area of research has connections to many areas of mathematics and presents many research opportunities. The IMA is currently accepting applications for this Short Course. No prior background in signal processing is expected. Participants will receive full travel and lodging support during the workshop. Participation is by application only and will be limited to 25 mid-career researchers in the mathematical sciences. The application deadline is April 1.

PI Summer Program for Graduate Students Applicable Algebraic Geometry will be held at Texas A&M University in College Station, Texas, July 23-August 10, 2007. This program is open only to graduate students from IMA Participating Institutions. In order to participate, students need to fill out the application form and need to be nominated by their department chair by April 15, 2007. The course will concentrate on Applicable Algebraic Geometry. If you know of a graduate student at a Participating Institution who would benefit from the program, please encourage them to contact their department chair.

Mathematical Modeling in Industry XI - A Workshop for Graduate Students, August 8-17, 2007, offers graduate students and qualified advanced undergraduates first-hand experience in industrial research. Teams of up to six students will work under the guidance of a mentor from industry on the modeling and analysis of real-world industrial problems. This year's mentors are Natalia M. Alexandrov (NASA), Radu Balan (Siemens), Gary B. Green (The Aerospace Corporation), John R. Hoffman (Lockheed Martin), Mark A. Stuff (General Dynamics), and Lisa Zhang (Lucent Technologies, Bell Laboratories). The application deadline is April 15.

IMA is seeking a new director: The IMA is looking for a new director to begin in summer 2008.

IMA Events

### Applications in Biology, Dynamics, and Statistics

#### March 5-9, 2007

Organizers: Lior Pachter (University of California), Seth Sullivant (Harvard University)

Schedule

## Thursday, March 1

| Time | Event | Speaker | Location | Event code |
|---|---|---|---|---|
| 9:45a-10:45a | Algebraic statistics learning seminar | | Lind Hall 409 | |
| 11:15a-12:15p | Real algebraic geometry tutorial: Real rings | Kenneth R. Driessel (Iowa State University) | Lind Hall 409 | RAG |
| 3:20p-4:20p | From Drosophila and transposable elements to phylogenetic networks and associahedra | Lior Pachter (University of California) | Vincent Hall 16 | |

## Monday, March 5

| Time | Event | Speaker | Location | Event code |
|---|---|---|---|---|
| 8:15a-9:00a | Registration and coffee | | EE/CS 3-176 | W3.5-9.07 |
| 9:00a-9:15a | Welcome and introduction | Douglas N. Arnold (University of Minnesota Twin Cities) | EE/CS 3-180 | W3.5-9.07 |
| 9:15a-10:05a | Open problems in algebraic statistics | Bernd Sturmfels (University of California) | EE/CS 3-180 | W3.5-9.07 |
| 10:05a-10:40a | Coffee | | EE/CS 3-176 | W3.5-9.07 |
| 10:40a-11:30a | Likelihood ratio tests and singularities | Mathias Drton (University of Chicago) | EE/CS 3-180 | W3.5-9.07 |
| 11:30a-1:30p | Lunch | | | W3.5-9.07 |
| 1:30p-2:20p | Stability and instability in polynomial equations arising from complex chemical reaction networks: the big picture | Martin Feinberg (Ohio State University) | EE/CS 3-180 | W3.5-9.07 |
| 2:20p-2:50p | Coffee | | EE/CS 3-176 | W3.5-9.07 |
| 2:50p-3:40p | Stability and instability in polynomial equations arising from complex chemical reaction networks: some underlying mathematics | Gheorghe Craciun (University of Wisconsin) | EE/CS 3-180 | W3.5-9.07 |
| 3:40p-4:10p | Second Chances | | EE/CS 3-180 | W3.5-9.07 |
| 4:10p-4:30p | Group Photos | | EE/CS 3-180 | W3.5-9.07 |
| 4:30p-6:30p | IMA Tea and Poster Session | | Lind Hall 400 | W3.5-9.07 |

Posters presented at the Poster Session:

| Poster | Presenter |
|---|---|
| A flow that computes the best positive semi-definite approximation of a symmetric matrix | Kenneth R. Driessel (Iowa State University) |
| Multiple solutions to the likelihood equations in the Behrens-Fisher problem | Mathias Drton (University of Chicago) |
| Metric learning for phylogenetic invariants | Nicholas Eriksson (Stanford University) |
| Geometry of rank tests | Jason Morton (University of California) |
| Supervised learning artificial neural network algorithms for optimizing mechanical properties of elastin-like polypeptide hydrogels for cartilage repair | Sarah Olson (North Carolina State University) |
| Toric ideals of phylogenetic invariants for the general group-based model on claw trees | Sonja Petrovic (University of Kentucky) |
| Conditional independence for Gaussian random variables is not finitely axiomatizable | Seth Sullivant (Harvard University) |
| Linkage problems and real algebraic geometry | Thorsten Theobald (Johann Wolfgang Goethe-Universität Frankfurt) |
| Classifying disease models using regular polyhedral subdivisions | Debbie Yuster (Columbia University) |
| Maximum likelihood estimation in latent class | Yi Zhou (Carnegie-Mellon University) |

## Tuesday, March 6

| Time | Event | Speaker | Location | Event code |
|---|---|---|---|---|
| 8:30a-9:00a | Coffee | | EE/CS 3-176 | W3.5-9.07 |
| 9:00a-9:50a | Phylogenetic models: algebra and evolution | Elizabeth S. Allman (University of Alaska) | EE/CS 3-180 | W3.5-9.07 |
| 9:50a-10:30a | Coffee | | EE/CS 3-176 | W3.5-9.07 |
| 10:30a-11:20a | Generalized maximum likelihood estimates for exponential families | František Matúš (Institute of Information Theory and Automation) | EE/CS 3-180 | W3.5-9.07 |
| 11:20a-1:30p | Lunch | | | W3.5-9.07 |
| 1:30p-2:20p | Bifurcations in coupled systems | Martin Golubitsky (University of Houston) | EE/CS 3-180 | W3.5-9.07 |
| 2:20p-3:00p | Coffee | | EE/CS 3-176 | W3.5-9.07 |
| 3:00p-3:50p | Application of algebraic statistics for statistical disclosure limitation | Aleksandra B. Slavković (Pennsylvania State University) | EE/CS 3-180 | W3.5-9.07 |
| 4:00p-4:30p | Second Chances | | EE/CS 3-180 | W3.5-9.07 |

## Wednesday, March 7

| Time | Event | Speaker | Location | Event code |
|---|---|---|---|---|
| 8:30a-9:00a | Coffee | | EE/CS 3-176 | W3.5-9.07 |
| 9:00a-9:50a | Polynomial dynamical systems over finite fields, with applications to modeling and simulation of biological networks | Reinhard Laubenbacher (Virginia Polytechnic Institute and State University) | EE/CS 3-180 | W3.5-9.07 |
| 9:50a-10:30a | Coffee | | | W3.5-9.07 |
| 10:30a-11:20a | Gene interactions and the geometry of fitness landscapes | Niko Beerenwinkel (Harvard University) | EE/CS 3-180 | W3.5-9.07 |
| 11:30a-12:00p | Second Chances | | EE/CS 3-180 | W3.5-9.07 |
| 5:00p-6:30p | Reception | | Lind Hall 400 | W3.5-9.07 |
| 7:00p-8:00p | Math Matters - IMA Public Lecture: Patterns Patterns Everywhere | Martin Golubitsky (University of Houston) | Willey Hall 125 | PUB3.07.07, W3.5-9.07 |

## Thursday, March 8

| Time | Event | Speaker | Location | Event code |
|---|---|---|---|---|
| 8:30a-9:00a | Coffee | | EE/CS 3-176 | W3.5-9.07 |
| 9:00a-9:50a | Gaussian path diagrams | Thomas S. Richardson (University of Washington) | EE/CS 3-180 | W3.5-9.07 |
| 9:50a-10:30a | Coffee | | | W3.5-9.07 |
| 10:30a-11:20a | Using algebraic geometry for phylogenetic reconstruction | Marta Casanellas (Polytechnical University of Cataluña (Barcelona)) | EE/CS 3-180 | W3.5-9.07 |
| 11:20a-1:30p | Lunch | | | W3.5-9.07 |
| 1:30p-2:20p | Modelling with mass-action kinetics and beyond | Markus Kirkilionis (University of Warwick) | EE/CS 3-180 | W3.5-9.07 |
| 2:20p-3:00p | Coffee | | EE/CS 3-176 | W3.5-9.07 |
| 3:00p-3:50p | Information geometry and algebraic statistics | Giovanni Pistone (Politecnico di Torino) | EE/CS 3-180 | W3.5-9.07 |
| 4:00p-4:30p | Second Chances | | | W3.5-9.07 |
| 6:30p-8:00p | Dinner: Caspian Bistro Restaurant | | 2418 University Ave SE, 612-623-1133 | W3.5-9.07 |

## Friday, March 9

| Time | Event | Speaker | Location | Event code |
|---|---|---|---|---|
| 8:30a-9:00a | Coffee | | EE/CS 3-176 | W3.5-9.07 |
| 9:00a-9:50a | On phylogenetic trees – a geometer's view | Jaroslaw Wisniewski (University of Warsaw) | EE/CS 3-180 | W3.5-9.07 |
| 9:50a-10:30a | Coffee | | | W3.5-9.07 |
| 10:30a-11:20a | A combinatorial test for significant codivergence between cool-season grasses and their symbiotic fungal endophytes | Ruriko Yoshida (University of Kentucky) | EE/CS 3-180 | W3.5-9.07 |
| 11:20a-1:30p | Lunch | | | W3.5-9.07 |
| 1:30p-2:20p | Subspace arrangements in theory and practice | Robert M. Fossum (University of Illinois at Urbana-Champaign) | EE/CS 3-180 | W3.5-9.07 |
| 2:20p-3:00p | Coffee | | EE/CS 3-176 | W3.5-9.07 |
| 3:00p-3:50p | Algebraic statistics and the analysis of contingency tables: Old wine in new bottles? | Stephen E. Fienberg (Carnegie-Mellon University) | EE/CS 3-180 | W3.5-9.07 |
| 4:00p-4:30p | Second Chances and closing remarks | | | W3.5-9.07 |

## Monday, March 12

 11:15a-12:15p Learning seminar on algebraic statistics Lind Hall 409 LSAS

## Tuesday, March 13

 11:15a-12:15p IMA postdoc seminar: Combinatorics of binomial primary decomposition Laura Matusevich (Texas A&M) Lind Hall 215 PS

## Wednesday, March 14

 11:15a-12:15p Algebraic geometry and applications seminar: Counting monomials Mordechai Katzman (University of Sheffield) EE/CS 3-180 AGS

## Thursday, March 15

 9:45a-10:45a Learning seminar on algebraic statistics Lind Hall 409 LSAS

## Friday, March 16

 All Day University of Minnesota Floating Holiday. The IMA is closed.

## Monday, March 19

 11:15a-12:15p Learning seminar on algebraic statistics Lind Hall 409 LSAS

## Tuesday, March 20

 11:15a-12:15p IMA postdoc seminar: Janet's algorithm for modules over polynomial rings Daniel Robertz (RWTH Aachen) Lind Hall 215 PS

## Wednesday, March 21

 11:15a-12:15p Algebraic geometry and applications seminar: Symmetries in SDP-based relaxations for constrained polynomial optimization Thorsten Theobald (Johann Wolfgang Goethe-Universität Frankfurt) EE/CS 3-180 AGS

## Thursday, March 22

| Time | Event | Speaker | Location | Event code |
|---|---|---|---|---|
| 9:45a-10:45a | Learning seminar on algebraic statistics | | Lind Hall 409 | LSAS |
| 11:15a-12:15p | Real algebraic geometry tutorial: Real rings (continued) | Kenneth R. Driessel (Iowa State University) | Lind Hall 409 | RAG |

## Friday, March 23

 1:25p-2:25p IMA/MCIM Industrial problems seminar: Optical synthetic aperture imaging Joe Buck (Lockheed Martin Coherent Technologies) Vincent Hall 1 IPS

## Monday, March 26

 11:15a-12:15p Learning seminar on algebraic statistics Lind Hall 409 LSAS

## Tuesday, March 27

 11:15a-12:15p IMA postdoc seminar: The minimum number of set-theoretic defining equations of algebraic varieties Gennady Lyubeznik (University of Minnesota Twin Cities) Lind Hall 215 PS

## Wednesday, March 28

 11:15a-12:15p Algebraic geometry and applications seminar: Algebraic geometry of Gaussian Bayesian networks Seth Sullivant (Harvard University) EE/CS 3-180 AGS

## Thursday, March 29

| Time | Event | Speaker | Location | Event code |
|---|---|---|---|---|
| 9:45a-10:45a | Learning seminar on algebraic statistics | | Lind Hall 409 | LSAS |
| 11:15a-12:15p | Real algebraic geometry tutorial: Real rings (continued) | Kenneth R. Driessel (Iowa State University) | Lind Hall 409 | RAG |

## Friday, March 30

 1:25p-2:25p IMA/MCIM Industrial problems seminar: Downhole analysis of hydrocarbons Mariya Ponomarenko (Schlumberger-Doll Research) Vincent Hall 1 IPS
Abstracts

Laura Matusevich (Texas A&M) IMA postdoc seminar: Combinatorics of binomial primary decomposition

Abstract: Using examples, I will illustrate the main elements needed to explicitly describe the primary components of a binomial ideal, emphasizing the connections to combinatorics and (hypergeometric) differential equations. This is joint work with Alicia Dickenstein and Ezra Miller.

Elizabeth S. Allman (University of Alaska) Phylogenetic models: algebra and evolution

Abstract: Molecular phylogenetics is concerned with inferring evolutionary relationships (phylogenetic trees) from biological sequences (such as aligned DNA sequences for a gene shared by a collection of species). The probabilistic models of sequence evolution that underlie statistical approaches in this field exhibit a rich algebraic structure. After an introduction to the inference problem and phylogenetic models, this talk will survey some of the highlights of current algebraic understanding. Results on the important statistical issue of identifiability of phylogenetic models will be emphasized, as the algebraic viewpoint has been crucial to obtaining such results.

Niko Beerenwinkel (Harvard University) Gene interactions and the geometry of fitness landscapes

Abstract: The relationship between the shape of a fitness landscape and the underlying gene interactions, or epistasis, has been extensively studied in the two-locus case. Epistasis has been linked to biologically important properties such as the advantage of sex. Gene interactions among multiple loci are usually reduced to two-way interactions. Here, we present a geometric theory of shapes of fitness landscapes for multiple loci. We investigate the dynamics of evolving populations on fitness landscapes and the predictive power of the geometric shape for the speed of adaptation. Finally, we discuss applications to fitness data from viruses and bacteria.

Joe Buck (Lockheed Martin Coherent Technologies) IMA/MCIM Industrial problems seminar: Optical synthetic aperture imaging

Abstract: The spatial resolution of a conventional imaging ladar system is constrained by the diffraction limit of the telescope's aperture. At Lockheed Martin Coherent Technologies (LMCT) we are implementing a technique known as synthetic-aperture imaging laser radar (SAIL), which employs aperture synthesis with coherent laser radar to overcome the diffraction limit and achieve fine-resolution, long-range, two-dimensional imaging with modest aperture diameters. I will discuss the results of my experiments while at The Aerospace Corporation, which represent the first optical synthetic aperture images of a fixed, diffusely scattering target with moving aperture, as well as the current research program being developed at LMCT.

Marta Casanellas (Polytechnical University of Cataluña (Barcelona)) Using algebraic geometry for phylogenetic reconstruction

Abstract: Many statistical models of evolution can be viewed as algebraic varieties. The generators of the ideal associated to a model and a phylogenetic tree are called invariants. The invariants of a statistical model of evolution should make it possible to determine the tree formed by a set of living species. We will present a method of phylogenetic inference based on invariants and we will discuss why algebraic geometry should be considered a powerful tool for phylogenetic reconstruction. The performance of the method has been studied for quartet trees and the Kimura 3-parameter model, and it will be compared to widely known phylogenetic reconstruction methods such as maximum likelihood estimation and neighbor-joining.

Gheorghe Craciun (University of Wisconsin) Stability and instability in polynomial equations arising from complex chemical reaction networks: some underlying mathematics

Abstract: Chemical reaction network models give rise to polynomial dynamical systems that are usually high dimensional, nonlinear, and have many unknown parameters. Due to the presence of these unknown parameters (such as reaction rate constants) direct numerical simulation of the chemical dynamics is practically impossible. On the other hand, we will show that important properties of these systems are determined only by the network structure, and do not depend on the unknown parameters. Also, we will show how some of these results can be generalized to systems of polynomial equations that are not necessarily derived from chemical kinetics. In particular, we will point out connections with classical problems in algebraic geometry, such as the real Jacobian conjecture. This talk describes joint work with Martin Feinberg, and can be regarded as a continuation of his earlier talk.

Kenneth R. Driessel (Iowa State University) Real algebraic geometry tutorial: Real rings (continued)

Abstract: Recall that we started to talk about real rings on March 1. The topic was new for us at that time. Also recall that a commutative ring with identity is "real" if the only representation of zero as a sum of squares is the trivial one. Since we did not meet during the last two weeks, I shall include a substantial review as part of our discussion. I shall mainly follow the material in the chapter "Real Rings" in the book "Positive Polynomials" by Prestel and Delzell. Our objective will be a proof of an abstract version of the real Nullstellensatz.

Mathias Drton (University of Chicago) Multiple solutions to the likelihood equations in the Behrens-Fisher problem

Abstract: The Behrens-Fisher problem concerns testing the statistical hypothesis of equality of the means of two normal populations with possibly different variances. This problem furnishes one of the simplest statistical models for which the likelihood equations may have more than one real solution. In fact, with probability one, the equations have either one or three real solutions. Using the cubic discriminant, we study the large-sample probability of one versus three solutions.

Nicholas Eriksson (Stanford University) Metric learning for phylogenetic invariants

Abstract: We introduce new methods for phylogenetic tree construction by using machine learning to optimize the power of phylogenetic invariants. Phylogenetic invariants are polynomials in the joint probabilities which vanish under a model of evolution on a phylogenetic tree. We give algorithms for selecting a good set of invariants and for learning a metric on this set of invariants which optimally distinguishes the different models. Our learning algorithms involve semidefinite programming on data simulated over a wide range of parameters. Simulations on trees with four leaves under the Jukes-Cantor and Kimura 3-parameter models show that our method improves on other uses of invariants and is competitive with neighbor-joining. Our main biological result is that the trained invariants can perform substantially better than neighbor-joining on quartet trees with short interior edges. This is joint work with Yuan Yao (Stanford).

Martin Feinberg (Ohio State University) Stability and instability in polynomial equations arising from complex chemical reaction networks: the big picture

Abstract: In nature there are millions of distinct networks of chemical reactions that might present themselves for study at one time or another. Written at the level of elementary reactions taken with classical mass action kinetics, each new network gives rise to its own (usually large) system of polynomial equations for the species concentrations. In this way, chemistry presents a huge and bewildering array of polynomial systems, each determined in a precise way by the underlying network up to parameter values (e.g., rate constants). Polynomial systems in general, even simple ones, are known to be rich sources of interesting and sometimes wild dynamical behavior. It would appear, then, that chemistry too should be a rich source of dynamical exotica. Yet there is a remarkable amount of stability in chemistry. Indeed, chemists and chemical engineers generally expect homogeneous isothermal reactors, even complex ones, to admit precisely one (globally attractive) equilibrium. Although this tacit doctrine is supported by a long observational record, there are certainly instances of homogeneous isothermal reactors that give rise, for example, to multiple equilibria. The vast landscape of chemical reaction networks, then, appears to have wide regions of intrinsic stability (regardless of parameter values) punctuated by far smaller regions in which instability might be extant (for at least certain parameter values). In this talk, I will present some recent joint work with Gheorghe Craciun that goes a long way toward explaining this landscape and, in particular, toward explaining how biological chemistry "escapes" the stability doctrine to (literally) "make life interesting." A subsequent talk by Craciun will emphasize more mathematical detail.

Stephen E. Fienberg (Carnegie-Mellon University) Algebraic statistics and the analysis of contingency tables: Old wine in new bottles?

Abstract: The past decade has seen considerable interest in the reformulation of statistical models and methods for the analysis of contingency tables using the language and results of algebraic and polyhedral geometry. But as algebraic statistics has developed, new ideas have emerged that have changed how we view a number of statistical problems. This talk reviews some of these recent advances and suggests some challenges for collaborative research, especially those involving large-scale databases.

Robert M. Fossum (University of Illinois at Urbana-Champaign) Subspace arrangements in theory and practice

Abstract: A subspace arrangement is a union of a finite number of subspaces of a vector space. We will discuss the importance of subspace arrangements first as mathematical objects and now as a popular class of models for engineering. We will then introduce some new theoretical results that were motivated by practice. Using these results we will address the computational issue of how to extract subspace arrangements from noisy or corrupted data. Finally we will turn to the importance of subspace arrangements by briefly discussing the connections to sparse representations, manifold learning, etc.

Martin Golubitsky (University of Houston) Math Matters - IMA Public Lecture: Patterns Patterns Everywhere

Abstract: Regular patterns appear all around us: from vast geological formations to the ripples in a vibrating coffee cup, from the gaits of trotting horses to tongues of flames, and even in visual hallucinations. The mathematical notion of symmetry is a key to understanding how and why these patterns form. In this lecture Professor Golubitsky will show some of these fascinating patterns and explain how mathematical symmetry enters the picture.

Mordechai Katzman (University of Sheffield) Algebraic geometry and applications seminar: Counting monomials

Abstract: The contents of this elementary talk grew out of my need to explain to non-mathematicians what I do for a living. I will pose (and solve) two old chessboard enumeration problems and a new problem. We will solve these by counting certain monomials, and this will naturally lead us to the notion of Hilbert functions. With these examples in mind, we will try to understand the simplest of monomial ideals, namely, edge ideals, and discover that these are not simple at all! On the way we will discover a new numerical invariant of forests.

Markus Kirkilionis (University of Warwick) Modelling with mass-action kinetics and beyond

Abstract: Mass-action kinetics is a powerful tool to describe events created by collision of molecules or individuals in a well-mixed environment giving them locally the same probability to meet each other. Moreover, this probability is only dependent on the concentration of the mutual partners. Mass action systems can be found in chemistry and cell biology, but also in game theory and economics. Mathematically this gives rise to dynamical systems of a special type, more specifically of polynomial type. I will give an overview of how this property can be used to determine different types of bifurcations, for example the occurrence of bistability, or oscillations via a Hopf bifurcation. All tools borrow methods from algebraic geometry. I will then give an outlook on what usually goes wrong in the modelling part while using mass-action kinetics if biochemical or cellular molecular events are considered. Finally, the talk ends with a fresh look at mass-action kinetics applied to a spatial setting.

Reinhard Laubenbacher (Virginia Polytechnic Institute and State University) Polynomial dynamical systems over finite fields, with applications to modeling and simulation of biological networks

Abstract: Time-discrete dynamical systems with a finite state space have been used as models of biological systems since the use/invention of cellular automata by von Neumann in his attempt to model a self-replicating organism in the 1950s. More recently, they have appeared as models of a variety of biological systems, from gene regulatory networks to large-scale epidemiological networks. This talk will focus on theoretical and computational results about polynomial dynamical systems using tools from computational algebra and algebraic geometry.

Gennady Lyubeznik (University of Minnesota Twin Cities) IMA postdoc seminar: The minimum number of set-theoretic defining equations of algebraic varieties

Abstract: Given an algebraic variety in affine or projective space, what is the minimum number of equations that define this variety set-theoretically? This is a very difficult problem in general; no algorithm to compute this minimum number is currently known. We will discuss some techniques for upper and lower bounds and consider some interesting specific examples.

František Matúš (Institute of Information Theory and Automation) Generalized maximum likelihood estimates for exponential families

Abstract: Exponential families underpin numerous models of statistics and information geometry that have significant applications. For a standard full exponential family, or its canonically convex subfamily, if the corresponding likelihood function from a sample has a maximizer t* then, by the maximum likelihood principle, the data are judged to be generated by the probability measure P* from the family that is parameterized by t*. Since the likelihood depends on data only through their mean, in this way the mean is mapped to P*. In a joint work with Imre Csiszar, Budapest, we study an extension of this mapping, the generalized maximum likelihood estimator. It assigns to each point of the space at which the likelihood function is bounded above, a probability measure from the closure of the family in variation distance. A detailed description, complete characterization of domain and range, and additional results will be presented, not imposing any regularity assumptions.

Jason Morton (University of California) Geometry of rank tests

Abstract: We investigate the polyhedral geometry of conditional probability and undirected graphical models, developing new statistical procedures called convex rank tests. The polytope associated to an undirected graphical conditional independence model is the graph associahedron. The convex rank test defined by the dual semigraphoid to the n-cycle graphical model is applied to microarray data analysis to detect periodic gene expression.

Sarah Olson (North Carolina State University) Supervised learning artificial neural network algorithms for optimizing mechanical properties of elastin-like polypeptide hydrogels for cartilage repair

Abstract: Joint work with Dana L. Nettles (3), Kimberly Trabbic Carlson (3), Ashutosh Chilkoti (3), Lori A. Setton (3, 4), and Mansoor A. Haider (1, 2). Elastin-like polypeptide (ELP) hydrogels are a class of biomaterials that have potential utility as a biocompatible scaffold for filling defects due to osteoarthritis and for regenerating cartilage. Because of the facility to genetically engineer elastin sequence, there are almost endless possible configurations of ELPs and conformations of the networks formed after crosslinking. ELP biomaterial function will exhibit a complex dependence on these polymer characteristics that impacts properties expected to affect cartilage regeneration, such as mechanical load support. These complex structure-function relationships for crosslinked ELP hydrogels are not well described. A method for predicting the mechanical properties of ELP hydrogels was developed based on structural properties and Supervised Artificial Neural Network (ANN) modeling. The ANN model used concentration, molecular weight, crosslink density, and sample number to predict the dynamic shear modulus and loss angle of the hydrogels. The ANN was implemented in a custom compiled code based on the Scaled Conjugate Gradient minimization algorithm, and a Monte Carlo method was used to expand the dataset. The ANN was trained using varying subsets of the full dataset (22 formulations), with the complementary subset used for validation. Trained networks demonstrated excellent accuracy in prediction of hydrogel dynamic shear modulus at physiological temperature, based on polymer design, and predictions were robust with respect to statistical variations. The results are used to show the validity of an intermediate screening process using ANNs to obtain the optimal mechanical properties for the ELP.

Affiliations: (1) Biomathematics Graduate Program, North Carolina State University; (2) Department of Mathematics, North Carolina State University; (3) Department of Biomedical Engineering, Duke University; (4) Department of Surgery, Duke Medical Center.

Lior Pachter (University of California) From Drosophila and transposable elements to phylogenetic networks and associahedra

Abstract: We begin with an overview of the Drosophila genome project, whose goal is the sequencing and comparison of 12 fruit fly genomes. In particular, we describe some of the dynamic behavior of transposable elements. These are self-replicating sequences that play a major role in shaping the structure and function of genomes. Our methods for studying transposable elements lead naturally to the analysis of split systems and their associated phylogenetic networks. We explain why the tessellation of \$\overline{M}_0^n\$ by associahedra is a natural candidate for the space of phylogenetic networks, and explain the relevance of this observation to the analysis of the popular neighbor-net algorithm used for studying split systems. We discuss various aspects of the neighbor-net algorithm, including its interpretation as a greedy algorithm for the traveling-salesman problem, how to obtain statistically meaningful parameters, and how to interpret its output. The application of neighbor-net to the split system we obtain from transposable elements in Drosophila reveals interesting insights about a set of species that may have undergone lineage sorting. This is joint work with Anat Caspi and Dan Levy.

Sonja Petrovic (University of Kentucky) Toric ideals of phylogenetic invariants for the general group-based model on claw trees

Abstract: We address the problem of studying the toric ideals of phylogenetic invariants for a general group-based model on an arbitrary claw tree. We focus on the group \$\mathbb{Z}_2\$ and choose a natural recursive approach that extends to other groups. The study of the lattice associated with each phylogenetic ideal produces a list of circuits that generate the corresponding lattice basis ideal. In addition, we describe explicitly a quadratic lexicographic Gröbner basis of the toric ideal of invariants for the claw tree on an arbitrary number of leaves. Combined with a result of Sturmfels and Sullivant, this implies that the phylogenetic ideal of every tree for the group \$\mathbb{Z}_2\$ has a quadratic Gröbner basis. Hence, the coordinate ring of the toric variety is a Koszul algebra. This is joint work with Julia Chifman, University of Kentucky.

Giovanni Pistone (Politecnico di Torino) Information geometry and algebraic statistics

Abstract: Recent presentations of Information Geometry (IG), e.g. Amari and Nagaoka (2000), consider general statistical models and general sample spaces. However, the seminal discussion by Cenkov (transl. 1982) is based on finite sample spaces, as it is in Algebraic Statistics (AS). This talk will first review basic IG from the point of view of AS. In the second part, it discusses the issue of computing IG quantities and presents a few examples.
Mariya Ponomarenko (Schlumberger-Doll Research) IMA/MCIM Industrial problems seminar: Downhole analysis of hydrocarbons

Abstract: Quick and accurate estimation of the composition of the hydrocarbon fluid in the formation is essential in assessing an oil reservoir's value and determining optimal production strategies. This task is complicated by contamination from oil- and synthetic-based drilling mud filtrates. In this talk we will describe the visible-near-infrared spectroscopy technique used to estimate the composition of formation fluid and the level of contamination from downhole optical absorption spectroscopy measurements.

Thomas S. Richardson (University of Washington) Gaussian path diagrams

Abstract: In the 1920s the geneticist Sewall Wright introduced a class of Gaussian statistical models represented by graphs containing directed and bi-directed edges, known as path diagrams. These models have been used extensively in psychometrics and econometrics, where they are called structural equation models. I will first describe the subclass of bow-free acyclic path diagrams, which have desirable statistical properties. I will then identify a subclass of models that are characterized by their Markov properties. Lastly I will outline recent work aimed at characterizing non-Markovian constraints that may arise. (This is joint work with Mathias Drton, Michael Eichler and Masashi Miyamura.)

Daniel Robertz (RWTH Aachen) IMA postdoc seminar: Janet's algorithm for modules over polynomial rings

Abstract: This talk gives an introduction to Janet bases. Originally developed for the algebraic analysis of systems of partial differential equations at the beginning of the 20th century, the algorithm by Maurice Janet is today an efficient alternative to Buchberger's algorithm for computing Gröbner bases of modules over polynomial rings.
In this talk we give a modern description of Janet's algorithm and explain nice combinatorial properties of the resulting Janet bases: separation of the variables into multiplicative and non-multiplicative ones for each Janet basis element allows one to read off vector space bases for both the submodule and the residue class module. As a consequence, the Hilbert series and polynomial of a (graded) module as well as a free resolution are easily obtained from the Janet basis. If time permits, some modifications of Janet's algorithm will be addressed which allow one to work with polynomial rings over the integers instead of a field, or which generalize the algorithm to certain classes of non-commutative polynomial rings.

Aleksandra B. Slavković (Pennsylvania State University) Application of algebraic statistics for statistical disclosure limitation

Abstract: Statistical disclosure limitation applies statistical tools to the problem of limiting releases of sensitive information about individuals and groups that are part of statistical databases, while still allowing for proper statistical inference. The limited releases can take the form of arbitrary collections of marginal and conditional distributions, and odds ratios for contingency tables. Given this information, we discuss how tools from algebraic geometry can be used to give both complete and incomplete characterizations of discrete distributions for contingency tables. These problems also lead to linear and non-linear integer optimization formulations. We discuss some practical implications and challenges of using algebraic statistics for data privacy and confidentiality problems.

Bernd Sturmfels (University of California) Open problems in algebraic statistics

Abstract: This talk introduces five or six mathematical problems whose solution would likely be a significant contribution to the emerging interactions between algebraic geometry, statistics, and computational biology.
Seth Sullivant (Harvard University) Algebraic geometry and applications seminar: Algebraic geometry of Gaussian Bayesian networks

Abstract: Conditional independence models for Gaussian random variables are algebraic varieties in the cone of positive definite matrices. We explore the geometry of these varieties in the case of Bayesian networks, with a view towards generalizing the recursive factorization theorem. When some of the random variables are hidden, non-independence constraints are needed to describe the Bayesian networks. These non-independence constraints have potential inferential uses for studying collections of random variables. In the case that the underlying network is a tree, we give a complete description of the defining constraints of the model and show a surprising connection to the Grassmannian.

Thorsten Theobald (Johann Wolfgang Goethe-Universität Frankfurt) Algebraic geometry and applications seminar: Symmetries in SDP-based relaxations for constrained polynomial optimization

Abstract: We consider the issue of exploiting symmetries in the hierarchy of semidefinite programming relaxations recently introduced in polynomial optimization. After providing the necessary background we focus on problems where either the symmetric or the cyclic group is acting on the variables and extend the representation-theoretical methods of Gatermann and Parrilo to constrained polynomial optimization problems. Moreover, we also propose methods to efficiently compute lower and upper bounds for the subclass of problems where the objective function and the constraints are described in terms of power sums. (Joint work with L. Jansson, J.B. Lasserre and C. Riener)

Jaroslaw Wisniewski (University of Warsaw) On phylogenetic trees – a geometer's view

Abstract: I will discuss geometric methods of investigating phylogenetic trees. In a joint project with Weronika Buczynska we investigate projective varieties which are binary symmetric models of trivalent phylogenetic trees.
They have Gorenstein terminal singularities and are Fano varieties. Moreover any two such varieties which are of the same dimension are deformation equivalent, that is, they are in the same connected component of the Hilbert scheme of the projective space whose coordinates are indexed by subsets of their leaves.

Ruriko Yoshida (University of Kentucky) A combinatorial test for significant codivergence between cool-season grasses and their symbiotic fungal endophytes

Abstract: Symbioses of grasses and fungal endophytes constitute an interesting model for evolution of mutualism and parasitism. Grasses of all subfamilies can harbor systemic infections by fungi of the family Clavicipitaceae. Subfamily Poöideae is specifically associated with epichloë endophytes (species of Epichloë and their asexual derivatives, the Neotyphodium species) in intimate symbioses often characterized by highly efficient vertical transmission in seeds, and bioprotective benefits conferred by the symbionts to their hosts. These remarkable symbioses have been identified in most grass tribes spanning the taxonomic range of the subfamily. Here we examine the possibility of codivergence in the phylogenetic histories of Poöideae and epichloë. We introduce a method of analysis to detect significant codivergence even in the absence of strict cospeciation, and to address problems in previously developed methods. Relative ages of corresponding cladogenesis events were determined from ultrametric maximum likelihood H (host) and P (parasite = symbiont) trees by an algorithm called MRCALink (most recent common ancestor link), an improvement over previous methods that greatly weight deep over shallow H and P node pairs. We then compared the inferred correspondence of MRCA ages in the H and P trees to the spaces of trees estimated from 10,000 randomly generated H and P tree pairs.
Analysis of the complete dataset, which included a broad host-range species and some likely host transfers (jumps), did not indicate significant codivergence. However, when likely host jumps were removed the analysis indicated highly significant codivergence. Interestingly, early cladogenesis events in the Poöideae corresponded to early cladogenesis events in epichloë, suggesting concomitant origins of the Poöideae and this unusual symbiotic system. This is joint work with C. L. Schardl, K. D. Craven, A. Lindstrom, and A. Stromberg.

Debbie Yuster (Columbia University) Classifying disease models using regular polyhedral subdivisions

Abstract: Genes play a complicated role in how likely one is to get a certain disease. Biologists would like to model how one's genotype affects one's likelihood of illness. We propose a new classification of two-locus disease models, where each model corresponds to an induced subdivision of a point configuration (basically a picture of connected dots). Our models reflect epistasis, or gene interaction. This work is joint with Ingileif Hallgrimsdottir. For more information, see our preprint at arXiv:q-bio.QM/0612044.

Yi Zhou (Carnegie-Mellon University) Maximum likelihood estimation in latent class models

Abstract: Latent class models have been used to explain the heterogeneity of the observed relationship among a set of categorical variables and have received more and more attention as a powerful methodology for analyzing discrete data. The central goal of our work is to study the existence and computation of maximum likelihood estimates (MLEs) for these models, which are cardinal for assessment of goodness of fit and model selection. Our study is at the interface between the fields of algebraic statistics and machine learning. Traditionally, the expectation maximization (EM) algorithm has been applied to compute the MLEs of a latent class model.
However, the solutions provided by the EM algorithm correspond to local maxima only, so, although we are able to compute them effectively, we still lack methods for assessing uniqueness and existence of the MLEs. Another interesting problem in statistics is the identifiability of the model. When a model is unidentifiable, it is necessary to adjust the number of degrees of freedom in order to correctly apply goodness-of-fit tests. In our work, we show that both the existence and identifiability problems are closely related to the geometric properties of the latent class models. Therefore, studying the algebraic varieties and ideals arising from these models is particularly relevant to our problem. We include a number of examples as a way of opening a discussion on a general method for addressing both MLE existence and identifiability in latent class models.
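For readers unfamiliar with the setting of the last abstract, the following is a minimal sketch of EM for a latent class model with binary observed variables that are conditionally independent given a hidden class. It is not the speaker's code: the function names `cell_prob`, `log_likelihood`, and `em`, the 2x2x2 table `counts`, and the random initialization are all assumptions made up for illustration.

```python
import math
import random
from itertools import product

def cell_prob(x, pi, theta):
    """Joint probabilities P(H=h, X=x), one entry per latent class h."""
    out = []
    for h, weight in enumerate(pi):
        p = weight
        for j, xj in enumerate(x):
            p *= theta[h][j] if xj == 1 else 1.0 - theta[h][j]
        out.append(p)
    return out

def log_likelihood(counts, pi, theta):
    """Multinomial log-likelihood of the observed table (up to a constant)."""
    return sum(n * math.log(sum(cell_prob(x, pi, theta)))
               for x, n in counts.items() if n > 0)

def em(counts, n_classes, n_iter=200, seed=0):
    """Run EM from a random start; returns (pi, theta, log-likelihood history)."""
    rng = random.Random(seed)
    n_vars = len(next(iter(counts)))
    total = sum(counts.values())
    pi = [1.0 / n_classes] * n_classes
    # theta[h][j] = P(X_j = 1 | H = h), started away from the boundary
    theta = [[rng.uniform(0.2, 0.8) for _ in range(n_vars)]
             for _ in range(n_classes)]
    history = [log_likelihood(counts, pi, theta)]
    for _ in range(n_iter):
        # E-step: split each cell count across classes by posterior weight
        class_mass = [0.0] * n_classes
        ones_mass = [[0.0] * n_vars for _ in range(n_classes)]
        for x, n in counts.items():
            joint = cell_prob(x, pi, theta)
            marginal = sum(joint)
            for h in range(n_classes):
                w = n * joint[h] / marginal
                class_mass[h] += w
                for j, xj in enumerate(x):
                    if xj == 1:
                        ones_mass[h][j] += w
        # M-step: re-estimate parameters from the expected counts
        pi = [m / total for m in class_mass]
        theta = [[ones_mass[h][j] / class_mass[h] for j in range(n_vars)]
                 for h in range(n_classes)]
        history.append(log_likelihood(counts, pi, theta))
    return pi, theta, history

# Hypothetical 2x2x2 contingency table (all cells positive), for illustration only
counts = dict(zip(product((0, 1), repeat=3), [30, 8, 9, 6, 7, 5, 10, 25]))

if __name__ == "__main__":
    for seed in range(3):
        pi, theta, history = em(counts, n_classes=2, seed=seed)
        print(seed, round(history[-1], 6), [round(p, 4) for p in pi])
```

Each EM step cannot decrease the log-likelihood, but the limit depends on the starting point. Comparing the final values across several seeds is a common heuristic check, and it cannot by itself certify a global maximum, which is the gap the abstract points to.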
Visitors in Residence