# Poster Session and Reception

Monday, August 15, 2016 - 5:30pm - 6:30pm

Lind 400

**Persistent Homology for Pan-Genome Analysis**

Brittany Terese Fasy (Montana State University)

Single Nucleotide Polymorphisms (SNPs), Insertions and Deletions (INDELs), and Structural Variations (SVs) are the basis of genetic variation among individuals and populations. Second- and third-generation high-throughput sequencing technologies have fundamentally changed our biological interpretation of genomes and, notably, have transformed the analysis and characterization of the genome-wide variations present in a population or a particular environment. As a result of this revolution in next-generation sequencing technologies, we now have a large volume of genome sequences of species that represent major phylogenetic clades. Having multiple, independent genomic assemblies from a species presents the opportunity to move away from a single reference per species, incorporating information from across the phylogenomic range of the species into a pan-genomic reference that can better organize and query the underlying genetic variation. Tools have started to explore multiple genomes in bioinformatics analyses: several have evolved to take advantage of information from multiple, closely related genomes (species, strains/lines) to perform analyses such as variant detection without the bias introduced by using a single reference. In this work, we address challenges and opportunities that arise from pan-genomics using graphical data structures. We consider the problem of computing the persistence of structures representing genomic variation from a graph/path data set. The particular application we are interested in is mining pan-genomic data sets.

**Grid Presentation for Heegaard Floer Homology of a Double Branched Cover**

Sayonita Ghosh Hajra (Hamline University)

Heegaard Floer homology is an invariant of a closed oriented 3-manifold. Because of their complex nature, these homology groups are difficult to compute. Stipsicz gave a combinatorial description of a version of Heegaard Floer homology for a double branched cover of $S^3$. In this poster, we describe the algorithm and present code that implements it. As an example, we compute this homology group for a double branched cover of $S^3$ branched along the unknot.

**Topological Data Analysis at PNNL**

Emilie Purvine (Battelle Pacific Northwest National Laboratory)

Over the past three years the Pacific Northwest National Laboratory has been growing a portfolio in Topological Data Analysis and Modeling. This poster will lay out our portfolio in the area with the hopes of informing the community of our platform and building collaborations. Our current projects include:

- The use of persistent homology (PH) and HodgeRank to discover anomalies in time-evolving systems. For PH, we form point clouds using statistics from a dynamic graph and look at when the barcodes of these point clouds differ significantly from that of a baseline point cloud. We use Wasserstein distance and other dissimilarities based on interval statistics. HodgeRank is used to discover rankings of sources and sinks in a directed graph. As the graph evolves, these rankings may change, and we consider an anomaly to occur when any node's rank differs significantly from its baseline rank. In particular, we use these techniques to find attacks and instabilities in cyber systems.

- Sheaf theory for use in information integration. We model groups of interacting sensors as a topological space. The data that is returned by the sensors serves as the stalk space to define a sheaf. The cohomology of the sheaf identifies global sections - where all sensors are in agreement - and identifying loops in the base space may inform when some sensors need to be retasked. Included in this work is the measurement of local sections where sensors are partially in agreement, representation of uncertainty by relaxing sectional equality to produce approximate sections, and use of category theory to cast all stalks into vector spaces so that integration is more easily defined.

- A computational effort towards a robust, scalable software suite for computational topology. We have found useful software in the community, but typically only one piece at a time, e.g. persistence is separate from sheaf theory, which is separate from general homology. We hope to precipitate a community effort towards development of a suite of topological software tools which can be applied to small and large data sets alike.

**Median Shapes**

Altansuren Tumurbaatar (Washington State University)

We introduce new ideas for the average of a set of general shapes, which we represent as currents. Using the flat norm to measure the distance between currents, we present a mean and a median shape. In the setting of a finite simplicial complex, we demonstrate that the median shape can be found efficiently by solving a linear program.

**Burn Time of a Medial Axis and its Applications**

Erin Chambers (St. Louis University)

The medial axis plays a fundamental role in many shape matching and analysis tasks, but is widely known to be unstable under even small boundary perturbations. Significance measures to analyze and prune the medial axis of 2D and 3D shapes are well studied, but the majority of them in 3D are locally defined and are unable to maintain global topological properties when used for pruning. We introduce a global significance measure called the Burn Time, which generalizes the extended distance function (EDF) introduced in prior work. Using this function, we are able to generalize the classical notion of the erosion thickness measure over the medial axes of 2D shapes. We demonstrate the utility of these shape significance measures in extracting clean, shape-revealing and topology-preserving skeletons of 3D shapes, and discuss future directions and applications of this work.

This is based on joint work with Tao Ju, David Letscher, Kyle Sykes, and Yajie Yan, which appeared in SIGGRAPH 2016.

**Homology of Generalized Generalized Configuration Spaces**

Radmila Sazdanović (North Carolina State University)

The configuration space of $n$ distinct points in a manifold is a well-studied object with many applications. Eastwood and Huggett relate the homology of so-called graph configuration spaces to the chromatic polynomial of graphs. We describe a generalization of this approach from graphs to simplicial complexes. This construction yields, for each pair of a simplicial complex and a manifold, a simplicial chromatic polynomial that satisfies a version of the deletion-contraction formula.
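For orientation, the classical deletion-contraction formula for an ordinary graph $G$ and an edge $e$ reads $P(G, k) = P(G - e, k) - P(G / e, k)$. The following is a minimal Python sketch of that classical graph case only; it is not the simplicial construction of the poster, and the function name `chromatic_value` and the edge-list input format are assumptions made for illustration.

```python
def chromatic_value(vertices, edges, k):
    """Number of proper k-colorings of a simple graph (no loops),
    computed with P(G, k) = P(G - e, k) - P(G / e, k)."""
    vertices = frozenset(vertices)
    edges = frozenset(frozenset(e) for e in edges)
    if not edges:
        # With no edges, every vertex can be colored independently.
        return k ** len(vertices)
    e = next(iter(edges))
    u, v = tuple(e)
    deleted = edges - {e}
    # Contract e by merging v into u; the set removes parallel edges.
    contracted = set()
    for f in deleted:
        g = frozenset(u if x == v else x for x in f)
        if len(g) == 2:  # guard against loops
            contracted.add(g)
    return (chromatic_value(vertices, deleted, k)
            - chromatic_value(vertices - {v}, contracted, k))
```

For example, `chromatic_value({1, 2, 3}, [(1, 2), (2, 3), (1, 3)], 3)` evaluates the triangle's polynomial $k(k-1)(k-2)$ at $k = 3$ and returns 6. The recursion is exponential in the number of edges, so it is suitable only for small examples.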

This is joint work with A. Cooper and V. de Silva.

**Analysis and Visualization of ALMA Data Cubes**

Bei Wang (The University of Utah)

The availability of large data cubes produced by radio telescopes like the VLA and ALMA is leading to new data analysis challenges, as current visualization tools are ill-prepared for the size and complexity of this data. Our project addresses this problem by using the notion of a contour tree from topological data analysis (TDA). The contour tree provides a mathematically robust technique with fine-grained controls for reducing complexity and removing noise from data. Furthermore, to support scientific discovery, new visualizations are being designed to investigate these data and communicate their structures in a salient way: a process that relies on the direct input of astronomers.
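As a rough illustration of how topological simplification can rank features by significance, the Python sketch below pairs each local maximum of a 1D signal with the level at which its superlevel-set component merges into a component with a higher peak. These merge events are the information recorded by one of the two merge trees from which a contour tree is assembled. This is a toy under stated assumptions (1D input, union-find, the hypothetical name `superlevel_persistence`), not the project's pipeline for 3D data cubes.

```python
def superlevel_persistence(values):
    """Return (peak value, merge level) pairs for a 1D signal,
    sorted from most to least persistent."""
    n = len(values)
    if n == 0:
        return []
    parent = [None] * n   # None marks a sample not yet added
    peak = {}             # component root -> index of its highest sample

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    # Sweep from the highest sample down to the lowest.
    for i in sorted(range(n), key=lambda j: -values[j]):
        parent[i] = i
        peak[i] = i
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # The component with the lower peak ends at this level.
                if values[peak[ri]] > values[peak[rj]]:
                    hi, lo = ri, rj
                else:
                    hi, lo = rj, ri
                if peak[lo] != i:  # skip the sample that was just added
                    pairs.append((values[peak[lo]], values[i]))
                parent[lo] = hi
    # The component of the global maximum never merges.
    pairs.append((values[peak[find(0)]], float("-inf")))
    return sorted(pairs, key=lambda p: p[0] - p[1], reverse=True)
```

Discarding pairs whose difference (peak value minus merge level) falls below a chosen threshold is one common way to suppress noise while keeping prominent features.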

Joint work with Paul Rosen (USF), Anil Seth (Utah Astronomy), Jeff Kern (NRAO), Betsy Mills (NRAO), and Chris Johnson (Utah).

**Rips Filtrations for Quasi-metric Spaces (with Stability Results)**

Katharine Turner (École Polytechnique Fédérale de Lausanne (EPFL))

Rips filtrations over a finite metric space and their corresponding persistent homology are prominent methods in Topological Data Analysis to summarize the shape of data. Crucial to their use is the stability result that says that if $X$ and $Y$ are finite metric spaces, then the (bottleneck) distance between the persistence diagrams, barcodes, or persistence modules constructed by the Rips filtration is bounded by $2d_{GH}(X,Y)$ (where $d_{GH}$ is the Gromov-Hausdorff distance). Using the asymmetry of the distance function, we give four different constructions analogous to the Rips filtration that capture different information about quasi-metric spaces. The first method is a one-parameter family of objects where, for a quasi-metric space $X$ and $a\in [0,1]$, we have a filtration of simplicial complexes $\{\mathcal{R}^a(X)_t\}_{t\in [0,\infty)}$ where $\mathcal{R}^a(X)_t$ is the clique complex containing the edge $[x,y]$ whenever $a\min \{d(x,y), d(y,x) \}+ (1-a)\max \{d(x,y), d(y,x)\}\leq t$. The second method is to construct a filtration $\{\mathcal{R}^{dir}(X)_t\}$ of ordered tuple complexes where the tuple $(x_0, x_1, \ldots, x_p)\in \mathcal{R}^{dir}(X)_t$ if $d(x_i, x_j)\leq t$ for all $i\leq j$. Both of our first two methods agree with the usual Rips filtration when applied to a metric space. The third and fourth methods use the associated filtration of directed graphs $\{D(X)_t\}$ where $x\to y$ is included in $D(X)_t$ when $d(x,y)\leq t$. Our third method builds persistence modules using the connected components of the graphs $D(X)_t$. Our fourth method uses the directed graphs $D(X)_t$ to create a filtration of posets (where $x\leq y$ if there is a path from $x$ to $y$) and corresponding persistence modules using poset topology.

**Safer Roads Tomorrow Through Analyzing Today’s Accidents***

Maia Grudzien (Montana State University)

In the week of July 27, 2016, the Bozeman Daily Chronicle quoted the city’s head engineer as stating, “Even with property owners paying more to help Bozeman’s street grid keep up with growth, the rate of development is out-pacing the city’s ability to upgrade increasingly clogged intersections.” As infrastructure is strained by the growing population in Bozeman, the state of Montana, and nationwide, it falls into disrepair more quickly. The need for more efficient roadways is shortening design timelines, but the safety of the roadways must remain a top priority. This project seeks to understand accident-prone areas in Montana cities and towns by collecting accident data and mapping it throughout the region. Areas are then sorted by factors that could include density, clusters, city regions (e.g., sporting event complexes, shopping centers), etc. The goal of this project is to provide engineers and city planners with examples of safe and accident-prone roads and intersections that can be used to better build much-needed infrastructure.

*This research is funded by NSF CCF grant 1618605 and the Montana State University USP program.

**Persistent Homology on Grassmann Manifolds for Analysis of Hyperspectral Movies**

Lori Ziegelmeier (Macalester College)

We present an application of persistent homology to the detection of chemical plumes in hyperspectral movies of Long-Wavelength Infrared data which capture the release of a quantity of chemical into the air. Regions of interest within the hyperspectral data cubes are used to produce points on the real Grassmann manifold $G(k, n)$ (whose points parameterize the $k$-dimensional subspaces of $\mathbb{R}^n$), contrasting our approach with the more standard framework in Euclidean space. An advantage of this approach is that it allows a sequence of time slices in a hyperspectral movie to be collapsed to a sequence of points in such a way that some of the key structure within and between the slices is encoded by the points on the Grassmann manifold. This motivates the search for topological structure, associated with the evolution of the frames of a hyperspectral movie, within the corresponding points on the Grassmann manifold. The proposed framework affords the processing of large data sets while retaining valuable discriminative information. In this paper, we discuss how embedding our data in the Grassmann manifold, together with topological data analysis, captures dynamical events that occur as the chemical plume is released and evolves.
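To make the idea of treating data blocks as points on $G(k, n)$ concrete, the sketch below represents a subspace by an orthonormal basis and measures the distance between two subspaces through their principal angles. It is a minimal illustration using NumPy under assumptions not stated in the abstract: the function names are hypothetical, a point is taken to be the span of the top $k$ left singular vectors of an $n \times m$ data block, and the geodesic (arc-length) distance is only one of several metrics on the Grassmannian.

```python
import numpy as np

def grassmann_point(block, k):
    """Orthonormal n x k basis for the span of the top-k left
    singular vectors of an n x m data block (an assumed mapping)."""
    u, _, _ = np.linalg.svd(block, full_matrices=False)
    return u[:, :k]

def principal_angles(q1, q2):
    """Principal angles between the column spaces of two
    orthonormal n x k matrices."""
    s = np.linalg.svd(q1.T @ q2, compute_uv=False)
    # Clip to guard against rounding slightly outside [-1, 1].
    return np.arccos(np.clip(s, -1.0, 1.0))

def geodesic_distance(q1, q2):
    """Arc-length distance: the 2-norm of the principal angles."""
    return float(np.linalg.norm(principal_angles(q1, q2)))
```

Computing `geodesic_distance` for every pair of frames gives a distance matrix, which is the kind of input a Rips-based persistent homology computation typically accepts.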