Supporting Statistical Hypothesis Testing over Graphs

Thursday, March 1, 2012 - 2:00pm - 2:45pm
Keller 3-180
Jennifer Neville (Purdue University)
Recently there has been a growing interest in representing data from complex systems as graphs and analyzing the network structure to understand key patterns/dependencies in the underlying system. This has fueled a large body of research on both models of network structure and algorithms to automatically discover patterns (e.g., communities) in the structure. However, robust statistical models, which can accurately represent distributions over graph populations, are critical to to accurately assess the significance of discovered patterns or to distinguish between alternative models. Specifically, since sampling distributions (either analytical or empirical) can be used to determine the likelihood of a given sample, statistical models facilitate hypothesis testing and anomaly detection (e.g., graphs with low likelihood can be flagged as anomalous). However, unlike metric spaces the space of graphs exhibits a combinatorial structure which poses significant theoretical and prac
tical challenges which need to be overcome for accurate estimation and efficient inference. In this talk, I will discuss recent work in which we investigated the distributional properties of state-of-the art generative models and sampling algorithms for graphs, and discuss the implications of the findings for network hypothesis testing and anomaly detection.
MSC Code: