Correlation screening from random matrices: phase transitions and Poisson limits
Monday, September 26, 2011 - 2:00pm - 3:00pm
Random matrices are measured in many areas of engineering, social science, and natural science. When the rows of the matrix are random samples of a vector of dependent variables the sample correlations between the columns of the matrix specify a correlation graph that can be used to explore the dependency structure. However, as the number of variables increases screening for highly correlated variables becomes futile due to a high dimensional phase transition phenomenon: with high probability most variables will have large sample correlations even when there is no correlation present. We will present a general theory for predicting the phase transition and provide Poisson limit theorems that can be used to predict finite sample behavior of the correlation graph. We apply our framework to the problem of hub discovery in correlation and partial correlation (concentration) graphs. We illustrate our correlation screening framework in computational biology and, in particular, for discovery of influential variables in gene regulation networks. This is joint work with Bala Rajaratnam at Stanford University.