Robust and Efficient Spectral Clustering of Large Graphs

Monday, October 24, 2011 - 1:30pm - 2:30pm
Keller 3-180
Aarti Singh (Carnegie-Mellon University)
Despite the empirical success of spectral clustering, its performance under noise and incomplete data is not well understood. This talk will provide a precise characterization of the misclustering rate of spectral clustering for large graphs. Using minimax theory, we establish information theoretic lower bounds on the amount of noise any clustering algorithm can tolerate and demonstrate that spectral clustering is near-optimal. The gap relative to the optimal performance is the statistical price that needs to be paid for computational efficiency. Using this analysis, we propose a novel spectral method which optimizes the number of edge similarities that need to be measured to guarantee consistent recovery of the clusters.

In high dimensional settings, the clusters can be very small relative to the size of the graph and are often characterized by a few relevant features. To enable simultaneous extraction of such a cluster and the relevant features, we consider a sparsity penalized spectral biclustering method and compare its performance to minimax limits.
MSC Code: