Clustering linear and nonlinear manifolds

Tuesday, October 28, 2008 - 12:15pm - 1:05pm
EE/CS 3-180
René Vidal (Johns Hopkins University)
Over the past few years, various techniques have been developed for learning a low-dimensional representation of data lying in a nonlinear manifold embedded in a high-dimensional space. Unfortunately, most of these techniques are limited to the analysis of a single submanifold of a Euclidean space and suffer from degeneracies when applied to linear manifolds (subspaces). The simultaneous segmentation and estimation of a collection of submanifolds from sample data points is a challenging problem that is often thought of as chicken-and-egg. Therefore, this problem is usually solved in two stages (1) data clustering and (2) model fitting, or else iteratively using, e.g. the Expectation Maximization (EM) algorithm.

The first part of this talk will show that for a wide class of segmentation problems (mixtures of subspaces, mixtures of fundamental matrices/trifocal tensors, mixtures of linear dynamical models), the chicken-and-egg dilemma can be tackled using an algebraic geometric technique called Generalized Principal Component Analysis (GPCA). In fact, it is possible to eliminate the data segmentation step algebraically and then use all the data to recover all the models without first segmenting the data. The solution can be obtained using linear algebraic techniques, and is a natural extension of classical PCA from one to multiple subspaces.

The second part of this talk will present a novel algorithm for clustering data sampled from multiple submanifolds of a Riemannian manifold, e.g. the space of probability density functions. The algorithm, called Locally Linear Manifold Clustering (LLMC) is based on clustering a low-dimensional representation of the data learned using generalizations of local nonlinear dimensionality reduction algorithms from Euclidean to Riemannian spaces.

The talk will also present a few motivating applications of manifold clustering to computer vision problems such as texture clustering, segmentation of rigid body motions, segmentation of dynamic textures, segmentation of diffusion MRI.

MSC Code: