Detecting mixed dimensionality and density in noisy point clouds

Monday, October 27, 2008 - 10:55am - 11:45am
EE/CS 3-180
Gloria Haro Ortega (Universitat Politecnica de Catalunya)
We present a statistical model to learn the mixed dimensionalities and densities present in stratifications, that is, mixtures of manifolds representing different characteristics and complexities in the data set. The basic idea is to model the high-dimensional sample points as a process of translated Poisson mixtures with regularizing restrictions, leading to a model that accounts for the presence of noise. The translated Poisson distribution is useful for modeling a noisy counting process, and it is derived from the noise-induced translation of a regular Poisson distribution. By maximizing the log-likelihood of the process that counts the points falling into a local ball, we estimate the local dimension and density. We show that the sequence of all possible local counts in a point cloud formed by samples of a stratification can be modeled by a mixture of different translated Poisson distributions, thus allowing mixed dimensionality and densities in the same data set. The result is a partition of the points into classes according to both dimensionality and density, together with an estimate of these quantities for each class.
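
To make the counting idea concrete, here is a minimal Python sketch of local dimension and density estimation from nearest-neighbor distances. It is not the translated Poisson mixture model of the talk: it assumes a plain, noise-free Poisson counting process with a single dimension and density around each point, for which the maximum-likelihood dimension estimate has a known closed form (the Levina-Bickel estimator). The function name `local_dimension_density`, the neighborhood size `k`, and the density normalization by `k - 1` are illustrative assumptions, not details from the abstract.

```python
import numpy as np
from math import gamma, pi


def local_dimension_density(points, k=20):
    """Per-point dimension and density estimates under a plain Poisson model.

    Assumption: no noise translation and no mixture, unlike the model in the talk.
    points: array of shape (n_samples, ambient_dim); k: number of neighbors used.
    """
    points = np.asarray(points, dtype=float)
    # All pairwise Euclidean distances (O(n^2) memory, fine for a small sketch).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Sort each row; column 0 is the point itself (distance 0), so skip it.
    knn = np.sort(dist, axis=1)[:, 1:k + 1]
    t_k = knn[:, -1]  # distance to the k-th nearest neighbor
    # Closed-form MLE of the dimension: inverse mean of log(T_k / T_j), j < k.
    log_ratio = np.log(t_k[:, None] / knn[:, :-1])
    m_hat = (k - 1) / log_ratio.sum(axis=1)
    # Volume of the unit ball in (possibly non-integer) dimension m_hat.
    unit_ball = np.array([pi ** (m / 2) / gamma(m / 2 + 1) for m in m_hat])
    # Density: points counted inside the ball divided by the ball's volume.
    theta_hat = (k - 1) / (unit_ball * t_k ** m_hat)
    return m_hat, theta_hat
```

Calling `local_dimension_density(cloud, k=20)` on an array of samples returns one dimension estimate and one density estimate per point. In the setting of the abstract, these per-point estimates would then be replaced by a mixture fit that groups points into classes sharing a dimension and density, and the Poisson counts would be translated to account for noise; neither step is attempted in this sketch.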