Impact of Sensing Structure in Classification of High-Dimensional Medical Informatics Data

Tuesday, November 15, 2011 - 3:15pm - 4:15pm
Keller 3-180
W. Clem Karl (Boston University)
There has been an explosion of non-invasive biomedical sensing
modalities that have revolutionized our ability to probe the
biomedical world. Often decisions have to be made on the basis of
these increasingly high-dimensional observations. An example would be
the determination of cancer or stroke from indirect tomographic
projection measurements. The problem is frequently exacerbated by the
lack of labeled training samples from which to learn class models. In
many cases, however, there exists a latent low-dimensional sensing
structure that can potentially be exploited for inference.
This work investigates the impact of latent sensing structure on
supervised classification performance when the data dimension scales
to infinity faster than the number of samples. In contrast to some
existing studies, here the classification difficulty is held fixed and
finite as the data dimension scales. For a binary supervised
classification problem with Gaussian likelihood functions, it is shown
that the asymptotic error probability converges to that of pure
guessing if the sensing structure is totally ignored, whereas it
converges to the Bayes risk if the sensing structure is sufficiently
regular and the classification method is sensing aware. It is also
shown, however, that without suitable regularity in the latent
low-dimensional sensing structure, it is impossible to attain
a nontrivial asymptotic error probability. These findings are validated
through various simulations. Additional numerical results are also
provided for support vector machines and for sensitivity to mismatch
between the true and assumed structure.
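
The contrast described in the abstract can be illustrated with a small simulation. The following is only a sketch under assumptions that are not taken from the talk: identity noise covariance, class means of +mu/2 and -mu/2 with mu = A @ theta, a known sensing matrix A with orthonormal columns, and a plug-in nearest-mean rule. The names simulate, A, theta, and delta are hypothetical. The separation delta = ||mu|| is held fixed as the dimension d grows, so the Bayes error stays at Phi(-delta/2).

```python
import math
import numpy as np


def simulate(d, n_train, k=2, delta=3.0, n_test=1000, seed=0):
    """Compare a sensing-unaware and a sensing-aware plug-in classifier.

    Assumed model (illustrative only): x = (y - 1/2) * mu + noise,
    with noise ~ N(0, I_d) and mu = A @ theta lying in a k-dim subspace.
    """
    rng = np.random.default_rng(seed)

    # Hypothetical sensing matrix: d x k with orthonormal columns.
    A, _ = np.linalg.qr(rng.standard_normal((d, k)))
    theta = rng.standard_normal(k)
    theta *= delta / np.linalg.norm(theta)  # fix the difficulty: ||mu|| = delta
    mu = A @ theta

    def draw(n):
        y = np.arange(n) % 2  # balanced labels 0/1
        x = rng.standard_normal((n, d)) + np.outer(y - 0.5, mu)
        return x, y

    x_tr, y_tr = draw(n_train)
    x_te, y_te = draw(n_test)

    def plug_in_error(z_tr, z_te):
        # Estimate the two class means and classify by the nearer mean.
        m1 = z_tr[y_tr == 1].mean(axis=0)
        m0 = z_tr[y_tr == 0].mean(axis=0)
        w, b = m1 - m0, 0.5 * (m1 + m0)
        pred = ((z_te - b) @ w > 0).astype(int)
        return float(np.mean(pred != y_te))

    return {
        # Ignores the structure: means are estimated in all d dimensions.
        "sensing_unaware": plug_in_error(x_tr, x_te),
        # Uses the structure: data are first projected onto span(A).
        "sensing_aware": plug_in_error(x_tr @ A, x_te @ A),
        # Bayes error for this model: Phi(-delta / 2).
        "bayes": 0.5 * math.erfc(delta / (2.0 * math.sqrt(2.0))),
    }


if __name__ == "__main__":
    for d in (20, 200, 2000):
        print(d, simulate(d=d, n_train=20))
```

With the training set size held fixed while d grows, the estimation noise in the unaware rule grows with d and pushes its error toward 1/2. The projected rule estimates only k parameters, so its error should stay close to the Bayes value. This mirrors the qualitative claim only; it does not reproduce the regularity conditions or the asymptotic analysis of the talk.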