Lossy Compression, Classification, and Regression

Monday, November 11, 1996 - 11:00am - 12:00pm
Keller 3-180
Robert Gray (Stanford University)
The theory and design of lossy compression systems share many ideas and techniques with statistical classification and regression, and hence also with image segmentation. These similarities motivate incorporating a Bayes risk term into a Shannon source coding formulation in order to model a system combining quantization with either classification (detection) or regression (estimation). This provides some new (and old) algorithms for compression, classification, and regression individually but, more interestingly, it provides an approach to the joint optimization of systems involving both compression and classification. Examples include the compression of digital mammograms with built-in highlighting of microcalcifications, and the compression of image data while simultaneously segmenting the image into different local types for separate rendering or printing. The design of such codes involves ideas from clustering and tree-structured statistical methods, and it leads to issues involving the combination of quantization, probability density or mass estimation, and classification and regression. The resulting codes have as extreme points universal source codes and classified vector quantizers. This talk will survey the basic ideas, illustrate them with examples, and describe some of the algorithms under current study, along with several conjectures about their asymptotic performance.
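
As a rough illustration of adding a Bayes risk term to a quantizer design, the sketch below runs a Lloyd-style (k-means-like) iteration on labeled training data, where each training vector is assigned to the codeword minimizing squared error plus lam times a 0-1 misclassification cost. It is not taken from the talk: the function name design_bayes_vq, the weight lam, the 0-1 loss, the use of the true training label in the assignment step, and the synthetic two-class data are all assumptions made for illustration.

```python
import numpy as np

def design_bayes_vq(x, y, num_codewords, num_classes, lam, iters=20, seed=0):
    """Hypothetical sketch: Lloyd-style descent on
    mean(squared error + lam * [codeword label != true label])."""
    rng = np.random.default_rng(seed)
    init = rng.choice(len(x), size=num_codewords, replace=False)
    codebook = x[init].astype(float)   # initial codewords drawn from the data
    labels = y[init].copy()            # initial class label per codeword
    history = []                       # objective value at each iteration
    for _ in range(iters):
        # Squared error between every sample and every codeword: shape (n, k).
        dist = ((x[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        # 0-1 classification cost of each codeword's label for each sample.
        risk = (labels[None, :] != y[:, None]).astype(float)
        cost = dist + lam * risk
        # Assignment step: pick the codeword with the smallest combined cost.
        assign = cost.argmin(axis=1)
        history.append(cost[np.arange(len(x)), assign].mean())
        # Update step: centroid minimizes squared error within a cell,
        # majority vote minimizes 0-1 risk within a cell.
        for j in range(num_codewords):
            members = assign == j
            if members.any():          # empty cells keep their old values
                codebook[j] = x[members].mean(axis=0)
                labels[j] = np.bincount(y[members], minlength=num_classes).argmax()
    return codebook, labels, history

# Synthetic two-class data (an assumption, not data from the talk).
rng = np.random.default_rng(1)
x = np.vstack([rng.normal(0.0, 1.0, (200, 2)), rng.normal(2.0, 1.0, (200, 2))])
y = np.repeat([0, 1], 200)
codebook, labels, history = design_bayes_vq(
    x, y, num_codewords=8, num_classes=2, lam=4.0)
```

In this sketch, setting lam to 0 reduces the iteration to ordinary squared-error vector quantizer design, while larger values of lam let classification cost increasingly drive the partition. A practical encoder would not see the true label and would need an estimate of the class posterior instead, which is one place where the density or mass estimation mentioned above would enter.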