Graphical Models and Applicability to the Speech Recognition Problem

Thursday, September 21, 2000 - 9:30am - 10:25am
Keller 3-180
Jeff Bilmes (University of Washington)
Graphical models (GMs) are a flexible statistical abstraction that has been applied to a variety of problems, such as medical diagnosis, decision theory, and time-series prediction. Moreover, GMs appear to generalize many techniques used in statistical analysis and signal processing, including Kalman filters, auto-regressive models, and some coding algorithms (such as turbo coding).
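A central idea behind GMs is that the joint distribution factorizes into local conditionals given by the graph, and inference amounts to marginalizing over unobserved variables. The following is a minimal sketch of that idea (not from the talk; the two-node model X → Y and all probability values are illustrative inventions):

```python
# Hypothetical two-node directed graphical model X -> Y over binary
# variables. The joint factorizes as P(X, Y) = P(X) * P(Y | X);
# inference computes P(X | Y) by marginalizing over X.
# All numbers below are made up for illustration.

p_x = {0: 0.7, 1: 0.3}                       # P(X)
p_y_given_x = {0: {0: 0.9, 1: 0.1},          # P(Y | X=0)
               1: {0: 0.2, 1: 0.8}}          # P(Y | X=1)

def joint(x, y):
    """Joint P(X=x, Y=y) from the GM factorization P(X) * P(Y|X)."""
    return p_x[x] * p_y_given_x[x][y]

def posterior_x_given_y(y):
    """Infer P(X | Y=y): Bayes rule, summing out X for the evidence."""
    evidence = sum(joint(x, y) for x in (0, 1))
    return {x: joint(x, y) / evidence for x in (0, 1)}

post = posterior_x_given_y(1)
```

The same factorize-then-marginalize pattern, run efficiently on larger graphs, is what the general GM inference algorithms mentioned in the talk provide.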

This talk will first provide an overview of GMs, covering the four main GM components (semantics, structure, implementation, and parameters), GM properties, GM inference, and learning GMs. Next, it will be shown that many common methods in automatic speech recognition (ASR) systems are instances of GMs and their associated algorithms, including PCA, LDA, QDA/HDA, factor analysis, Gaussians, MLPs, mixture models, and HMMs. The GM view of HMMs, for example, suggests that HMMs are a more capable model for ASR than is sometimes believed. Next, it will be argued that when GMs are used for a classification task (e.g., ASR), they can be formed so that only the distinguishing attributes of objects are represented. This has been called structural discriminability. Relative to standard generative models, such models promise greater parsimony (i.e., smaller memory and compute demands), less sensitivity to noise, and improved recognition accuracy. GMs therefore provide an excellent formalism within which to strive for new ASR models.
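Viewed as a GM, an HMM is a chain-structured model, and the standard forward recursion is exactly sum-product message passing along that chain. The sketch below illustrates this connection (the transition, emission, and initial parameters are invented for illustration and are not from the talk; no numerical scaling is done, so it is only suitable for short sequences):

```python
import numpy as np

# Hypothetical 2-state, 2-symbol HMM; rows are conditioned-on states.
A = np.array([[0.7, 0.3],      # transitions P(s_t | s_{t-1})
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],      # emissions P(o_t | s_t)
              [0.2, 0.8]])
pi = np.array([0.5, 0.5])      # initial state distribution P(s_1)

def forward_loglik(obs):
    """Log P(obs) via forward messages alpha_t(s) = P(o_1..o_t, s_t=s).

    Each update (alpha @ A) * B[:, o] is the sum-product message
    passed to the next time slice of the chain-structured GM.
    """
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
    return np.log(alpha.sum())
```

Because the messages sum out the hidden state exactly, summing exp(forward_loglik(seq)) over all observation sequences of a fixed length recovers 1, as a valid marginalization must.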