Talk Abstract:
Graphical Models and their Applicability to the Speech Recognition Problem
Jeff A. Bilmes
Department of Electrical Engineering
University of Washington
bilmes@ee.washington.edu
Graphical models (GMs) are a flexible statistical abstraction and have
been used for a variety of problems such as medical diagnosis,
decision theory, and time series prediction. GMs moreover appear
to be a generalization of many techniques used in statistical
analysis and signal processing such as Kalman filters, auto-regressive
models, and some coding algorithms (such as turbo coding).
This talk will first provide an overview of GMs, covering the
four main GM components (the semantics, structure, implementation,
and parameters), GM properties, GM inference, and learning GMs.
Next, it will be shown that many common methods for automatic
speech recognition (ASR) systems are instances of GMs and their
associated algorithms. These include PCA, LDA, QDA/HDA, factor
analysis, Gaussians, MLPs, mixture models, and HMMs. The GM
view of HMMs, for example, suggests that HMMs are a more capable
model for ASR than is sometimes believed. Next, it will be argued
that when GMs are used for the classification task (e.g., ASR),
they can be formed so that only the distinct attributes of objects
are represented. This has been called structural discriminability.
Relative to normal generative models, such models promise
to improve parsimony (i.e., have smaller memory and compute
demands), be less sensitive to noise, and improve recognition
accuracy. GMs therefore provide an excellent formalism within
which to strive for new ASR models.
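
As an illustration of the claim that HMMs and their algorithms are instances of GMs, the following is a minimal sketch, not material from the talk. It assumes a hypothetical two-state HMM over three observation symbols; the probability tables and the function names (`joint`, `likelihood_brute_force`, `likelihood_forward`) are invented for the example. The joint probability is written as the product of local conditional distributions that the HMM's graph structure implies, and the standard forward recursion is then seen to compute the same observation likelihood as summing that product over every hidden state sequence.

```python
from itertools import product

# Hypothetical 2-state HMM over 3 observation symbols.
# All numbers below are illustrative assumptions, not values from the talk.
initial = [0.6, 0.4]                            # p(q_1)
transition = [[0.7, 0.3], [0.2, 0.8]]           # p(q_t | q_{t-1})
emission = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]   # p(x_t | q_t)


def joint(states, observations):
    """Joint p(q_1..q_T, x_1..x_T) as a product of the HMM's local factors."""
    prob = initial[states[0]] * emission[states[0]][observations[0]]
    for t in range(1, len(states)):
        prob *= transition[states[t - 1]][states[t]]
        prob *= emission[states[t]][observations[t]]
    return prob


def likelihood_brute_force(observations):
    """p(x_1..x_T) by summing the joint over every hidden state sequence."""
    n_states = len(initial)
    return sum(
        joint(states, observations)
        for states in product(range(n_states), repeat=len(observations))
    )


def likelihood_forward(observations):
    """p(x_1..x_T) by the forward recursion, i.e. eliminating q_1, q_2, ... in order."""
    n_states = len(initial)
    alpha = [initial[i] * emission[i][observations[0]] for i in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            emission[j][obs] * sum(alpha[i] * transition[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
    return sum(alpha)


observations = [0, 2, 1]
# Both routes give the same likelihood up to floating-point rounding.
assert abs(likelihood_forward(observations) - likelihood_brute_force(observations)) < 1e-12
```

The brute-force sum grows exponentially with the sequence length, while the forward recursion is linear in it. That saving comes entirely from exploiting the chain structure of the graph.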
Material from Talk pdf
Mathematical Foundations of Speech Processing and Recognition
2000-2001
Program: Mathematics in Multimedia