Recognize Speech v/s Wreck a Nice Beach: The Mathematics of Automatic Speech Recognition

Monday, September 18, 2000 - 9:30am - 10:25am
Keller 3-180
Sanjeev Khudanpur (Johns Hopkins University)
From Star Trek to Star Wars and through much of science fiction, seamlessness is a recurrent theme in human-computer interfaces -- communicating with machines the way we communicate with other human beings. Thanks to advances in the last two decades, this vision is closer to reality than one may suspect. Yet we are still some way from the day when an automated agent participates at a conference table by taking notes and digging out facts from a database in response to spoken cues. This talk focuses on the speech recognition aspect of human-computer interaction.

This introductory presentation will begin with an overview of the evolution and the state of the art in automatic speech recognition. It will then illustrate the application of statistical modeling, optimization techniques and abstract algebra in transforming what was perceived as a pipe dream in the early seventies into a dictation system available today on a personal computer for $99 plus taxes. Classification and regression trees, hidden Markov models, multivariate Gaussian distributions, nonparametric estimation and finite-state automata theory are but a few of the keystones in this ongoing march to success.

While it is only a matter of time before products employing speech recognition are as ubiquitous as the telephone, several challenging problems remain in this field. The rest of the workshop will dwell in depth upon many of these problems; this presentation will serve to familiarize the mathematicians in the audience with the engineering aspects of automatic speech recognition.