Talk Abstract:
Segmental HMMs: Modelling Dynamics and Underlying Structure
for Automatic Speech Recognition
Wendy Holmes
20/20 Speech Limited, UK
w.holmes@2020speech.com
HMMs provide a tractable mathematical framework for training
and recognition, combined with a model structure that is broadly
appropriate for speech. However, it is generally acknowledged
that HMMs provide a somewhat crude model of the speech production
process. In particular, the assumptions of piecewise stationarity
and of independence between observations are not appropriate for the
continuously evolving dynamic nature of speech production. This talk will begin by
discussing some of the advantages and limitations of HMMs, and
explaining how these limitations can be addressed by using a
segmental HMM, in which states are associated with sequences
of observations rather than with individual observations. Different
trajectory-based models for describing signal dynamics will
be described, and some experimental investigations with a particular
class of segmental HMMs will be presented.
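
As an illustration only, and not the specific model class presented in the talk, the difference between a conventional HMM state and a trajectory-based segment model can be sketched for a one-dimensional feature. The sketch assumes a fixed linear mean trajectory with a constant variance; the function names, parameter names and data values are all hypothetical.

```python
import math


def gaussian_log_pdf(x, mean, var):
    """Log density of a univariate Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def segment_log_likelihood(obs, midpoint, slope, var):
    """Log-likelihood of a whole segment under a linear mean trajectory.

    The trajectory is centred on the segment: frame t has mean
    midpoint + slope * (t - (T - 1) / 2). Setting slope = 0 gives the
    constant mean of a conventional (piecewise-stationary) HMM state.
    """
    T = len(obs)
    centre = (T - 1) / 2.0
    return sum(
        gaussian_log_pdf(y, midpoint + slope * (t - centre), var)
        for t, y in enumerate(obs)
    )


# Hypothetical feature values rising steadily across one segment.
rising = [1.0, 2.0, 3.0, 4.0, 5.0]

# Same midpoint and variance; only the slope differs (assumed values).
static_score = segment_log_likelihood(rising, midpoint=3.0, slope=0.0, var=1.0)
linear_score = segment_log_likelihood(rising, midpoint=3.0, slope=1.0, var=1.0)
```

Under these assumed values the linear trajectory scores higher than the static mean, because the score is assigned to the observation sequence as a whole and the trajectory follows the movement within the segment.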
The second part of the talk will consider the requirements for
a good model for automatic speech recognition in more general
terms. It will be argued that such a model needs to capture
the dynamics of the underlying speech production process, and
also provide a meaningful characterization of differences between
speakers, effects of noise and so on. These requirements suggest
that the model should be expressed in terms of parameters that
are closely related to speech production, such as articulatory
or formant parameters. It should then be possible to develop
a model of speech that can be used for a disciplined approach
to speaker adaptation, as well as being directly applicable
to both synthesis and recognition. To illustrate the principles
of this integrated approach, formant-based trajectory segmental
HMMs have been applied to recognition-synthesis speech coding.
Material from talks
Mathematical Foundations of Speech Processing and Recognition
2000-2001 Program: Mathematics in Multimedia