Document Type: Doctoral Thesis
Author: Purnell, Darryl William
URN: etd-10312005-142207
Document Title: Discriminative and Bayesian techniques for hidden Markov model speech recognition systems
Degree: PhD (Electronic Engineering)
Department: Electrical, Electronic and Computer Engineering
Supervisor: Prof E C Botha (Committee Chair)
Keywords:
- automatic speech recognition
- Bayesian adaptation
- hidden Markov model training
Date: 2001-04-01
Availability: unrestricted
Abstract
The collection of large speech databases is not a trivial task (if done properly). It is not always possible to collect, segment and annotate large databases for every task or language. It is also often the case that there are imbalances in the databases, as a result of little data being available for a specific subset of individuals. An example of one such imbalance is the fact that there are often more male speakers than female speakers (or vice-versa). If there are, for example, far fewer female speakers than male speakers, then the recognizers will tend to work poorly for female speakers (as compared to performance for male speakers).
This thesis focuses on using Bayesian and discriminative training algorithms to improve continuous speech recognition systems in scenarios where there is a limited amount of training data available. The research reported in this thesis can be divided into three categories:
• Overspecialization is characterized by good recognition performance for the data used during training, but poor recognition performance for independent testing data. This is a problem when too little data is available for training purposes. Methods of reducing overspecialization in the minimum classification error algorithm are therefore investigated.
• Development of new Bayesian and discriminative adaptation/training techniques that can be used in situations where there is a small amount of data available. One example here is the situation where an imbalance in terms of numbers of male and female speakers exists and these techniques can be used to improve recognition performance for female speakers, while not decreasing recognition performance for the male speakers.
• Bayesian learning, where Bayesian training is used to improve recognition performance in situations where one can only use the limited training data available. These methods are extremely computationally expensive, but are justified by the improved recognition rates for certain tasks. This is, to the author's knowledge, the first time that Bayesian learning using Markov chain Monte Carlo methods has been used in hidden Markov model speech recognition.
The algorithms proposed and reviewed are tested using three different datasets (TIMIT, TIDIGITS and SUNSpeech), with the tasks being connected digit recognition and continuous speech recognition. Results indicate that the proposed algorithms improve recognition performance significantly for situations where little training data is available.
© 2001, University of Pretoria. All rights reserved. The copyright in this work vests in the University of Pretoria. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of the University of Pretoria.
Please cite as follows:
Purnell, DW 2001, Discriminative and Bayesian techniques for hidden Markov model speech recognition systems, PhD thesis, University of Pretoria, Pretoria, viewed yymmdd < http://upetd.up.ac.za/thesis/available/etd-10312005-142207/ >
Files (approximate download times in Hours:Minutes:Seconds)

Filename            Size        28.8 Modem  56K Modem  ISDN (64 Kb)  ISDN (128 Kb)  Higher-speed Access
00front.pdf         459.49 Kb   00:02:07    00:01:05   00:00:57      00:00:28       00:00:02
01chapters1-3.pdf   858.25 Kb   00:03:58    00:02:02   00:01:47      00:00:53       00:00:04
02chapters4-6.pdf   1.44 Mb     00:06:40    00:03:26   00:03:00      00:01:30       00:00:07
03back.pdf          189.67 Kb   00:00:52    00:00:27   00:00:23      00:00:11       00:00:01