Robust Speech Recognition in Adverse Environments

Primary Supervisor

Paliwal, Kuldip K.

Date
2000
Abstract

The performance of an automatic speech recognition system degrades drastically when there is a mismatch between the training and testing environments. The aim of robust speech recognition is to overcome this mismatch, and numerous methods reported in the literature attempt to provide robustness against it. This thesis investigates several techniques, applied at different stages of the recognition process, that are suitable for robust speech recognition. All experiments are conducted on the ISOLET database. The TIMIT database is also used to confirm some of the experimental results.

A number of speech enhancement techniques have been used in the past to make speech recognition robust to noise. A speech enhancement system attempts to reduce the noise in a noisy speech signal and is used as a pre-processor to a speech recogniser. In this thesis, a singular value decomposition (SVD) based speech enhancement method is used for robust speech recognition. The speech recognition performance of the SVD method is compared to that of the popular spectral subtraction method.

Speech recognition performance is directly affected by the performance of the feature extraction stage. This thesis provides a comprehensive evaluation of a number of acoustic front-ends for robust speech recognition. It also investigates the use of human auditory properties for robust feature extraction. Two acoustic front-ends, based on simultaneous masking and on variable frequency and temporal resolutions, are proposed, and their performance is investigated for speech distorted by additive noise and channel distortion.

This thesis also investigates the degradation in speech recognition performance due to speech coding distortion. For this, seven different speech coders operating at different bit rates are simulated, and the speech recogniser is tested on speech processed by each of these coders. The MAP adaptation technique is then applied to adapt the model parameters to the speech coding environment. The resulting system is found to perform well in the presence of speech coding distortion.

Thesis Type

Thesis (PhD Doctorate)

Degree Program

Doctor of Philosophy (PhD)

School

School of Microelectronic Engineering

Rights Statement

The author owns the copyright in this thesis, unless stated otherwise.

Subject

Robust speech recognition

Singular value decomposition

Speech enhancement

Spectral subtraction

Speech coding distortion
