Maximum Likelihood Sub-band Weighting for Robust Speech Recognition

Author(s)
Zhu, D
Nakamura, S
Paliwal, KK
Wang, R
Editor(s)

Andrzej Drygajlo (Technical Program Chair)

Date
2003
Location

Geneva, Switzerland

Abstract

Sub-band approaches have been proposed for robust speech recognition: the full-band power spectrum is divided into several sub-bands, and the likelihoods or cepstral vectors of the sub-bands are then merged according to their reliability. Conventional sub-band approaches do not model correlations across sub-bands, and their merging weights can only be set empirically or estimated during training, so they may not match the observed data. These methods also degrade performance on clean speech. We propose a novel sub-band approach in which the frequency sub-bands are multiplied by weighting factors and then merged. This approach takes sub-band dependence into account and proves more robust than both full-band and conventional sub-band approaches. Furthermore, the weighting factors can be obtained by maximum-likelihood estimation, which minimizes the mismatch between the trained models and the observed features. Finally, we evaluate our method on both the Aurora2 task and the Resource Management task and show consistent performance improvements on both.
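
The idea of choosing sub-band weights by maximum likelihood can be illustrated with a toy sketch. This is not the authors' algorithm, whose details are not given in this record. It assumes a single diagonal Gaussian per sub-band in place of the trained acoustic models, one scalar weight per sub-band, and it ignores the Jacobian of the scaling. The function names `gaussian_loglik` and `ml_subband_weights` are hypothetical.

```python
import math


def gaussian_loglik(frames, weights, mean, var):
    """Total diagonal-Gaussian log-likelihood of weighted sub-band features.

    Assumption: each sub-band b is modeled by N(mean[b], var[b]); a real
    recognizer would use HMM state distributions instead.
    """
    total = 0.0
    for frame in frames:
        for x, w, m, v in zip(frame, weights, mean, var):
            total += -0.5 * (math.log(2 * math.pi * v) + (w * x - m) ** 2 / v)
    return total


def ml_subband_weights(frames, mean):
    """Closed-form weight per sub-band that maximizes gaussian_loglik.

    Setting the derivative with respect to w_b to zero gives
    w_b = sum_t(x_tb * mean_b) / sum_t(x_tb ** 2); the variance cancels.
    """
    weights = []
    for b in range(len(mean)):
        num = sum(frame[b] * mean[b] for frame in frames)
        den = sum(frame[b] ** 2 for frame in frames)
        # Fall back to a neutral weight if the sub-band carries no energy.
        weights.append(num / den if den > 0 else 1.0)
    return weights


# Toy data: sub-band 0 is observed at twice the level the model expects.
frames = [[2.0, 1.0], [2.0, 1.0]]
mean = [1.0, 1.0]
var = [1.0, 1.0]
weights = ml_subband_weights(frames, mean)
```

On this toy data the estimated weight scales the mismatched sub-band back toward the model mean and leaves the matched sub-band unchanged, so the weighted features score at least as well under the model as the unweighted ones.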

Conference Title

EUROSPEECH 2003 - 8th European Conference on Speech Communication and Technology
