Maximum Likelihood Sub-band Weighting for Robust Speech Recognition
Author(s)
Nakamura, S
Paliwal, KK
Wang, R
Editor(s)
Andrzej Drygajlo (Technical Program Chair)
Location
Geneva, Switzerland
Abstract
Sub-band speech recognition approaches have been proposed for robust speech recognition: the full-band power spectrum is divided into several sub-bands, and the likelihoods or cepstral vectors of the sub-bands are then merged according to their reliability. Conventional sub-band approaches do not model correlations across the sub-bands, and the merging weights can only be set empirically or estimated during training, so they may not match the observed data. These methods also degrade performance on clean speech. We propose a novel sub-band approach in which the frequency sub-bands are multiplied by weighting factors and then merged. This approach takes sub-band dependence into account and proves to be more robust than both full-band and conventional sub-band approaches. Furthermore, the weighting factors can be obtained by maximum-likelihood estimation, so as to minimize the mismatch between the trained models and the observed features. Finally, we evaluate our methods on both the Aurora2 task and the Resource Management task and show consistent performance improvements on both.
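The idea of choosing sub-band weights by maximum likelihood can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single Gaussian with diagonal covariance as the trained model (so it ignores the sub-band dependence the paper addresses), omits any Jacobian term, and uses hypothetical names (`ml_subband_weights`, `log_likelihood`). Under those assumptions, maximizing the likelihood of the weighted features has a closed-form solution per band.

```python
import numpy as np

def ml_subband_weights(feats, mean):
    """Per-band weights maximizing the likelihood of (weights * feats).

    feats: (T, B) array of sub-band features for T frames and B bands.
    mean:  (B,) mean of an assumed diagonal-covariance Gaussian model.
    For a diagonal Gaussian the variance cancels, giving
    w_b = mean_b * sum_t x_tb / sum_t x_tb^2.
    """
    num = mean * feats.sum(axis=0)
    den = np.maximum((feats ** 2).sum(axis=0), 1e-12)  # guard against all-zero bands
    return num / den

def log_likelihood(feats, mean, var, weights):
    """Total log-likelihood of the weighted features under the diagonal Gaussian."""
    y = feats * weights
    return float(np.sum(-0.5 * (np.log(2 * np.pi * var) + (y - mean) ** 2 / var)))

# Toy usage: band 0 is attenuated relative to the model, band 1 matches it.
feats = np.array([[1.0, 2.0], [1.0, 2.0]])
mean = np.array([2.0, 2.0])
var = np.array([1.0, 1.0])
weights = ml_subband_weights(feats, mean)
```

In this toy case the estimated weight scales the attenuated band back toward the model mean and leaves the matching band unchanged. A realistic system would estimate weights against HMM state distributions rather than a single Gaussian.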
Conference Title
EUROSPEECH 2003 - 8th European Conference on Speech Communication and Technology