Quantization of Speech Features: Source Coding

Author(s)
So, Stephen
Paliwal, Kuldip K
Editor(s)
Tan, ZH
Lindberg, B

Date
2008
Abstract

In this chapter, we describe various schemes for quantizing speech features for use in distributed speech recognition (DSR) systems. We analyze the statistical properties of Mel frequency-warped cepstral coefficients (MFCCs) that are most relevant to quantization, namely their correlation and the shape of their probability density function, to determine which type of quantization scheme would quantize them most efficiently. We also determine empirically the relationship between mean squared error and recognition accuracy, to verify that quantization schemes that minimize mean squared error also improve recognition performance. Furthermore, we highlight the importance of noise robustness in DSR and describe the use of a perceptually weighted distance measure to enhance spectral peaks in vector quantization. Finally, we present experimental results for the quantization schemes in a DSR framework and compare their recognition performance.

Book Title
Automatic Speech Recognition on Mobile Devices and over Communication Networks
