Product of power spectrum and group delay function for speech recognition

Author(s)
Zhu, DL
Paliwal, KK
Editor(s)
Douglas O'Shaughnessy (General Chair)
Date
2004
Size
16160 bytes
429469 bytes
File type(s)
text/plain
application/pdf
Abstract

Mel-frequency cepstral coefficients (MFCCs) are the most widely used features for speech recognition. They are derived from the power spectrum of the speech signal. Recently, cepstral features derived from the modified group delay function (MGDF) have been studied by Murthy and Gadde [6] for speech recognition. In this paper, we propose to use the product of the power spectrum and the group delay function (GDF), and to derive the MFCCs from this product spectrum. The product spectrum combines information from the magnitude spectrum and the phase spectrum. The MFCCs of the MGDF are also investigated in this paper. Results show that the cepstral features derived from the power spectrum perform better than those derived from the MGDF, and that the product-spectrum-based features provide the best performance.
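
The product spectrum named in the abstract can be illustrated with a minimal sketch. It is not taken from the paper. It assumes the standard identity that the group delay of a frame x(n) equals (X_R Y_R + X_I Y_I) / |X|^2, where X is the Fourier transform of x(n) and Y is the Fourier transform of n x(n), so that multiplying by the power spectrum |X|^2 leaves X_R Y_R + X_I Y_I. The function name `product_spectrum`, the parameter `n_fft`, and the test signal are hypothetical choices made for illustration. Windowing, the mel filterbank, and the cepstral transform are omitted.

```python
import numpy as np

def product_spectrum(frame, n_fft=512):
    """Power spectrum times group delay for one frame (illustrative sketch).

    Assumes the identity Q = X_R * Y_R + X_I * Y_I, where X is the DFT of
    x(n) and Y is the DFT of n * x(n). No division by |X|^2 is needed.
    """
    frame = np.asarray(frame, dtype=float)
    n = np.arange(len(frame))            # sample indices 0, 1, 2, ...
    X = np.fft.rfft(frame, n_fft)        # spectrum of x(n)
    Y = np.fft.rfft(n * frame, n_fft)    # spectrum of n * x(n)
    return X.real * Y.real + X.imag * Y.imag

# Sanity check on a hypothetical signal: an impulse delayed by 5 samples has
# unit power spectrum and a constant group delay of 5 samples.
x = np.zeros(64)
x[5] = 1.0
q = product_spectrum(x)
```

For the delayed impulse, every bin of `q` should equal the delay, because the power spectrum is 1 and the group delay is constant. Computing the product directly avoids the division by |X|^2 that makes the plain group delay function unstable near spectral zeros.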

Conference Title
2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. I, Proceedings
Rights Statement
© 2004 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.