ASR on speech reconstructed from short-time Fourier phase spectra

Author(s)
Alsteris, LD
Paliwal, KK
Editor(s)

Soon Hyob Kim and Dae Hee Youn

Date
2004
Location

Jeju, Korea

Abstract

In our earlier papers, we measured the human intelligibility of speech stimuli reconstructed from either the short-time magnitude spectra (magnitude-only stimuli) or the short-time phase spectra (phase-only stimuli) of a speech stimulus. We demonstrated that, even for small analysis window durations of 20-40 ms (the range relevant to automatic speech recognition), the short-time phase spectrum can contribute to speech intelligibility as much as the short-time magnitude spectrum. In this paper, we perform automatic speech recognition on magnitude-only and phase-only stimuli. With an MFCC-based front-end, the recognition performance achieved for phase-only stimuli is much worse than that for magnitude-only stimuli at small analysis window durations, which is inconsistent with the corresponding human intelligibility results. This implies that the MFCC feature set does not capture all of the discriminating information present in the speech signal.

Conference Title

8th International Conference on Spoken Language Processing, ICSLP 2004
