Evaluation of the modified group delay feature for isolated word recognition
The results of our recent human perception experiments indicate that the short-time phase spectrum can contribute significantly to speech intelligibility over small window durations (i.e., 20-40 ms). This motivates us to investigate the use of the short-time phase spectrum to derive features for automatic speech recognition, which generally uses window durations in this same range for spectral analysis. In this paper, we specifically investigate the frequency derivative of the short-time phase spectrum (i.e., the group delay function, GDF) as a basis for feature extraction. We demonstrate, with some simple examples, the sensitivity of the GDF to noise, pitch epochs and windowing effects. We summarise the work by Yegnanarayana and Murthy on the modified GDF (MGDF), which serves to remedy the problems of the GDF. We then implement Murthy and Gadde's MGDF-based features (MODGDF) to determine whether they provide an improvement over the popular MFCC representation, either by themselves or in combination with MFCCs, on an isolated word recognition task.
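As a rough illustration of the quantity discussed above, the group delay of a single frame can be computed without phase unwrapping through the standard identity tau(k) = (X_R Y_R + X_I Y_I) / |X|^2, where X is the DFT of x[n] and Y is the DFT of n·x[n]. The sketch below is not the paper's implementation: the function names, the naive DFT, the `eps` floor and the sign convention (group delay as the negative frequency derivative of the phase) are all assumptions made for this example, and it covers only the plain GDF, not the MGDF or MODGDF features.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform (fine for short frames)."""
    n_points = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_points)
            for n in range(n_points))
        for k in range(n_points)
    ]


def group_delay(frame, eps=1e-12):
    """Group delay, in samples, at each DFT bin of one frame.

    Hypothetical helper: uses tau(k) = (X_R*Y_R + X_I*Y_I) / |X|^2,
    with X = DFT{x[n]} and Y = DFT{n * x[n]}.
    The eps floor only guards against division by zero (an assumption).
    """
    X = dft(frame)
    Y = dft([n * v for n, v in enumerate(frame)])
    return [
        (xk.real * yk.real + xk.imag * yk.imag) / max(abs(xk) ** 2, eps)
        for xk, yk in zip(X, Y)
    ]


# A unit impulse delayed by 3 samples has a constant group delay of 3.
delayed_impulse = [0, 0, 0, 1, 0, 0, 0, 0]
gd_impulse = group_delay(delayed_impulse)

# A frame whose spectrum nearly vanishes at bin 0 (|X(0)| = 0.01)
# produces a large spike there, because |X|^2 is in the denominator.
near_zero = [1.0, -0.99, 0, 0, 0, 0, 0, 0]
gd_spiky = group_delay(near_zero)
```

The division by |X|^2 is what makes the raw GDF spiky wherever the magnitude spectrum is close to zero, which is the kind of behaviour the modified GDF is intended to suppress.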
The 8th International Symposium on Signal Processing and Its Applications (ISSPA-2005)
Copyright 2005 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.