On the relative importance of the short-time magnitude and phase spectra towards speaker dependent information
In this work, we investigate the relative contribution of the short-time magnitude and phase spectra to speaker-dependent information. We also examine the effect of the analysis window function type. For this purpose, we conduct a human speaker verification experiment using phase-only and magnitude-only stimuli, which are constructed with the analysis-modification-synthesis procedure. The results of our pilot experiment show that the short-time magnitude spectrum contains little speaker information for a low dynamic range analysis window and a large amount of speaker information for a high dynamic range window. The short-time phase spectrum, on the other hand, contains speaker information predominantly for the low dynamic range analysis window. These preliminary results suggest that the short-time phase spectrum, commonly discarded in feature extraction for speaker verification, contains useful speaker information, and that further research into feature extraction from the short-time phase spectrum is warranted.
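The analysis-modification-synthesis procedure mentioned above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation; the abstract does not give the frame length, hop size, window functions, or the exact modification rules. The sketch assumes that a phase-only stimulus is obtained by setting every short-time magnitude to one, that a magnitude-only stimulus is obtained by replacing the phase with random values, and that synthesis uses windowed overlap-add. The function names `stft`, `istft`, `modify_spectrum`, and `make_stimulus` are hypothetical.

```python
import numpy as np


def stft(x, window, hop):
    """Analysis: split x into windowed frames and take the FFT of each."""
    n = len(window)
    n_frames = 1 + (len(x) - n) // hop
    frames = np.stack([x[i * hop:i * hop + n] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)


def istft(spec, window, hop, length):
    """Synthesis: inverse FFT of each frame, then windowed overlap-add."""
    n = len(window)
    frames = np.fft.irfft(spec, n=n, axis=1)
    y = np.zeros(length)
    norm = np.zeros(length)
    for i, frame in enumerate(frames):
        start = i * hop
        y[start:start + n] += frame * window
        norm[start:start + n] += window ** 2
    covered = norm > 1e-8  # avoid dividing by zero where no frame overlaps
    y[covered] /= norm[covered]
    return y


def modify_spectrum(spec, keep, rng):
    """Modification: retain only one of the two short-time spectra.

    Assumed rules (not stated in the abstract):
    - keep="phase":     magnitude is set to unity, phase is untouched
    - keep="magnitude": magnitude is untouched, phase is randomised
    """
    if keep == "phase":
        return np.exp(1j * np.angle(spec))
    if keep == "magnitude":
        random_phase = rng.uniform(-np.pi, np.pi, size=spec.shape)
        return np.abs(spec) * np.exp(1j * random_phase)
    raise ValueError("keep must be 'phase' or 'magnitude'")


def make_stimulus(x, window, hop, keep, rng):
    """Run analysis, modification and synthesis on a signal x."""
    spec = stft(x, window, hop)
    return istft(modify_spectrum(spec, keep, rng), window, hop, len(x))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(1600)  # stand-in for a speech signal
    # Illustrative window choices only: the abstract does not name the
    # low and high dynamic range windows that were used.
    rectangular = np.ones(256)
    hamming = np.hamming(256)
    phase_only = make_stimulus(x, rectangular, 64, "phase", rng)
    magnitude_only = make_stimulus(x, hamming, 64, "magnitude", rng)
    print(phase_only.shape, magnitude_only.shape)
```

Swapping the window passed to `make_stimulus` is how the window-type comparison described in the abstract could be explored with this sketch; the rectangular and Hamming windows here are placeholders for whichever windows the experiment actually used.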
Proceedings of the ISCA Tutorial and Research Workshop (ITRW)