Robust speech recognition using features based on zero crossings with peak amplitudes

Author(s)
Gajic, B
Paliwal, KK
Year published
2003
Abstract
The paper presents an extensive study of zero crossings with peak amplitudes (ZCPA) features, which have previously been shown to outperform both conventional and auditory-based features in the presence of additive noise. The study starts by optimizing the parameters involved in ZCPA feature computation, followed by a comparison of ZCPA and MFCC features on two recognition tasks in different background conditions. The main differences between the two feature types are identified, and their individual effects on ASR performance are evaluated. The importance of a proper choice of analysis frame lengths and filter bandwidths in ZCPA feature extraction is demonstrated. Furthermore, the use of dominant frequency information in ZCPA features is found to be a major reason for the increased robustness of ZCPA features compared to MFCC features.
Conference Title
2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS
Volume
1
Copyright Statement
© 2003 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.