Effect of speech and noise cross correlation on AMFCC speech recognition features
When designing noise-robust feature extraction algorithms for speech recognition, it is common to assume that the noise and the speech signal are uncorrelated. This assumption allows the cross-correlation terms to be ignored in the equations that describe the operation of these algorithms, making the mathematics more tractable. In this paper, we investigate the validity of this assumption in the context of the Autocorrelation Mel Frequency Cepstral Coefficient (AMFCC) feature extraction algorithm. To carry out the investigation, we designed a modified AMFCC algorithm that forces the cross terms in the noisy signal autocorrelation equation to be zero. We then compared the performance of the modified algorithm with that of the unmodified algorithm in recognition experiments performed using the AURORA II database. From these evaluations, we show that the assumption is reasonable in five of the six noise cases tested. The difference in recognition accuracy between the AMFCC and the modified AMFCC for these five noise types was less than 5%.
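The cross terms discussed above can be illustrated with a minimal sketch. For a noisy frame y = x + n, the autocorrelation expands as r_yy[k] = r_xx[k] + r_nn[k] + r_xn[k] + r_nx[k], and the uncorrelated assumption amounts to dropping the last two terms. The code below is not the paper's modified AMFCC algorithm, whose details are not given here. The function names `xcorr` and `decompose`, the sinusoid standing in for a speech frame, the white Gaussian noise, the frame length, and the lag range are all assumptions chosen for illustration.

```python
import numpy as np

def xcorr(a, b, max_lag):
    """Biased estimate r_ab[k] = (1/N) * sum_t a[t] * b[t + k] for k = 0..max_lag."""
    n = len(a)
    return np.array([np.dot(a[:n - k], b[k:]) / n for k in range(max_lag + 1)])

def decompose(speech, noise, max_lag):
    """Split the noisy-frame autocorrelation into speech, noise, and cross terms."""
    noisy = speech + noise
    r_yy = xcorr(noisy, noisy, max_lag)
    r_xx = xcorr(speech, speech, max_lag)
    r_nn = xcorr(noise, noise, max_lag)
    # Cross terms that the uncorrelated assumption sets to zero.
    cross = xcorr(speech, noise, max_lag) + xcorr(noise, speech, max_lag)
    return r_yy, r_xx, r_nn, cross

# Hypothetical test frame: a sinusoid as a stand-in for voiced speech,
# plus white Gaussian noise (both are illustrative assumptions).
rng = np.random.default_rng(0)
t = np.arange(256)
speech = np.sin(2 * np.pi * 0.05 * t)
noise = 0.5 * rng.standard_normal(256)

r_yy, r_xx, r_nn, cross = decompose(speech, noise, max_lag=20)

# Autocorrelation with the cross terms forced to zero.
r_yy_no_cross = r_yy - cross

# Size of the cross terms relative to the frame energy r_yy[0].
relative_cross = np.max(np.abs(cross)) / r_yy[0]
print(f"largest cross term relative to r_yy[0]: {relative_cross:.3f}")
```

On a finite frame the cross terms are generally small but not exactly zero, even when the two signals are statistically independent. Note that computing them this way requires separate access to the clean speech and the noise, which is available only in a controlled experiment.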
ICASSP 2007 IEEE International Conference on Acoustics, Speech, and Signal Processing
© 2007 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.