Improving objective intelligibility prediction by combining correlation and coherence based methods with a measure based on the negative distortion ratio
Author(s)
Schwerin, Belinda
Paliwal, Kuldip
Abstract
In this paper we propose a novel objective method for predicting the intelligibility of enhanced speech, based on the negative distortion ratio (NDR), that is, the amount of spectral power that has been removed relative to the original clean speech signal, likely because of a poor noise estimate during speech enhancement. Although negative spectral distortions can strongly affect the subjective intelligibility of processed speech, most objective measures in the literature do not account well for this type of distortion. The proposed method focuses on a very specific type of noise, so it is not intended to be used alone but in combination with other techniques, to jointly achieve better intelligibility prediction. To identify a suitable technique to combine it with, we also review a number of recently proposed methods based on correlation and coherence measures. These methods have already shown a high correlation with human recognition scores, as they effectively detect the nonlinearities frequently found in noise-suppressed speech. When these techniques are applied jointly with the proposed method, however, significantly higher correlations (above r = 0.9) are achieved.
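The idea of a negative distortion ratio can be illustrated with a minimal sketch. This is not the paper's formulation: the abstract gives no formula, so the definition used here (removed power divided by total clean power), the function name `negative_distortion_ratio`, and the frame-by-bin input layout are all assumptions made for illustration only.

```python
def negative_distortion_ratio(clean_power, enhanced_power):
    """Fraction of clean spectral power that is missing from the enhanced signal.

    Assumed (hypothetical) layout: each argument is a sequence of frames,
    and each frame is a sequence of non-negative power values per bin.
    """
    removed = 0.0
    total = 0.0
    for clean_frame, enh_frame in zip(clean_power, enhanced_power):
        for c, e in zip(clean_frame, enh_frame):
            # Count only bins where the enhanced power is below the clean
            # power, i.e. where spectral power was removed.
            removed += max(c - e, 0.0)
            total += c
    # Guard against an all-zero clean signal.
    return removed / total if total > 0 else 0.0


# Toy power spectra: two frames, two frequency bins each.
clean = [[4.0, 2.0], [1.0, 3.0]]
enhanced = [[3.0, 2.5], [0.0, 3.0]]
ndr = negative_distortion_ratio(clean, enhanced)
```

Bins where the enhanced signal has more power than the clean signal (positive distortion) contribute nothing to the numerator, which matches the abstract's focus on removed power only.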
Journal Title
Speech Communication
Volume
54
Issue
3
Subject
Signal processing
Cognitive and computational psychology
Linguistics
Communications engineering
Artificial intelligence