On The Use of Discrete Cosine Transform Polarity Spectrum in Speech Enhancement
Author(s)
Shi, S
Busch, Andrew
Paliwal, Kuldip
Fickenscher, Thomas
Location
Amsterdam, Netherlands
Abstract
This paper investigates the use of the short-time Discrete Cosine Transform (DCT) for speech enhancement. We denote the absolute values and the signs of the DCT spectral coefficients as the Absolute Spectrum (AS) and the Polarity Spectrum (PoS), respectively. We show theoretically that, under the constrained MMSE criterion, the noisy PoS is the best estimate of the clean PoS. To verify this experimentally, we analyse the effect of using the noisy PoS for signal resynthesis through objective and subjective measures. The results show that when the Instantaneous SNR (ISNR) is above 0 dB, a recovery of the original speech signal that is deemed perfect can be obtained by modifying only the DCT Absolute Spectrum. With the DFT, by contrast, an accurate estimate of the Phase Spectrum (PhS) might be required to achieve the same improvement in perceived speech quality. When perceived quality is measured against the Segmental SNR (SSNR), the PoS preserves speech quality better than the PhS for the same level of global distortion. Overall, the noisy PoS can be used as an estimate of the clean PoS without perceivable degradation in speech quality only if the ISNR of the noisy speech signal is above 0 dB or the SSNR is above 10.5 dB.
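The AS/PoS decomposition described in the abstract can be illustrated with a minimal single-frame sketch. Everything below is an assumption made for illustration and is not taken from the paper: the frame length, the sine test signal, the noise level, the helper names (`dct_matrix`, `split_spectrum`, `resynthesise`), and the convention of assigning polarity +1 to a zero coefficient. It uses an orthonormal DCT-II on one frame and omits the windowing and overlap-add that a short-time analysis would normally involve.

```python
import math
import random


def dct_matrix(n):
    """Orthonormal DCT-II matrix; its inverse is its transpose."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (i + 0.5) * k / n)
                     for i in range(n)])
    return rows


def dct(frame, c):
    """Forward transform: project the frame onto each DCT basis row."""
    n = len(frame)
    return [sum(c[k][i] * frame[i] for i in range(n)) for k in range(n)]


def idct(coeffs, c):
    """Inverse transform using the transpose of the orthonormal matrix."""
    n = len(coeffs)
    return [sum(c[k][i] * coeffs[k] for k in range(n)) for i in range(n)]


def split_spectrum(coeffs):
    """Return (Absolute Spectrum, Polarity Spectrum) of DCT coefficients."""
    absolute = [abs(x) for x in coeffs]
    # Assumption: a coefficient of exactly zero is given polarity +1.
    polarity = [1.0 if x >= 0 else -1.0 for x in coeffs]
    return absolute, polarity


def resynthesise(absolute, polarity, c):
    """Recombine an AS and a PoS, then invert the DCT."""
    return idct([a * p for a, p in zip(absolute, polarity)], c)


random.seed(0)
N = 32                                   # illustrative frame length
C = dct_matrix(N)
clean = [math.sin(2 * math.pi * 3 * i / N) for i in range(N)]
noisy = [s + 0.1 * random.gauss(0, 1) for s in clean]

clean_as, clean_pos = split_spectrum(dct(clean, C))
_, noisy_pos = split_spectrum(dct(noisy, C))

# AS and PoS of the same frame reproduce that frame.
roundtrip = resynthesise(clean_as, clean_pos, C)

# Clean AS (standing in for an ideal AS estimate) with the noisy PoS.
hybrid = resynthesise(clean_as, noisy_pos, C)

flipped = sum(1 for a, b in zip(clean_pos, noisy_pos) if a != b)
error = sum((h - s) ** 2 for h, s in zip(hybrid, clean)) / N
print("coefficients with flipped polarity:", flipped)
print("MSE of clean-AS + noisy-PoS resynthesis:", error)
```

Because the transform is orthonormal, the hybrid frame has the same energy as the clean frame; any error comes solely from coefficients whose polarity was flipped by the noise, which tend to be the low-magnitude ones.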
Conference Title
2020 28th European Signal Processing Conference (EUSIPCO)
Subject
Electrical engineering
Electronics, sensors and digital hardware
Speech enhancement
Discrete cosine transform (DCT)
Just noticeable difference (JND)
Citation
Shi, S; Busch, A; Paliwal, K; Fickenscher, T, On The Use of Discrete Cosine Transform Polarity Spectrum in Speech Enhancement, 2020 28th European Signal Processing Conference (EUSIPCO), 2020, pp. 421-425