Improving speech enhancement by focusing on smaller values using relative loss

Author(s)
Li, H
Xu, Y
Ke, D
Su, K
Date
2020
Abstract

The task of single‐channel speech enhancement is to restore clean speech from noisy speech. Recently, speech enhancement has been greatly improved by the introduction of deep learning. Previous work has shown that using the ideal ratio mask or the phase‐sensitive mask as an intermediate target for recovering clean speech can yield better performance. In this case, the mean square error is usually selected as the loss function. However, after conducting experiments, the authors find that the mean square error has a problem: it considers absolute error values, meaning that the gradients of the network depend on absolute differences between estimated values and true values, so the points in magnitude spectra with smaller values contribute little to the gradients. To solve this problem, they propose relative loss, which pays more attention to relative differences between magnitude spectra than to absolute differences, and is more in accordance with human sensory characteristics. The perceptual evaluation of speech quality, the short‐time objective intelligibility, the signal‐to‐distortion ratio, and the segmental signal‐to‐noise ratio are used to evaluate the performance of the relative loss. Experimental results show that it can greatly improve speech enhancement by focusing on smaller values.
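The gradient argument in the abstract can be illustrated with a small numerical sketch. This record does not give the paper's exact definition of relative loss, so the `relative_loss` function below is an assumed, illustrative form (squared error normalised by the reference magnitude) and should not be read as the authors' formulation. The function names, the `eps` stabiliser, and the two-bin toy spectrum are likewise hypothetical.

```python
import numpy as np


def mse_loss(est, ref):
    """Mean square error between estimated and reference magnitudes."""
    return np.mean((est - ref) ** 2)


def mse_grad(est, ref):
    """Gradient of mse_loss with respect to est."""
    return 2.0 * (est - ref) / est.size


def relative_loss(est, ref, eps=1e-8):
    """ASSUMED illustrative form: squared error relative to the reference.

    Not taken from the paper; eps is a hypothetical stabiliser that
    avoids division by zero in silent bins.
    """
    return np.mean(((est - ref) / (ref + eps)) ** 2)


def relative_grad(est, ref, eps=1e-8):
    """Gradient of the assumed relative_loss with respect to est."""
    return 2.0 * (est - ref) / ((ref + eps) ** 2) / est.size


# Toy magnitude spectrum: one large-valued bin and one small-valued bin.
ref = np.array([10.0, 0.1])
# The estimate is 10% too high in both bins (same relative error).
est = ref * 1.1

g_mse = mse_grad(est, ref)
g_rel = relative_grad(est, ref)

print("MSE gradient per bin:     ", g_mse)
print("Relative gradient per bin:", g_rel)
```

Under these assumptions, both bins carry the same 10% relative error, yet the mean square error gradient for the small-valued bin is about 100 times smaller than for the large-valued bin. With the assumed relative form, the small-valued bin is no longer suppressed. Because this particular normalisation divides the gradient by the squared reference magnitude, it in fact weights the small bin more heavily.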

Journal Title

IET Signal Processing

Volume

14

Issue

6

Subject

Electrical engineering

Electronics, sensors and digital hardware

Citation

Li, H; Xu, Y; Ke, D; Su, K, Improving speech enhancement by focusing on smaller values using relative loss, IET Signal Processing, 2020, 14 (6), pp. 374-384
