    Improving speech enhancement by focusing on smaller values using relative loss

    Author(s)
    Li, H
    Xu, Y
    Ke, D
    Su, K
    Griffith University Author(s)
    Su, Kaile
    Year published
    2020
    Abstract
    The task of single‐channel speech enhancement is to restore clean speech from noisy speech. Speech enhancement has recently improved greatly with the introduction of deep learning. Previous work has shown that using the ideal ratio mask or the phase‐sensitive mask as an intermediate target for recovering clean speech can yield better performance. In this setting, the mean square error is usually selected as the loss function. However, after conducting experiments, the authors find that the mean square error has a problem: it considers absolute error values, so the gradients of the network depend on absolute differences between estimated values and true values, and the points in magnitude spectra with smaller values contribute little to the gradients. To solve this problem, they propose relative loss, which pays more attention to relative differences between magnitude spectra than to absolute differences and is more in accordance with human sensory characteristics. The perceptual evaluation of speech quality, the short‐time objective intelligibility, the signal‐to‐distortion ratio, and the segmental signal‐to‐noise ratio are used to evaluate the performance of the relative loss. Experimental results show that it can greatly improve speech enhancement by focusing on smaller values.
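    The abstract does not give the formula for the relative loss, so the following is only a minimal sketch of the general idea, not the authors' definition. It assumes a hypothetical relative loss in which each squared error is divided by the squared reference magnitude (plus a small `eps` for stability), and compares its gradients with those of the mean square error on two illustrative magnitude-spectrum bins. All function names and values are made up for the illustration.

```python
def mse_grad(est, ref):
    """Gradient of the mean square error with respect to each estimate."""
    n = len(ref)
    return [2.0 * (e - r) / n for e, r in zip(est, ref)]


def relative_loss(est, ref, eps=1e-8):
    """Hypothetical relative loss: squared error scaled by the reference magnitude.

    This is an assumed form for illustration, not the formula from the paper.
    """
    n = len(ref)
    return sum(((e - r) / (abs(r) + eps)) ** 2 for e, r in zip(est, ref)) / n


def relative_grad(est, ref, eps=1e-8):
    """Gradient of the hypothetical relative loss with respect to each estimate."""
    n = len(ref)
    return [2.0 * (e - r) / ((abs(r) + eps) ** 2 * n) for e, r in zip(est, ref)]


# Two spectral bins: a loud one (10.0) and a quiet one (0.1),
# both over-estimated by the same relative amount (20%).
ref = [10.0, 0.1]
est = [12.0, 0.12]

g_mse = mse_grad(est, ref)       # loud bin dominates the gradient
g_rel = relative_grad(est, ref)  # quiet bin is no longer drowned out

print("MSE gradients:     ", g_mse)
print("Relative gradients:", g_rel)
print("Relative loss:     ", relative_loss(est, ref))
```

    Under the mean square error, the gradient for the loud bin is about a hundred times larger than for the quiet bin even though both estimates are equally wrong in relative terms. Under the assumed relative form, both bins contribute equally to the loss value, and the quiet bin receives the larger gradient.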
    Journal Title
    IET Signal Processing
    Volume
    14
    Issue
    6
    DOI
    https://doi.org/10.1049/iet-spr.2019.0290
    Subject
    Electrical and Electronic Engineering
    Publication URI
    http://hdl.handle.net/10072/400964
    Collection
    • Journal articles
