• myGriffith
    • Staff portal
    • Contact Us⌄
      • Future student enquiries 1800 677 728
      • Current student enquiries 1800 154 055
      • International enquiries +61 7 3735 6425
      • General enquiries 07 3735 7111
      • Online enquiries
      • Staff phonebook
    View Item 
    •   Home
    • Griffith Research Online
    • Journal articles
    • View Item
    • Home
    • Griffith Research Online
    • Journal articles
    • View Item
    JavaScript is disabled for your browser. Some features of this site may not work without it.

    Browse

  • All of Griffith Research Online
    • Communities & Collections
    • Authors
    • By Issue Date
    • Titles
  • This Collection
    • Authors
    • By Issue Date
    • Titles
  • Statistics

  • Most Popular Items
  • Statistics by Country
  • Most Popular Authors
  • Support

  • Contact us
  • FAQs
  • Admin login

  • Login
  • μ-law SGAN for generating spectra with more details in speech enhancement

    Author(s)
    Li, Hongfeng
    Xu, Yanyan
    Ke, Dengfeng
    Su, Kaile
    Griffith University Author(s)
    Su, Kaile
    Year published
    2020
    Metadata
    Show full item record
    Abstract
    The goal of monaural speech enhancement is to separate clean speech from noisy speech. Recently, many studies have employed generative adversarial networks (GAN) to deal with monaural speech enhancement tasks. When using generative adversarial networks for this task, the output of the generator is a speech waveform or a spectrum, such as a magnitude spectrum, a mel-spectrum or a complex-valued spectrum. The spectra generated by current speech enhancement methods in the time-frequency domain usually lack details, such as consonants and harmonics with low energy. In this paper, we propose a new type of adversarial training ...
    View more >
    The goal of monaural speech enhancement is to separate clean speech from noisy speech. Recently, many studies have employed generative adversarial networks (GAN) to deal with monaural speech enhancement tasks. When using generative adversarial networks for this task, the output of the generator is a speech waveform or a spectrum, such as a magnitude spectrum, a mel-spectrum or a complex-valued spectrum. The spectra generated by current speech enhancement methods in the time-frequency domain usually lack details, such as consonants and harmonics with low energy. In this paper, we propose a new type of adversarial training framework for spectrum generation, named μ-law spectrum generative adversarial networks (μ-law SGAN). We introduce a trainable μ-law spectrum compression layer (USCL) into the proposed discriminator to compress the dynamic range of the spectrum. As a result, the compressed spectrum can display more detailed information. In addition, we use the spectrum transformed by USCL to regularize the generator's training, so that the generator can pay more attention to the details of the spectrum. Experimental results on the open dataset Voice Bank + DEMAND show that μ-law SGAN is an effective generative adversarial architecture for speech enhancement. Moreover, visual spectrogram analysis suggests that μ-law SGAN pays more attention to the enhancement of low energy harmonics and consonants.
    View less >
    Journal Title
    Neural Networks
    Volume
    136
    DOI
    https://doi.org/10.1016/j.neunet.2020.12.017
    Subject
    Nanotechnology
    -law SGAN
    Deep neural networks
    Generative adversarial networks
    Signal processing
    Speech enhancement
    Publication URI
    http://hdl.handle.net/10072/400963
    Collection
    • Journal articles

    Footer

    Disclaimer

    • Privacy policy
    • Copyright matters
    • CRICOS Provider - 00233E

    Tagline

    • Gold Coast
    • Logan
    • Brisbane - Queensland, Australia
    First Peoples of Australia
    • Aboriginal
    • Torres Strait Islander