
dc.contributor.author: Li, Hongfeng
dc.contributor.author: Xu, Yanyan
dc.contributor.author: Ke, Dengfeng
dc.contributor.author: Su, Kaile
dc.date.accessioned: 2021-01-12T23:56:25Z
dc.date.available: 2021-01-12T23:56:25Z
dc.date.issued: 2020
dc.identifier.issn: 0893-6080
dc.identifier.doi: 10.1016/j.neunet.2020.12.017
dc.identifier.uri: http://hdl.handle.net/10072/400963
dc.description.abstract: The goal of monaural speech enhancement is to separate clean speech from noisy speech. Recently, many studies have employed generative adversarial networks (GAN) to deal with monaural speech enhancement tasks. When using generative adversarial networks for this task, the output of the generator is a speech waveform or a spectrum, such as a magnitude spectrum, a mel-spectrum or a complex-valued spectrum. The spectra generated by current speech enhancement methods in the time-frequency domain usually lack details, such as consonants and harmonics with low energy. In this paper, we propose a new type of adversarial training framework for spectrum generation, named μ-law spectrum generative adversarial networks (μ-law SGAN). We introduce a trainable μ-law spectrum compression layer (USCL) into the proposed discriminator to compress the dynamic range of the spectrum. As a result, the compressed spectrum can display more detailed information. In addition, we use the spectrum transformed by USCL to regularize the generator's training, so that the generator can pay more attention to the details of the spectrum. Experimental results on the open dataset Voice Bank + DEMAND show that μ-law SGAN is an effective generative adversarial architecture for speech enhancement. Moreover, visual spectrogram analysis suggests that μ-law SGAN pays more attention to the enhancement of low energy harmonics and consonants.
dc.description.peerreviewed: Yes
dc.language: English
dc.language.iso: eng
dc.publisher: Elsevier
dc.relation.ispartofpagefrom: 17
dc.relation.ispartofpageto: 27
dc.relation.ispartofjournal: Neural Networks
dc.relation.ispartofvolume: 136
dc.subject.fieldofresearch: Nanotechnology
dc.subject.fieldofresearchcode: 4018
dc.subject.keywords: μ-law SGAN
dc.subject.keywords: Deep neural networks
dc.subject.keywords: Generative adversarial networks
dc.subject.keywords: Signal processing
dc.subject.keywords: Speech enhancement
dc.title: μ-law SGAN for generating spectra with more details in speech enhancement
dc.type: Journal article
dc.type.description: C1 - Articles
dcterms.bibliographicCitation: Li, H; Xu, Y; Ke, D; Su, K, μ-law SGAN for generating spectra with more details in speech enhancement, Neural Networks, 2020, 136, pp. 17-27
dcterms.dateAccepted: 2020-12-17
dc.date.updated: 2021-01-12T23:48:56Z
gro.hasfulltext: No Full Text
gro.griffith.author: Su, Kaile
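
Illustrative note on the abstract: the claim that compressing the dynamic range lets a spectrum "display more detailed information" can be seen with the classic μ-law companding curve, F(x) = ln(1 + μx) / ln(1 + μ) for x in [0, 1]. The sketch below is not the paper's USCL. It assumes a fixed μ = 255 (the telephony default) rather than a trainable one, peak normalization of the input, and four hypothetical magnitude bins; the function names are invented for illustration.

```python
import math


def mu_law_compress(magnitudes, mu=255.0):
    """Apply the classic mu-law curve to non-negative magnitudes.

    Assumption: values are normalized by their peak so they lie in [0, 1].
    In the paper mu is described as trainable; here it is a fixed constant.
    """
    peak = max(magnitudes)
    if peak <= 0.0:
        return [0.0 for _ in magnitudes]
    scale = math.log1p(mu)
    return [math.log1p(mu * (m / peak)) / scale for m in magnitudes]


def dynamic_range_db(values):
    """Ratio of the largest to the smallest positive value, in decibels."""
    positive = [v for v in values if v > 0.0]
    return 20.0 * math.log10(max(positive) / min(positive))


# Hypothetical magnitude bins: one strong harmonic and three weaker components.
spectrum = [1.0, 0.1, 0.01, 0.001]
compressed = mu_law_compress(spectrum)

print("range before:", round(dynamic_range_db(spectrum), 1), "dB")
print("range after: ", round(dynamic_range_db(compressed), 1), "dB")
```

The ordering of the bins is preserved, but the weakest bin sits much closer to the strongest one after compression, which is the sense in which low-energy components become easier to see.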


This item appears in the following Collection(s)

  • Journal articles
    Contains articles published by Griffith authors in scholarly journals.
