
dc.contributor.convenor | Soon Hyob Kim and Dae Hee Youn | en_AU
dc.contributor.author | Paliwal, Kuldip | en_US
dc.contributor.author | So, Stephen | en_US
dc.contributor.editor | Soon Hyob Kim and Dae Hee Youn | en_US
dc.date.accessioned | 2017-04-04T16:59:39Z
dc.date.available | 2017-04-04T16:59:39Z
dc.date.issued | 2004 | en_US
dc.date.modified | 2009-09-22T05:48:56Z
dc.identifier.doi | http://www.isca-speech.org/archive/interspeech_2004/ | en_AU
dc.identifier.uri | http://hdl.handle.net/10072/2117
dc.description.abstract | In this paper, we propose the use of the multi-frame Gaussian mixture model-based block quantizer for the coding of Mel frequency-warped cepstral coefficient (MFCC) features in distributed speech recognition (DSR) applications. This coding scheme exploits intraframe correlation via the Karhunen-Loeve transform (KLT) and interframe correlation via the joint processing of adjacent frames together with the computational simplicity of scalar quantization. The proposed coder is bit-rate scalable, which means that the bitrate can be adjusted without the need for re-training of the quantizers. Static parameters such as the probability density function (PDF) model and KLT orthogonal matrices are stored at the encoder and decoder and bit allocations are calculated 'on-the-fly' without intensive processing. This coding scheme is evaluated in this paper on the Aurora-2 database in a DSR framework. It is shown that this coding scheme achieves high recognition performance at lower bitrates, with a word error rate (WER) of 2.5% at 800 bps, which is less than 1% degradation from the baseline word recognition accuracy, and graceful degradation down to a WER of 7% at 300 bps. | en_US
dc.description.peerreviewed | Yes | en_US
dc.description.publicationstatus | Yes | en_AU
dc.language | English | en_US
dc.language.iso | en_AU
dc.publisher | Sunjin Printing Co. | en_US
dc.publisher.place | Korea | en_US
dc.publisher.uri | http://www.isca-speech.org/index.php | en_AU
dc.relation.ispartof | 0 | en_AU
dc.relation.ispartofconferencename | 8th International Conference on Spoken Language Processing (ICSLP-2004) | en_US
dc.relation.ispartofconferencetitle | Interspeech 2004 (ICSLP) | en_US
dc.relation.ispartofdatefrom | 2004-10-04 | en_US
dc.relation.ispartofdateto | 2004-10-08 | en_US
dc.relation.ispartoflocation | Jeju, Korea | en_US
dc.subject.fieldofresearchcode | 280204 | en_US
dc.subject.fieldofresearchcode | 280206 | en_US
dc.title | Scalable distributed speech recognition using multi-frame GMM-based block quantization | en_US
dc.type | Conference output | en_US
dc.type.description | E1 - Conference Publications (HERDC) | en_US
dc.type.code | E - Conference Publications | en_US
gro.faculty | Griffith Sciences, Griffith School of Engineering | en_US
gro.date.issued | 2004
gro.hasfulltext | No Full Text


Files in this item

There are no files associated with this item.

This item appears in the following Collection(s)

  • Conference outputs
    Contains papers delivered by Griffith authors at national and international conferences.
