Show simple item record

dc.contributor.author: Wang, S. (en_US)
dc.contributor.author: Liew, Alan Wee-Chung (en_US)
dc.contributor.author: Lau, W. (en_US)
dc.contributor.author: Leung, S. (en_US)
dc.contributor.editor: Keshab K. Parhi (en_US)
dc.date.accessioned: 2017-05-03T15:20:44Z
dc.date.available: 2017-05-03T15:20:44Z
dc.date.issued: 2008 (en_US)
dc.date.modified: 2011-10-18T07:26:36Z
dc.identifier.issn: 1051-8215 (en_US)
dc.identifier.doi: 10.1109/TCSVT.2008.2004924 (en_AU)
dc.identifier.uri: http://hdl.handle.net/10072/23604
dc.description.abstract: It is well known that visual cues of lip movement contain important speech-relevant information. This paper presents an automatic lipreading system for small-vocabulary speech recognition tasks. Using the lip segmentation and modeling techniques we developed earlier, we obtain a visual feature vector composed of outer and inner mouth features from the lip image sequence for recognition. A spline representation is employed to transform the discrete-time sampled features from the video frames into the continuous domain. The spline coefficients in the same word class are constrained to have similar expressions and are estimated from the training data by the EM algorithm. For the multiple-speaker/speaker-independent recognition task, an adaptive multimodel approach is proposed to handle the variations caused by various talking styles. After building the appropriate word models from the spline coefficients, a maximum likelihood classification approach is taken for the recognition. Lip image sequences of the English digits 0 to 9 have been collected for the recognition test. Two widely used classification methods, HMM and RDA, have been adopted for comparison, and the results demonstrate that the proposed algorithm delivers the best performance among these methods. (en_US)
dc.description.peerreviewed: Yes (en_US)
dc.description.publicationstatus: Yes (en_AU)
dc.format.extent: 215155 bytes
dc.format.mimetype: application/pdf
dc.language: English (en_US)
dc.language.iso: en_AU
dc.publisher: IEEE (en_US)
dc.publisher.place: United States (en_US)
dc.relation.ispartofstudentpublication: N (en_AU)
dc.relation.ispartofpagefrom: 1760 (en_US)
dc.relation.ispartofpageto: 1765 (en_US)
dc.relation.ispartofissue: 12 (en_US)
dc.relation.ispartofjournal: IEEE Transactions on Circuits and Systems for Video Technology (en_US)
dc.relation.ispartofvolume: 18 (en_US)
dc.rights.retention: Y (en_AU)
dc.subject.fieldofresearch: Computer Vision (en_US)
dc.subject.fieldofresearch: Pattern Recognition and Data Mining (en_US)
dc.subject.fieldofresearchcode: 080104 (en_US)
dc.subject.fieldofresearchcode: 080109 (en_US)
dc.title: An Automatic Lipreading System for Spoken Digits With Limited Training Data (en_US)
dc.type: Journal article (en_US)
dc.type.description: C1 - Peer Reviewed (HERDC) (en_US)
dc.type.code: C - Journal Articles (en_US)
gro.rights.copyright: Copyright 2008 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. (en_AU)
gro.date.issued: 2008
gro.hasfulltext: Full Text
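The abstract describes a pipeline of fitting continuous-domain coefficients to discrete-time lip features and then classifying by maximum likelihood. The sketch below is only an illustration of that general idea under simplifying assumptions: it uses a low-order polynomial fit in place of the paper's spline-plus-EM word models, a diagonal Gaussian per word class in place of the adaptive multimodel approach, and all function names are invented here, not taken from the paper.

```python
import numpy as np

# Illustrative sketch only (not the authors' implementation): map each
# variable-length feature trajectory to a fixed-length coefficient vector,
# then classify by maximum likelihood under a per-class diagonal Gaussian.

def poly_coeffs(seq, degree=3):
    """Fit a polynomial to a 1-D feature trajectory sampled at video frames.

    Frame times are normalised to [0, 1] so sequences of different lengths
    land in the same coefficient space (a stand-in for the spline fit).
    """
    t = np.linspace(0.0, 1.0, len(seq))
    return np.polyfit(t, seq, degree)

def fit_class_models(training):
    """training: {word: [trajectory, ...]} -> {word: (mean, var)} over coeffs."""
    models = {}
    for word, seqs in training.items():
        coeffs = np.array([poly_coeffs(s) for s in seqs])
        # Small variance floor keeps the likelihood well defined.
        models[word] = (coeffs.mean(axis=0), coeffs.var(axis=0) + 1e-6)
    return models

def classify(seq, models):
    """Return the word whose Gaussian assigns the test coefficients the
    highest log-likelihood (diagonal covariance, constants dropped)."""
    c = poly_coeffs(seq)
    def loglik(model):
        mean, var = model
        return -0.5 * np.sum((c - mean) ** 2 / var + np.log(var))
    return max(models, key=lambda w: loglik(models[w]))
```

With synthetic one-dimensional trajectories (e.g. a sine-shaped opening/closing curve versus a linear ramp), `fit_class_models` followed by `classify` recovers the correct class; the real system would use the multi-dimensional outer/inner mouth feature vectors described in the abstract.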


This item appears in the following Collection(s)

  • Journal articles
    Contains articles published by Griffith authors in scholarly journals.
