Automatic Lipreading with Limited Training Data

    View/Open
    43413_1.pdf (117.6Kb)
    Author(s)
    Wang, SL
    Lau, WH
    Liew, AWC
    Leung, SH
    Griffith University Author(s)
    Liew, Alan Wee-Chung
    Year published
    2006
    Abstract
    Speech recognition based solely on visual information, such as the lip shape and its movement, is referred to as lipreading. This paper presents an automatic lipreading technique for speaker-dependent (SD) and speaker-independent (SI) speech recognition tasks. Since the visual features are derived at the frame rate of the video sequence, a spline representation is employed to translate the discrete-time sampled visual features into the continuous domain. The spline coefficients within the same word class are constrained to have a similar form and can be estimated from the training data by the EM algorithm. In addition, an adaptive multi-model approach is proposed to overcome the variation caused by different speaking styles in the speaker-independent recognition task. Experiments are carried out to recognize the ten English digits; accuracies of 96% for speaker-dependent recognition and 88% for speaker-independent recognition are achieved, which shows the superiority of our approach compared with the other classifiers investigated.
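    To make the spline-representation step concrete, the sketch below (Python with NumPy/SciPy; not the authors' code) fits a cubic B-spline to a single per-frame feature track and resamples it at a fixed number of points on a normalised time axis, which is one common way to move frame-rate features into a continuous-time representation. The function name spline_represent and the feature data are illustrative assumptions, and the EM-based, class-constrained coefficient estimation from the paper is deliberately omitted.

```python
# A minimal sketch of the spline-representation idea from the abstract:
# visual features sampled at the video frame rate are fitted with a
# cubic B-spline so utterances of different lengths can be compared in
# a continuous domain. The feature track here is synthetic, and the
# EM-estimated class constraint on the coefficients is not reproduced.
import numpy as np
from scipy.interpolate import splrep, splev

def spline_represent(features, n_samples=50):
    """Fit a cubic B-spline to a 1-D frame-rate feature track and
    resample it at n_samples uniform points on the interval [0, 1]."""
    t = np.linspace(0.0, 1.0, len(features))   # normalised frame times
    tck = splrep(t, features, k=3, s=0.0)      # interpolating cubic spline
    return splev(np.linspace(0.0, 1.0, n_samples), tck)

# Example: a synthetic lip-width track from a 23-frame utterance,
# mapped onto a fixed-length continuous representation.
frames = np.sin(np.linspace(0, np.pi, 23)) + 0.05 * np.random.randn(23)
fixed = spline_represent(frames)
print(fixed.shape)  # (50,)
```

    Resampling every utterance to the same number of points also gives fixed-length feature vectors, which is what lets tracks from words spoken at different speeds be compared directly.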
    Conference Title
    18th International Conference on Pattern Recognition, Vol. 3, Proceedings
    Volume
    3
    DOI
    https://doi.org/10.1109/ICPR.2006.301
    Copyright Statement
    © 2006 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
    Publication URI
    http://hdl.handle.net/10072/24387
    Collection
    • Conference outputs
