    Visual speaker identification and authentication by joint spatiotemporal sparse coding and hierarchical pooling

    Author(s)
    Lai, Jun-Yao
    Wang, Shi-Lin
    Liew, Alan Wee-Chung
    Shi, Xing-Jian
    Griffith University Author(s)
    Liew, Alan Wee-Chung
    Year published
    2016
    Abstract
    Recent research shows that lip shape and lip movement contain abundant identity-related information and can be used as a new kind of biometric for speaker identification or authentication. In this paper, we propose a new lip feature representation for lip biometrics that describes both the static and the dynamic characteristics of a lip sequence. The new representation captures both the physiological and behavioral aspects of the lip and is robust against variations caused by differences in speaker position and pose. In our approach, a lip sequence is first divided into several subsequences along the temporal dimension. For each subsequence, sparse coding (SC) is adopted to characterize the minutiae of the lip region and its movement in small spatiotemporal cells. Max-pooling based on a hierarchical spatiotemporal structure is then performed on the SC codes to generate the final feature for each subsequence. Finally, the entire lip sequence is represented by the set of features, one per subsequence. Experiments are carried out on a dataset with 40 speakers, and the proposed feature is compared with three state-of-the-art approaches. The experimental results show that the proposed feature achieves high identification accuracy (99.96%) and very low authentication error (a Half Total Error Rate (HTER) of 0.46%), and outperforms the other approaches investigated. Moreover, even with random variations caused by differences in speaker position and pose, the proposed feature still provides good identification (an accuracy of 99.18%) and authentication results (an HTER of 2.34%), with much lower performance degradation than the other approaches investigated. Finally, even when there is only one training sample per speaker, the proposed feature still achieves high discriminative power (an accuracy of 98.39% and an HTER of 2.62%).
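    The pooling stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the sparse codes for each spatiotemporal cell have already been computed and are stored as `codes[t][y][x]`, and the function names (`split_subsequences`, `hierarchical_max_pool`, `hter`), the pyramid levels, and the use of a plain element-wise maximum are all assumptions made for illustration. The `hter` helper uses the standard definition of Half Total Error Rate as the mean of the false acceptance and false rejection rates.

```python
from typing import List, Sequence


def split_subsequences(frames: Sequence, n_parts: int) -> List[Sequence]:
    """Split a frame sequence into n_parts contiguous, near-equal chunks."""
    n = len(frames)
    return [frames[i * n // n_parts:(i + 1) * n // n_parts]
            for i in range(n_parts)]


def max_pool(vectors: List[List[float]]) -> List[float]:
    """Element-wise maximum over a list of equal-length code vectors."""
    return [max(column) for column in zip(*vectors)]


def hierarchical_max_pool(codes, levels=(1, 2)) -> List[float]:
    """Pool cell codes over a spatiotemporal pyramid (hypothetical layout).

    codes[t][y][x] is the sparse code (list of floats) of one cell.
    For each level k, every axis is cut into k blocks, the codes in each
    block are max-pooled, and the pooled vectors are concatenated.
    Assumes every axis has at least max(levels) cells.
    """
    T, H, W = len(codes), len(codes[0]), len(codes[0][0])
    feature: List[float] = []
    for k in levels:
        for bt in range(k):
            for by in range(k):
                for bx in range(k):
                    block = [codes[t][y][x]
                             for t in range(bt * T // k, (bt + 1) * T // k)
                             for y in range(by * H // k, (by + 1) * H // k)
                             for x in range(bx * W // k, (bx + 1) * W // k)]
                    feature.extend(max_pool(block))
    return feature


def hter(far: float, frr: float) -> float:
    """Half Total Error Rate: mean of false acceptance and false rejection rates."""
    return (far + frr) / 2.0
```

    With levels `(1, 2)` and code vectors of length d, the pooled feature for one subsequence has (1 + 8) * d entries: one vector pooled over the whole subsequence plus one for each of the 2 x 2 x 2 blocks.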
    Journal Title
    Information Sciences
    Volume
    373
    DOI
    https://doi.org/10.1016/j.ins.2016.09.015
    Subject
    Mathematical sciences
    Engineering
    Publication URI
    http://hdl.handle.net/10072/143050
    Collection
    • Journal articles
