Sparse Coding Based Lip Texture Representation For Visual Speaker Identification
MetadataShow full item record
Recent research has shown that the speaker's lip shape and movement contain rich identity-related information and can be adopted for speaker identification and authentication. Among all the static lip features, the lip texture (intensity variation inside the outer lip contour) is of high discriminative power to differentiate various speakers. However, the existing lip texture feature representations cannot describe the texture information adequately and provide unsatisfactory identification results. In this paper, a sparse representation of the lip texture is proposed and a corresponding visual speaker identification scheme is presented. In the training stage, a sparse dictionary is built based on the texture samples for each speaker. In the testing stage, for any lip image investigated, the lip texture information is extracted and the reconstruction errors using all the dictionaries for every speaker are calculated. The lip image is identified to the speaker with the minimum reconstruction error. The experimental results show that the proposed sparse coding based scheme can achieve much better identification accuracy (91.37% for isolate image and 98.21% for image sequence) compared with several state-of-the-art methods when considering the lip texture information only.
Proceedings of the 19th International Conference on Digital Signal Processing
© 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.