Preventing DeepFake Attacks on Speaker Authentication by Dynamic Lip Movement Analysis

Author(s)
Yang, Chen-Zhao
Ma, Jun
Wang, Shilin
Liew, Alan Wee-Chung
Date
2021
Abstract

Recent research has demonstrated that lip-based speaker authentication systems can not only achieve good authentication performance but also guarantee liveness. However, with modern DeepFake technology, attackers can produce a talking video of a user without leaving any visually noticeable fake traces, which can seriously compromise traditional face-based or lip-based authentication systems. To defend against sophisticated DeepFake attacks, a new visual speaker authentication scheme based on a deep convolutional neural network (DCNN) is proposed in this paper. The proposed network is composed of two functional parts, namely the Fundamental Feature Extraction network (FFE-Net) and the Representative lip feature extraction and Classification network (RC-Net). The FFE-Net provides the fundamental information for speaker authentication. As static lip shape and lip appearance are vulnerable to DeepFake attacks, dynamic lip movement is emphasized in the FFE-Net. The RC-Net extracts high-level lip features that discriminate the client from human imposters while capturing the client's talking style. A multi-task learning scheme is designed, and the proposed network is trained end-to-end. Experiments on the GRID and MOBIO datasets demonstrate that the proposed approach achieves accurate authentication against human imposters and is much more robust against DeepFake attacks than three state-of-the-art visual speaker authentication algorithms. It is also worth noting that the proposed approach does not require any prior knowledge of the DeepFake spoofing method and thus can be applied to defend against different kinds of DeepFake attacks.
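The abstract's key idea, favoring dynamic lip movement over static lip shape and appearance, can be illustrated with a minimal sketch. Everything below (the frame-differencing step, the function names, the toy data) is an illustrative assumption, not the paper's method: the actual FFE-Net is a learned DCNN, whereas this sketch uses a hand-crafted motion descriptor purely to show why frame-to-frame change suppresses static appearance.

```python
import numpy as np

def dynamic_motion_features(frames):
    """Emphasize dynamic lip movement by differencing consecutive frames.

    frames: (T, H, W) array of grayscale lip-region crops.
    Returns (T-1, H, W) frame-to-frame differences, which cancel out
    static shape/appearance and keep only motion cues.
    (Illustrative stand-in for a learned motion representation.)
    """
    return np.diff(frames.astype(np.float32), axis=0)

def pooled_descriptor(frames):
    """Pool motion features into a fixed-length descriptor:
    per-pixel mean of absolute frame differences, flattened."""
    motion = dynamic_motion_features(frames)
    return np.abs(motion).mean(axis=0).ravel()

# Toy example: a frozen (static) clip carries zero motion energy,
# while a "talking" clip with frame-to-frame change does not.
rng = np.random.default_rng(0)
static = np.tile(rng.random((8, 8)), (10, 1, 1))           # 10 identical frames
talking = static + 0.1 * rng.standard_normal((10, 8, 8))   # frames that change

print(pooled_descriptor(static).sum())    # exactly 0.0: static appearance cancels
print(pooled_descriptor(talking).sum())   # positive: motion survives pooling
```

The design point mirrors the abstract: a descriptor built on temporal differences is, by construction, insensitive to the static lip appearance that DeepFake generators reproduce well, and sensitive to the person-specific talking dynamics they reproduce poorly.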

Journal Title

IEEE Transactions on Information Forensics and Security

Volume

16

Subject

Engineering

Citation

Yang, C-Z; Ma, J; Wang, S; Liew, AW-C, Preventing DeepFake Attacks on Speaker Authentication by Dynamic Lip Movement Analysis, IEEE Transactions on Information Forensics and Security, 2021, 16, pp. 1841-1854
