Gait-Assisted Video Person Retrieval
Author(s)
Zhao, Y
Wang, X
Yu, X
Liu, C
Gao, Y
Abstract
Video person retrieval aims at matching video clips of the same person across non-overlapping camera views, where video sequences contain more comprehensive information, e.g., temporal cues. How to extract useful temporal cues is the key to the success of a video person retrieval system. Gait, as a unique biometric modality indicating the way people walk, contains informative temporal information. To date, it is not clear how to fully utilize gait to boost the performance of video person retrieval. In this paper, to validate whether gait can help retrieve persons in videos, we build a two-stream architecture, named appearance-gait network (AGNet), to jointly learn the appearance features and gait features from RGB video clips and silhouette video clips. We further explore how to fully utilize gait features to enhance the video feature representation. Specifically, we propose an appearance-gait attention module (AGA) to fuse a discriminative feature representation for the person retrieval task. Furthermore, to eliminate the requirement of silhouette video clips during inference, we propose a simple yet effective appearance-gait distillation module (AGD), which transfers the gait knowledge to the appearance stream. As such, we are able to perform the enhanced video person retrieval without silhouette video clips, which makes the inference more flexible and practical. To the best of our knowledge, our work is the first to successfully introduce such an appearance-gait knowledge distillation design for video person retrieval. We verify the effectiveness of the proposed methods on two large-scale challenging benchmarks, MARS and DukeMTMC-VideoReID. Extensive experiments demonstrate superior or comparable performance compared to the state-of-the-art methods while being much simpler. Source code is publicly available.
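The abstract describes two ideas that a toy example may help make concrete: fusing appearance and gait features with attention, and distilling the fused representation into the appearance stream so that silhouettes are not needed at inference. The sketch below is not the authors' AGA or AGD implementation. The function names `attention_fuse` and `distillation_loss`, the scalar sigmoid gate, and the mean-squared-error objective are all assumptions chosen for illustration, and the feature vectors are made-up numbers.

```python
import math


def attention_fuse(appearance, gait):
    """Fuse two equal-length feature vectors with a scalar sigmoid gate.

    Hypothetical stand-in for an attention-based fusion step; the gate
    design here is an assumption, not the module described in the paper.
    """
    if len(appearance) != len(gait):
        raise ValueError("feature vectors must have the same length")
    # Scaled dot product: how strongly the two streams agree.
    score = sum(a * g for a, g in zip(appearance, gait)) / math.sqrt(len(appearance))
    gate = 1.0 / (1.0 + math.exp(-score))
    # Keep the appearance feature and add gait in proportion to the gate.
    return [a + gate * g for a, g in zip(appearance, gait)]


def distillation_loss(student, teacher):
    """Mean squared error between student and teacher feature vectors.

    Assumed objective: the appearance-only student is pulled toward the
    gait-enhanced teacher feature during training.
    """
    if len(student) != len(teacher):
        raise ValueError("feature vectors must have the same length")
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)


# Made-up clip-level features for one video (illustrative values only).
appearance_feat = [0.5, -0.2, 0.1, 0.8]
gait_feat = [0.4, 0.1, -0.3, 0.6]

fused_feat = attention_fuse(appearance_feat, gait_feat)   # teacher signal
loss = distillation_loss(appearance_feat, fused_feat)     # student vs. teacher
```

In a setup like this, minimizing `loss` during training would push the appearance-only feature toward the fused one, which is the general intuition behind running retrieval without silhouette clips at test time.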
Journal Title
IEEE Transactions on Circuits and Systems for Video Technology
Note
This publication has been entered in Griffith Research Online as an advanced online version.
Subject
Image and video coding
Communications engineering
Electronics, sensors and digital hardware
Computer vision and multimedia computation
Citation
Zhao, Y; Wang, X; Yu, X; Liu, C; Gao, Y, Gait-Assisted Video Person Retrieval, IEEE Transactions on Circuits and Systems for Video Technology, 2022