ZSTAD: Zero-Shot Temporal Activity Detection

File version

Accepted Manuscript (AM)

Author(s)
Zhang, L
Chang, X
Liu, J
Luo, M
Wang, S
Ge, Z
Hauptmann, A
Date
2020
Location

Seattle, USA

Abstract

An integral part of video analysis and surveillance is temporal activity detection, which aims to simultaneously recognize and localize activities in long untrimmed videos. Currently, the most effective methods for temporal activity detection are based on deep learning, and they typically perform very well when large-scale annotated videos are available for training. However, these methods are limited in real applications because videos of certain activity classes are unavailable and data annotation is time-consuming. To solve this challenging problem, we propose a novel task setting called zero-shot temporal activity detection (ZSTAD), where activities that have never been seen in training can still be detected. We design an end-to-end deep network based on R-C3D as the architecture for this solution. The proposed network is optimized with an innovative loss function that considers the embeddings of activity labels and their super-classes while learning the common semantics of seen and unseen activities. Experiments on both the THUMOS'14 and Charades datasets show promising performance in detecting unseen activities.

Conference Title

Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

Rights Statement

© 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Subject

Pattern recognition

Citation

Zhang, L; Chang, X; Liu, J; Luo, M; Wang, S; Ge, Z; Hauptmann, A, ZSTAD: Zero-Shot Temporal Activity Detection, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2020, pp. 876-885