Where to Focus: Investigating Hierarchical Attention Relationship for Fine-Grained Visual Classification

File version

Accepted Manuscript (AM)

Author(s)
Liu, Y
Zhou, L
Zhang, P
Bai, X
Gu, L
Yu, X
Zhou, J
Hancock, ER
Date
2022
Location

Tel Aviv, Israel

Abstract

Object categories are often grouped into a multi-granularity taxonomic hierarchy. Classifying objects at a coarser-grained hierarchy level requires global and common characteristics, while finer-grained classification relies on local and discriminative features. Humans therefore subconsciously focus on different object regions when classifying at different hierarchy levels. This granularity-wise attention is confirmed by our collected real-time human gaze data on different hierarchy classifications. To leverage this mechanism, we propose a Cross-Hierarchical Region Feature (CHRF) learning framework. Specifically, we first design a region feature mining module that imitates humans to learn granularity-wise attention regions through multi-grained classification tasks. To explore how human attention shifts from one hierarchy to another, we further present a cross-hierarchical orthogonal fusion module that enhances the region feature representation by blending the original feature with an orthogonal component extracted from adjacent hierarchies. Experiments on five hierarchical fine-grained datasets demonstrate the effectiveness of CHRF compared with state-of-the-art methods. Ablation studies and visualization results consistently verify the advantages of our human attention-oriented modules. The code and dataset are available at https://github.com/visiondom/CHRF.
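The orthogonal fusion idea described above can be illustrated with a minimal NumPy sketch, assuming the "orthogonal component" is the standard vector rejection of the adjacent-hierarchy feature with respect to the current one. The function name, the blending weight `alpha`, and the vector-valued features are illustrative assumptions, not taken from the CHRF implementation.

```python
import numpy as np

def orthogonal_fusion(f, f_adj, alpha=1.0):
    """Hypothetical sketch: blend feature f with the component of an
    adjacent-hierarchy feature f_adj that is orthogonal to f."""
    # Projection of f_adj onto f (small epsilon guards against zero norm)
    proj = (np.dot(f_adj, f) / (np.dot(f, f) + 1e-12)) * f
    # Orthogonal component: the part of f_adj not already captured by f
    orth = f_adj - proj
    # Fuse the original feature with the weighted orthogonal complement
    return f + alpha * orth
```

With `f = [1, 0]` and `f_adj = [1, 1]`, the shared direction of `f_adj` is discarded and only the new, orthogonal information `[0, 1]` is added, which matches the intuition of enriching a hierarchy's feature with what its neighbour sees that it does not.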

Conference Title

Computer Vision – ECCV 2022

Volume

13684

Rights Statement

© 2022 Springer, Cham. This is the author-manuscript version of this paper, reproduced in accordance with the copyright policy of the publisher. The original publication is available at www.springerlink.com

Subject

Computer vision and multimedia computation

Information and computing sciences

Citation

Liu, Y; Zhou, L; Zhang, P; Bai, X; Gu, L; Yu, X; Zhou, J; Hancock, ER, Where to Focus: Investigating Hierarchical Attention Relationship for Fine-Grained Visual Classification, Computer Vision – ECCV 2022, 2022, 13684, pp. 57-73