F3Set: Towards Analyzing Fast, Frequent, and Fine-grained Events from Videos

File version

Version of Record (VoR)

Author(s)
Liu, Z
Jiang, K
Ma, M
Hou, Z
Lin, Y
Dong, JS
Date
2025
Location

Singapore, Singapore

Abstract

Analyzing Fast, Frequent, and Fine-grained (F3) events presents a significant challenge in video analytics and multi-modal LLMs. Current methods struggle to identify events that satisfy all the F3 criteria with high accuracy due to challenges such as motion blur and subtle visual discrepancies. To advance research in video understanding, we introduce F3Set, a benchmark that consists of video datasets for precise F3 event detection. Datasets in F3Set are characterized by their extensive scale and comprehensive detail, usually encompassing over 1,000 event types with precise timestamps and supporting multi-level granularity. Currently, F3Set contains several sports datasets, and this framework may be extended to other applications as well. We evaluated popular temporal action understanding methods on F3Set, revealing substantial challenges for existing techniques. Additionally, we propose a new method, F3ED, for F3 event detections, achieving superior performance. The dataset, model, and benchmark code are available at https://github.com/F3Set/F3Set.

Conference Title

ICLR 2025: The Thirteenth International Conference on Learning Representations

Rights Statement

This publication is distributed under the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/).

Note

Copyright permissions for this publication were identified from the publisher's website at https://openreview.net/forum?id=vlg5WRKHxh

Citation

Liu, Z; Jiang, K; Ma, M; Hou, Z; Lin, Y; Dong, JS, F3Set: Towards Analyzing Fast, Frequent, and Fine-grained Events from Videos, ICLR 2025: The Thirteenth International Conference on Learning Representations, 2025