HiMTM: Hierarchical Multi-Scale Masked Time Series Modeling with Self-Distillation for Long-Term Forecasting
Author(s)
Zhao, Shubao
Jin, Ming
Hou, Zhaoxiang
Yang, Chengyi
Li, Zengxiang
Wen, Qingsong
Wang, Yi
Date
2024
Location
Boise, United States
Abstract
Time series forecasting is a critical and challenging task in practical applications. Pre-trained foundation models for time series forecasting have recently attracted significant interest. However, current methods often overlook the multi-scale nature of time series, which is essential for accurate forecasting. To address this, we propose HiMTM, a hierarchical multi-scale masked time series modeling framework with self-distillation for long-term forecasting. HiMTM integrates four key components: (1) a hierarchical multi-scale transformer (HMT) that captures temporal information at different scales; (2) a decoupled encoder-decoder (DED) that directs the encoder towards feature extraction while the decoder focuses on pretext tasks; (3) hierarchical self-distillation (HSD), which provides multi-stage feature-level supervision signals during pre-training; and (4) cross-scale attention fine-tuning (CSA-FT), which captures dependencies between scales in downstream tasks. Together, these components strengthen multi-scale feature extraction in masked time series modeling and improve forecasting accuracy. Extensive experiments on seven mainstream datasets show that HiMTM surpasses state-of-the-art self-supervised and end-to-end learning methods by a considerable margin of 3.16-68.54%. In addition, HiMTM outperforms PatchTST, a strong recent self-supervised learning method, in cross-domain forecasting by a significant margin of 2.3%. The effectiveness of HiMTM is further demonstrated through its application to natural gas demand forecasting.
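For readers who want a concrete picture of the pre-training recipe sketched in the abstract, the following is a minimal, illustrative PyTorch sketch. It is not the authors' implementation: the two patch lengths, the model sizes, the 40% masking ratio, the zero-masking strategy, and the frozen teacher copy standing in for the paper's exact hierarchical self-distillation scheme are all assumptions made for brevity.

import torch
import torch.nn as nn

# Illustrative sketch only, not the authors' code. It mimics the high-level
# recipe: patch the series at several temporal scales, mask a fraction of
# patches, encode each scale with a transformer, and supervise the masked
# (student) view with feature-level targets from an unmasked (teacher) view.

class ScaleEncoder(nn.Module):
    """One transformer stage over non-overlapping patches of a given length."""
    def __init__(self, patch_len, d_model=64, n_heads=4):
        super().__init__()
        self.patch_len = patch_len
        self.embed = nn.Linear(patch_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=128, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=1)

    def forward(self, x, mask_ratio=0.0):
        # x: (batch, length) -> (batch, n_patches, patch_len) -> token embeddings
        tokens = self.embed(x.unfold(1, self.patch_len, self.patch_len))
        if mask_ratio > 0.0:
            keep = torch.rand(tokens.shape[:2], device=tokens.device) >= mask_ratio
            tokens = tokens * keep.unsqueeze(-1)   # zero out masked patch tokens
        return self.encoder(tokens)                # (batch, n_patches, d_model)

# Two assumed scales; the teacher mirrors the student and is frozen here
# (in practice the teacher would typically track the student, e.g. via EMA).
scales = [16, 32]
student = nn.ModuleList(ScaleEncoder(p) for p in scales)
teacher = nn.ModuleList(ScaleEncoder(p) for p in scales)
teacher.load_state_dict(student.state_dict())
for p in teacher.parameters():
    p.requires_grad_(False)

x = torch.randn(8, 256)                            # toy batch: 8 series of length 256
loss = torch.tensor(0.0)
for s_enc, t_enc in zip(student, teacher):
    with torch.no_grad():
        target = t_enc(x)                          # teacher features: full view
    pred = s_enc(x, mask_ratio=0.4)                # student features: masked view
    loss = loss + nn.functional.mse_loss(pred, target)  # per-scale feature loss
loss.backward()                                    # gradients flow to the student only

Fine-tuning with cross-scale attention (CSA-FT) would then combine the per-scale representations, for example by letting tokens from one scale attend to tokens of the others before a forecasting head; that step is omitted from this sketch.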
Conference Title
CIKM '24: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management
Citation
Zhao, S; Jin, M; Hou, Z; Yang, C; Li, Z; Wen, Q; Wang, Y, HiMTM: Hierarchical Multi-Scale Masked Time Series Modeling with Self-Distillation for Long-Term Forecasting, CIKM '24: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 2024, pp. 3352-3362