Multi-label classification via label correlation and first order feature dependance in a data stream

File version
post-print
Author(s)
Tien, Thanh Nguyen
Thi, Thu Thuy Nguyen
Anh, Vu Luong
Quoc, Viet Hung Nguyen
Liew, Alan Wee-Chung
Stantic, Bela
Date
2019
License
http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract

Many batch learning algorithms have been introduced for offline multi-label classification (MLC) over the years. However, the increasing data volume in many applications such as social networks, sensor networks, and traffic monitoring has posed many challenges to batch MLC learning. For example, it is often expensive to re-train the model with newly arrived samples, or it is impractical to learn from a large volume of data all at once. Incremental learning is therefore well suited to large volumes of data, and especially to data streams. In this study, we develop a Bayesian-based method for learning from multi-label data streams that takes into consideration the correlation between pairs of labels and the relationship between labels and features. In our model, not only is the label correlation learned from each arriving sample that has ground-truth labels, but the number of predicted labels is also adjusted based on the Hoeffding inequality and the label cardinality. We also extend the model to handle missing values, a problem common in real-world data. To handle concept drift, we propose a decay mechanism based on the age of the arrived samples, so that the model adapts incrementally to changes in the data. The experimental results show that our method is highly competitive with several well-known benchmark algorithms under both stationary and concept-drift settings.
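
The abstract names three ingredients without giving formulas: pairwise label correlation, an age-based decay, and the Hoeffding inequality together with label cardinality. The sketch below is an illustration only and is not the authors' method. It assumes a simple exponential decay factor and uses the standard Hoeffding bound, and every name in it (`hoeffding_epsilon`, `DecayedLabelStats`, `decay`) is hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def hoeffding_epsilon(value_range, delta, n):
    """Standard Hoeffding bound: with probability 1 - delta, the true mean
    lies within epsilon of the mean of n independent observations whose
    values span at most value_range."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


class DecayedLabelStats:
    """Incremental label statistics in which older samples weigh less.

    Assumption: exponential decay per arriving sample. The paper's actual
    decay mechanism is not specified in the abstract.
    """

    def __init__(self, decay=0.99):
        self.decay = decay
        self.weight = 0.0       # decayed count of samples seen so far
        self.label_sum = 0.0    # decayed sum of labels per sample
        self.pair = defaultdict(float)  # decayed co-occurrence per label pair

    def update(self, labels):
        """Fold in the ground-truth label set of one arriving sample."""
        labels = sorted(set(labels))
        # Age all existing statistics by one step before adding the sample.
        self.weight = self.decay * self.weight + 1.0
        self.label_sum = self.decay * self.label_sum + len(labels)
        for key in self.pair:
            self.pair[key] *= self.decay
        # Record each unordered pair of labels that occur together.
        for a, b in combinations(labels, 2):
            self.pair[(a, b)] += 1.0

    def cardinality(self):
        """Decayed average number of labels per sample."""
        return self.label_sum / self.weight if self.weight else 0.0

    def cooccurrence(self, a, b):
        """Decayed co-occurrence weight of two labels, in either order."""
        key = (a, b) if a <= b else (b, a)
        return self.pair.get(key, 0.0)
```

In a sketch like this, the decayed cardinality gives a running estimate of how many labels a sample typically carries, and the Hoeffding epsilon indicates how far that estimate may be from the true mean after `n` samples. How the paper combines these two quantities to set the number of predicted labels is not stated in the abstract.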

Journal Title
Pattern Recognition
Rights Statement
© 2019 Elsevier. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Licence (http://creativecommons.org/licenses/by-nc-nd/4.0/) which permits unrestricted, non-commercial use, distribution and reproduction in any medium, providing that the work is properly cited.
Subject
Artificial intelligence
Computer vision and multimedia computation
Data management and data science
Machine learning