A Dynamic Variational Framework for Open-World Node Classification in Structured Sequences

Author(s)
Zhang, Q
Li, Q
Chen, X
Zhang, P
Pan, S
Fournier-Viger, P
Huang, JZ
Date
2022
Location

Orlando, USA

Abstract

Structured sequences are a popular data representation used to model complex data such as traffic networks. A key machine learning task for structured sequences is node classification, that is, predicting the class labels of unlabeled nodes. Although many node classification models have been proposed, they assume a closed-world setting in which all class labels appear in the training data. In the real world, however, the presence of never-before-seen class labels in testing data can considerably degrade a classifier's accuracy. A promising solution is to build classifiers for an open-world setting, where samples with unknown class labels are continuously observed, so that training and testing data may have different class label spaces. Several approaches have been proposed for open-world learning problems in computer vision and natural language processing, but they cannot be applied directly to structured sequences because of the complexity of their non-Euclidean properties and their dynamic nature. This paper addresses this important research gap by proposing a novel Open-world Structured Sequence node Classification (OSSC) model that learns from structured sequences in an open-world setting. OSSC captures structural and temporal information via a GCN-based dynamic variational framework. A latent distribution sequence is learned for each node using both stochastic and deterministic states to capture the evolution of node attributes and topology, followed by a sampling process that generates node representations. An open-world classification loss is further adopted to ensure that node representations are sensitive to unknown classes. Finally, a combination of Openmax and Softmax is used to recognize nodes from unknown classes and to classify the remaining nodes into one of the known classes. Experiments on real-world datasets show that the proposed OSSC method is capable of learning accurate open-world node classifiers from structured sequence data.
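
As a rough illustration of the pipeline the abstract outlines, the sketch below runs a single graph snapshot through one graph-convolution step that parameterizes a Gaussian per node, samples node representations from it, and then classifies with Softmax while rejecting low-confidence nodes as unknown. It is not the authors' OSSC implementation: the function names, weight matrices, and threshold are hypothetical, the temporal (stochastic and deterministic state) recurrence is omitted, and a plain confidence threshold stands in for Openmax, which is a different and more involved procedure.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetrically normalize A + I, as in a standard GCN layer."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def softmax(logits):
    """Row-wise softmax with max-subtraction for numerical stability."""
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)

def sample_representations(adj_norm, features, w_mu, w_logvar, rng):
    """One graph-convolution step yields a Gaussian per node; sample from it.

    Assumption: a single snapshot only. The paper's framework learns a
    sequence of latent distributions over time, which is not modeled here.
    """
    mu = adj_norm @ features @ w_mu
    logvar = adj_norm @ features @ w_logvar
    eps = rng.standard_normal(mu.shape)     # reparameterization noise
    return mu + np.exp(0.5 * logvar) * eps

def classify_open_world(z, w_cls, threshold=0.5, unknown_label=-1):
    """Softmax over known classes; reject nodes whose top probability is low.

    Assumption: thresholding is a simplified stand-in for Openmax.
    """
    probs = softmax(z @ w_cls)
    labels = probs.argmax(axis=1)
    labels[probs.max(axis=1) < threshold] = unknown_label
    return labels, probs
```

A label of -1 marks a node treated as belonging to an unknown class; raising the hypothetical `threshold` makes the classifier reject more nodes, and lowering it makes the classifier behave more like a closed-world Softmax classifier.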

Conference Title

2022 IEEE International Conference on Data Mining (ICDM)

Subject

Machine learning

Data mining and knowledge discovery

Citation

Zhang, Q; Li, Q; Chen, X; Zhang, P; Pan, S; Fournier-Viger, P; Huang, JZ, A Dynamic Variational Framework for Open-World Node Classification in Structured Sequences, Proceedings - IEEE International Conference on Data Mining, ICDM, 2022, pp. 703-712