Advanced techniques for data stream analysis and applications

Primary Supervisor

Liew, Wee-Chung

Other Supervisors

Wang, Can

Date
2023-02-01
Abstract

Deep learning (DL) is one of the most advanced AI techniques. It has gained much attention in the last decade and has been applied in many successful applications such as stock market prediction, object detection, and face recognition. Rapid advances in computational hardware such as Graphics Processing Units (GPUs) and Tensor Processing Units (TPUs) have made it possible to train large deep learning models whose accuracy surpasses human ability in some tasks; for example, LipNet [9] achieves 93% accuracy in recognizing words from a speaker's lip movements, compared with 52% for humans. Most current deep learning research has focused on designing deep architectures for a static environment in which the whole training set is known in advance. However, in many real-world applications such as financial market prediction, autonomous cars, and sensor networks, the data often arrive as streams whose massive volume and high velocity limit the scalability of deep learning models. Learning from such data is called continual, incremental, or online learning.

When a deep model is trained in dynamic environments where the data come from streams, modern deep learning models usually suffer from the so-called catastrophic forgetting problem, one of the most challenging issues that remains unsolved. Catastrophic forgetting occurs when a model learns new knowledge, i.e., new objects or classes, and its performance on the previously learned classes drops significantly. The cause of catastrophic forgetting in deep learning models has been traced to the weight-sharing property: as the model updates its weights to capture knowledge of new tasks, it may push the learned weights of past tasks away and degrade performance on them. According to the stability-plasticity dilemma [17], if the model weights are too stable, the model cannot acquire new knowledge, whereas a model with high plasticity can undergo large weight changes that lead to significant forgetting of previously learned patterns. Many approaches have been proposed to tackle this issue, such as imposing constraints on the weights (regularization) or rehearsal from experience, but significant research gaps remain. First, current regularization methods often do not consider class imbalance and catastrophic forgetting simultaneously. Moreover, these methods usually require extra memory to store previous versions of the model, which may be infeasible for a substantial deep model under memory constraints. Second, existing rehearsal approaches pay little attention to selecting and storing the critical instances that help the model retain as much knowledge of the learned tasks as possible.

This study addresses these challenges by proposing several novel methods. We first propose a new loss function that combines two loss terms to deal with class-imbalanced data and catastrophic forgetting simultaneously. The former is a modification of Focal loss, a widely used loss function for class-imbalanced learning, that handles exploding gradients (the loss going to NaN) while retaining the ability to learn from highly confident data points. The latter is a novel loss term that addresses catastrophic forgetting within the current mini-batch. In addition, we propose an online convolutional neural network (OCNN) architecture for tabular data that acts as the base classifier in an ensemble system (OECNN).
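As a rough illustration of the kind of numerical safeguard described above, the following PyTorch sketch shows a Focal-loss variant that clamps the predicted class probability before taking the logarithm, so the loss cannot blow up to NaN. The function name, the clamping strategy, and the hyperparameter values are assumptions made for illustration; they are not the exact modification proposed in the thesis.

```python
import torch

def focal_loss(logits, targets, gamma=2.0, eps=1e-7):
    """Focal loss with probability clamping to keep log(p) finite.

    Illustrative sketch only: the thesis's modified Focal loss may
    stabilise the gradient and treat highly confident examples differently.
    """
    probs = torch.softmax(logits, dim=1)
    # Probability assigned to the true class of each example.
    p_t = probs.gather(1, targets.unsqueeze(1)).squeeze(1)
    # Clamping prevents log(0) = -inf, so the loss never goes to NaN.
    p_t = p_t.clamp(min=eps, max=1.0 - eps)
    # Standard focal weighting: down-weight easy (high-confidence) examples.
    loss = -((1.0 - p_t) ** gamma) * torch.log(p_t)
    return loss.mean()

# Example usage: 4 samples, 3 classes.
logits = torch.randn(4, 3)
targets = torch.tensor([0, 2, 1, 0])
print(focal_loss(logits, targets))
```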
Next, we introduce a rehearsal-based method to prevent catastrophic forgetting, in which we select a triplet of instances within each mini-batch to store in a memory buffer. These instances are identified as crucial because they can either remind the model of easy tasks or help it revise the hard ones. We also propose a class-wise forgetting detector that monitors the performance of each class encountered so far in a stream; if a class's performance drops below a predefined threshold, that class is identified as a forgetting class. Finally, since real-world data often comprise many modalities, we study online multi-modal multi-task (M3T) learning problems. Unlike traditional methods designed for stable environments, online M3T learning needs to handle scenarios such as missing modalities and incremental tasks. We establish settings for six frequently occurring M3T scenarios, on most of which existing M3T methods fail to run. We therefore propose a novel M3T deep learning model called UniCNet that works in all of these scenarios and achieves superior performance compared with state-of-the-art M3T methods. To conclude, this dissertation contributes novel computational techniques that deal with the catastrophic forgetting problem in continual deep learning.
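To make the class-wise forgetting detector idea concrete, the sketch below tracks a running accuracy for every class seen so far in the stream and flags classes whose accuracy falls below a threshold. The class name, the exponential-moving-average update, and the default threshold are illustrative assumptions, not the exact mechanism used in the thesis.

```python
from collections import defaultdict

class ClassWiseForgettingDetector:
    """Illustrative sketch: flag classes whose stream accuracy drops below a threshold."""

    def __init__(self, threshold=0.5, momentum=0.9):
        self.threshold = threshold
        self.momentum = momentum
        # Running accuracy per class; new classes start optimistically at 1.0.
        self.acc = defaultdict(lambda: 1.0)

    def update(self, predictions, labels):
        """Update the per-class running accuracy from one mini-batch."""
        for pred, label in zip(predictions, labels):
            correct = float(pred == label)
            self.acc[label] = (self.momentum * self.acc[label]
                               + (1.0 - self.momentum) * correct)

    def forgetting_classes(self):
        """Return classes whose running accuracy has fallen below the threshold."""
        return [c for c, a in self.acc.items() if a < self.threshold]

# Example usage on one mini-batch of predictions and ground-truth labels.
detector = ClassWiseForgettingDetector(threshold=0.6)
detector.update(predictions=[0, 1, 2, 2], labels=[0, 1, 1, 2])
print(detector.forgetting_classes())
```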

Thesis Type

Thesis (PhD Doctorate)

Degree Program

Doctor of Philosophy (PhD)

School

School of Info & Comm Tech

Rights Statement

The author owns the copyright in this thesis, unless stated otherwise.

Subject

deep learning

continual learning

online learning

catastrophic forgetting
