Hyperspectral Image Classification Based on Deep Learning and Module Inspired by Human Attention Mechanism
Primary Supervisor
Zhou, Jun
Other Supervisors
Wang, Kewen
Abstract
Hyperspectral imaging technology acquires image data in a number of continuous narrow bands of the electromagnetic spectrum. The obtained hyperspectral images contain details of the spectral reflectance of targets in addition to spatial information. The ability to characterize abundant spectral details makes hyperspectral images particularly suitable for remote sensing image analysis. Hyperspectral remote sensing image classification is one of the most important applications in remote sensing, and is the main research problem of this thesis. Researchers have proposed a large variety of methods for hyperspectral image classification in the last few decades, which can be categorized into traditional methods and deep learning based methods. Recently, with the development of high performance computing and the collection of large datasets, deep learning methods have become the state of the art in hyperspectral image classification. Most existing deep learning methods take in the hyperspectral image and learn discriminant features in plain convolutional or fully connected layers. This manner of learning treats all raw pixels and extracted features equally. However, human brains do not perform recognition tasks with equal consideration of every involved element. For recognition or classification tasks, some parts of the inputs or features may be more important, while others are useless. Our visual system has the capability of attending to the significant aspects and ignoring irrelevant components. This has greatly contributed to our cognitive ability and efficiency. Inspired by the attention mechanism of the human brain, we design corresponding attention modules in the context of artificial neural networks for hyperspectral image classification. In addition, the human visual system is a universal feature extractor and classifier in the sense that we can perform classification across multiple image styles, modalities and distributions.
In contrast, current deep learning based hyperspectral classification paradigms require an individual model for every data domain. This is expensive and inefficient. Following a philosophy similar to that of the attention mechanism, we design domain attention modules for multi-domain hyperspectral image classification. In this thesis, we propose three attention modules for deep learning based hyperspectral image classification. In the first work, we introduce attention based feature weighting networks for improving the classification accuracy of current plain neural networks. In a deep network for hyperspectral applications, a hierarchy of spectral or spatial features is extracted layer by layer. Each layer contains features of the same semantic level. To model the importance of features at the same level, attention modules are designed by branching from the current feature maps. In the attention branch, three steps are executed: summarizing information from the current layer, modeling the relationships among the features with fully connected or convolutional layers, and outputting weighting masks to be multiplied with the original features. We propose feature weighting attention modules for spectral CNN, spatial CNN and spectral-spatial CNN, respectively. In the second work, we design attention modules that specifically attend to the bands of the hyperspectral image. Compared to features extracted in the hidden layers of neural networks, which have less interpretability and physical meaning, the spectral bands of hyperspectral images correspond directly to real wavelengths in the physical world. Thus attending to bands is especially important in two respects. First, it influences the design and cost of the hyperspectral sensor. Second, it is directly related to the dimension of the obtained raw data. Our band attention module can perform both band weighting and band selection.
For band weighting, it can assign sample-wise weights to hyperspectral images and can intervene in the feature learning process at an early stage. For band selection, we carefully design an additional parallel input to the attention module for obtaining fixed selected band sets, and an activation function for filtering insignificant bands in the training process. In the third work, we propose attention mechanisms to address multi-domain hyperspectral image classification. Different hyperspectral datasets have different data modalities, statistical distributions, or spectral dimensionalities. This makes it significantly challenging for a single network to learn all the tasks. The domain shift problem can be alleviated by adjusting the network towards the properties of specific domains. To this end, domain attention modules are designed to attend to the domain of the input data and adapt the network accordingly. Two domain attention modules are proposed: hard domain attention and soft domain attention. For the hard domain attention network, the attention mechanism is implemented by a muxer switch. According to the labels of the data domain, a set of small domain specific adapters is selected and connected to a main backbone network. In this way, the majority of network parameters are shared by all domains, with only a small number of domain specific parameters. For the soft domain attention network, we build the attention mechanism based on the squeeze and excitation (SE) block. Several parallel SE blocks are applied as the feature adapters. On top of them, a higher level domain attention SE block is placed to achieve domain assignment.
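The three steps of the attention branch described in the abstract (summarize, model relationships, output a weighting mask) can be illustrated with a minimal sketch in plain Python. This is not the thesis implementation: the use of a per-channel mean as the summary, the two fully connected layers with ReLU and sigmoid, the layer sizes, the random weights, and all function names (`dense`, `attention_weighting`, `random_matrix`) are assumptions chosen for illustration, loosely following the squeeze and excitation pattern the abstract mentions.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(vec, weights, bias):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def attention_weighting(features, w1, b1, w2, b2):
    """Reweight feature channels with a learned mask (illustrative only).

    `features` is a list of C channels, each a flat list of activations.
    """
    # Step 1: summarize each channel (assumed here: its mean activation).
    summary = [sum(ch) / len(ch) for ch in features]
    # Step 2: model relationships among channels with two FC layers.
    hidden = [max(0.0, h) for h in dense(summary, w1, b1)]  # ReLU
    mask = [sigmoid(z) for z in dense(hidden, w2, b2)]      # weights in (0, 1)
    # Step 3: multiply the mask with the original features.
    weighted = [[m * v for v in ch] for ch, m in zip(features, mask)]
    return weighted, mask


def random_matrix(rows, cols, rng):
    """Hypothetical helper: untrained weights standing in for learned ones."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
C, hidden_units = 4, 2  # assumed sizes: 4 channels, bottleneck of 2 units
features = [[rng.random() for _ in range(9)] for _ in range(C)]
w1, b1 = random_matrix(hidden_units, C, rng), [0.0] * hidden_units
w2, b2 = random_matrix(C, hidden_units, rng), [0.0] * C

weighted, mask = attention_weighting(features, w1, b1, w2, b2)
print("mask:", [round(m, 3) for m in mask])
```

In a trained network the weights would be learned jointly with the backbone, so that the mask suppresses less useful channels. The same pattern applies to band weighting if the channels are taken to be the raw spectral bands rather than hidden features.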
Thesis Type
Thesis (PhD Doctorate)
Degree Program
Doctor of Philosophy (PhD)
School
School of Info & Comm Tech
Rights Statement
The author owns the copyright in this thesis, unless stated otherwise.
Subject
Remote Sensing
Hyperspectral Image