A cross-modal feature aggregation and enhancement network for hyperspectral and LiDAR joint classification

Author(s)
Zhang, Y
Gao, H
Zhou, J
Zhang, C
Ghamisi, P
Xu, S
Li, C
Zhang, B
Date
2024
Abstract

Advancements in Earth observation technologies have greatly enhanced the potential of integrating hyperspectral images (HSI) with Light Detection and Ranging (LiDAR) data for land use and land cover classification. Despite this, most existing methods focus on employing deep network layers to extract features from the two heterogeneous data modalities, often overlooking the gradual modeling of data representations from shallow to deep layers. Furthermore, excessive network layers can degrade modality-specific features, thereby lowering classification performance. This paper proposes a novel cross-modal feature aggregation and enhancement network for the joint classification of HSI and LiDAR data. First, a cross-modal feature fusion module is developed that exploits spatial scale consistency to exchange and fuse feature embeddings at the pixel level, preserving the original information from the two heterogeneous modalities to a certain degree. Then, two straightforward strategies (i.e., addition and concatenation) are applied to the features in the shallow network layers before they are sent to the transformer encoder. The former helps the model discern more subtle distinctions and refine spatial location details. The latter preserves information integrity, effectively mitigating the risk of feature loss. Moreover, invertible neural networks and a feature enhancement module are introduced, leveraging the complementary information of HSI and LiDAR data to enhance the detail and texture information extracted in deeper layers. Extensive experiments on the Houston2013, Trento, and MUUFL datasets demonstrate that the proposed method outperforms several state-of-the-art models on three evaluation metrics, achieving an accuracy improvement of up to 2%.
The proposed model offers new insights for HSI and LiDAR classification, which is critical for accurate environmental monitoring, urban planning, and precision agriculture. The source code is publicly available at https://github.com/zhangyiyan001/CMFAEN.
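
The two shallow-layer strategies named in the abstract, addition and concatenation, can be illustrated with a minimal sketch. It is not taken from the authors' repository: the function names, the list-based vectors, and the 4-dimensional example embeddings are all assumptions made purely for illustration. It only shows how the two operations differ in the shape and content of the fused feature.

```python
from typing import List

Vector = List[float]


def fuse_by_addition(hsi: Vector, lidar: Vector) -> Vector:
    """Element-wise sum; both embeddings must have the same length."""
    if len(hsi) != len(lidar):
        raise ValueError("addition requires embeddings of equal length")
    return [h + l for h, l in zip(hsi, lidar)]


def fuse_by_concatenation(hsi: Vector, lidar: Vector) -> Vector:
    """Place the two embeddings end to end; every input value is kept."""
    return hsi + lidar


# Hypothetical 4-dimensional embeddings of one pixel from each modality.
hsi_embedding = [0.2, 0.5, 0.1, 0.9]
lidar_embedding = [0.3, 0.0, 0.4, 0.1]

added = fuse_by_addition(hsi_embedding, lidar_embedding)
concatenated = fuse_by_concatenation(hsi_embedding, lidar_embedding)

# Addition keeps the embedding length; concatenation doubles it.
print(len(added), len(concatenated))
```

Under these assumptions, addition merges the two modalities into a feature of unchanged size, while concatenation retains every original value at the cost of a longer feature, which matches the abstract's remark that concatenation preserves information integrity.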

Journal Title

Expert Systems with Applications

Volume

258

Subject

Spatial data and applications

Information and computing sciences

Citation

Zhang, Y; Gao, H; Zhou, J; Zhang, C; Ghamisi, P; Xu, S; Li, C; Zhang, B, A cross-modal feature aggregation and enhancement network for hyperspectral and LiDAR joint classification, Expert Systems with Applications, 2024, 258, pp. 125145
