September 10, 2025 | Open Access

MMFNet: A Mamba-Based Multimodal Fusion Network for Semantic Segmentation of Remote Sensing

Key Points

MMFNet achieves mean IoU scores of 83.50% and 86.06% on the ISPRS Vaihingen and Potsdam benchmarks, respectively, outperforming eight state-of-the-art methods.
The dual-encoder design combines ResNet-18 for local detail and VMamba for global context in segmentation.
A frequency-aware upsampling module enhances boundary and spatial detail recovery in the segmentation process.
Extensive experiments validate MMFNet's efficiency and accuracy in multimodal semantic segmentation applications.
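
For readers unfamiliar with the metric quoted in the first key point, mean IoU is conventionally obtained by computing the intersection-over-union of each class from a confusion matrix and averaging the results. The following is a minimal sketch of that convention, not the authors' evaluation code: the function name `mean_iou` is hypothetical, label maps are assumed to be flattened lists of integer class indices, and the choice to skip classes absent from both maps is an assumption, since the evaluation protocol is not described here.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes (illustrative sketch)."""
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        conf[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = conf[c][c]                                           # true positives
        fn = sum(conf[c]) - tp                                    # missed pixels of class c
        fp = sum(conf[r][c] for r in range(num_classes)) - tp     # wrongly labeled as c
        union = tp + fp + fn
        if union > 0:  # assumption: ignore classes absent from both maps
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy usage on a four-pixel image with two classes.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

Because each class contributes equally to the average regardless of how many pixels it covers, the metric penalizes poor performance on small classes more than overall pixel accuracy does.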

Abstract

Accurate semantic segmentation of high-resolution remote sensing imagery is challenged by substantial intra-class variability, inter-class similarity, and the limitations of single-modality data. This paper proposes MMFNet, a novel multimodal fusion network that leverages the Mamba architecture to efficiently capture long-range dependencies for semantic segmentation tasks. MMFNet adopts a dual-encoder design that combines ResNet-18 for local detail extraction with VMamba for global contextual modeling, striking a balance between segmentation accuracy and computational efficiency. A Multimodal Feature Fusion Block (MFFB) is introduced to progressively integrate complementary information from optical imagery and digital surface models (DSMs) via multi-kernel convolution and window-based cross-attention. Furthermore, a frequency-aware upsampling module (FreqFusion) is incorporated in the decoder to enhance boundary delineation and recover fine spatial details. Extensive experiments on the ISPRS Vaihingen and Potsdam benchmarks demonstrate that MMFNet achieves mean IoU scores of 83.50% and 86.06%, respectively, outperforming eight state-of-the-art methods while maintaining relatively low computational complexity. These results highlight MMFNet’s potential for efficient and accurate multimodal semantic segmentation in remote sensing applications.
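
To make the idea of window-based cross-attention between modalities concrete, the sketch below restricts attention to non-overlapping square windows, so each position attends only to positions in the same window of the other modality. It is an illustration under stated assumptions, not the MFFB itself. The abstract does not say which modality supplies queries, so optical features are assumed to be queries and DSM features keys and values. Learned projections, multiple heads, and the multi-kernel convolution are omitted, and the name `window_cross_attention` is hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def window_cross_attention(optical, dsm, window):
    """Cross-attention inside non-overlapping windows (illustrative sketch).

    optical, dsm: H x W grids of C-dimensional feature vectors (nested lists).
    Assumption: queries come from `optical`, keys and values from `dsm`.
    """
    h, w = len(optical), len(optical[0])
    c = len(optical[0][0])
    assert h % window == 0 and w % window == 0, "grid must tile into windows"
    scale = 1.0 / math.sqrt(c)  # standard scaled dot-product factor
    out = [[None] * w for _ in range(h)]

    for top in range(0, h, window):
        for left in range(0, w, window):
            # All positions belonging to the current window.
            cells = [(top + i, left + j)
                     for i in range(window) for j in range(window)]
            for qi, qj in cells:
                q = optical[qi][qj]
                # Similarity of this optical query to every DSM key in the window.
                scores = [scale * sum(a * b for a, b in zip(q, dsm[ki][kj]))
                          for ki, kj in cells]
                weights = softmax(scores)
                # Attention-weighted sum of DSM values.
                attended = [sum(wt * dsm[ki][kj][d]
                                for wt, (ki, kj) in zip(weights, cells))
                            for d in range(c)]
                # Residual connection keeps the original optical feature.
                out[qi][qj] = [x + a for x, a in zip(q, attended)]
    return out


# Toy usage: a 2 x 4 grid split into two 2 x 2 windows.
optical = [[[0.0, 0.0] for _ in range(4)] for _ in range(2)]
dsm = [[[1.0, 0.0] if j < 2 else [0.0, 1.0] for j in range(4)] for _ in range(2)]
fused = window_cross_attention(optical, dsm, window=2)
```

Restricting attention to windows means cost grows with the window area per position instead of with the full image area, which is the usual motivation for windowed attention on high-resolution imagery.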

Authors

J. F. Qiu

Wei Chang

Wei Ren
