April 21, 2026 · Open Access

ICASSP 2026 CONVERGE Challenge: Multimodal Fingerprinting for UE Localization

Abstract

This paper addresses the limitation of vision-based UE localization, which performs well when the UE is within the camera field of view (FoV) but degrades when the UE moves outside the FoV. To overcome this, we propose a multimodal UE localization approach that jointly exploits 5G radio measurements and visual information from red-green-blue (RGB) frames. Radio, image, and ground-truth data are temporally aligned using time sample information. Compact visual embeddings are extracted using a pre-trained ResNet50 backbone. These embeddings are fused with selected radio features to form a unified fingerprint, which is mapped to the 3D UE position using a multilayer perceptron. Experiments on the ICASSP 2026 CONVERGE Task 2 dataset show that the proposed multimodal approach achieves low localization error and consistently outperforms single-modality baselines.
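The pipeline described in the abstract (temporal alignment, feature fusion, regression to a 3D position) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `align_nearest`, `build_fingerprint`, and `mlp_forward`, nearest-timestamp matching as the alignment rule, the feature dimensions, and the random untrained weights. The visual embeddings are stand-in random vectors rather than real ResNet50 outputs.

```python
import numpy as np

def align_nearest(ref_ts, other_ts):
    """For each reference timestamp, return the index of the nearest
    entry in other_ts (assumed sorted ascending, length >= 2)."""
    ref_ts = np.asarray(ref_ts, dtype=float)
    other_ts = np.asarray(other_ts, dtype=float)
    right = np.searchsorted(other_ts, ref_ts)        # first index with other_ts >= ref
    right = np.clip(right, 1, len(other_ts) - 1)     # keep left/right neighbours valid
    left = right - 1
    pick_left = (ref_ts - other_ts[left]) <= (other_ts[right] - ref_ts)
    return np.where(pick_left, left, right)

def build_fingerprint(radio_feats, visual_emb):
    """Fuse per-sample radio features and visual embeddings by concatenation."""
    return np.concatenate([radio_feats, visual_emb], axis=1)

def mlp_forward(x, weights, biases):
    """Forward pass of a ReLU multilayer perceptron with a linear output layer."""
    h = x
    for w, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(h @ w + b, 0.0)
    return h @ weights[-1] + biases[-1]

# Hypothetical toy data: 5 radio samples, 3 camera frames.
rng = np.random.default_rng(0)
radio_ts = np.array([0.00, 0.10, 0.20, 0.30, 0.40])
frame_ts = np.array([0.02, 0.19, 0.41])
radio_feats = rng.normal(size=(5, 4))    # 4 selected radio features (assumed)
frame_emb = rng.normal(size=(3, 8))      # placeholder for ResNet50 embeddings

# Attach to each radio sample the embedding of its nearest frame in time.
idx = align_nearest(radio_ts, frame_ts)
fingerprints = build_fingerprint(radio_feats, frame_emb[idx])

# Untrained MLP mapping the 12-dim fingerprint to an (x, y, z) position.
sizes = [fingerprints.shape[1], 16, 3]
weights = [rng.normal(scale=0.1, size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]
positions = mlp_forward(fingerprints, weights, biases)
print(idx, fingerprints.shape, positions.shape)
```

In a real system the weights would be learned by minimizing a localization loss against the aligned ground-truth positions, and the placeholder embeddings would come from the pre-trained backbone.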

Cite This Study

Saba et al. studied this question.

synapsesocial.com/papers/6a0fd7c8b6f5ee040160052c https://doi.org/10.1109/icassp55912.2026.11463495