Abstract
Reliable localization is fundamental to navigation in GNSS-challenged environments, yet monocular visual odometry (VO) inevitably suffers from drift without global constraints. Point cloud maps can provide global corrections to mitigate this drift. However, existing heterogeneous feature registration methods fail to fully exploit the shared information between vision and point clouds. This limitation results in suboptimal localization accuracy, reduced robustness, and increased computational overhead. To address these issues, we propose a visual odometry system based on colored point cloud maps, which leverages both the global point cloud map as a constraint and the shared modality (color) between the colored point cloud map and the camera to ensure localization accuracy and robustness. The system consists of two main components: sparse colored point cloud construction and visual localization. In the sparse colored point cloud construction module, we introduce a map sparsification strategy that associates visual gradients with point clouds, ensuring that the retained sparse point cloud preserves salient visual gradient information, thereby reducing computational costs. This strategy is further incorporated into the vision-to-map matching stage, forming a “dual-sparsity matching” scheme. In the localization stage, we propose a hierarchical optimization-based iterative Kalman filtering algorithm, which performs multi-level iterative optimization over multi-resolution images to prevent localization from getting trapped in local optima while enhancing accuracy. Experiments on public and self-collected sequences demonstrate significant gains in both accuracy and efficiency. Compared with a representative map-based method (DSL), the proposed approach reduces ATE (RMSE) by 52% to 95% across multiple challenging sequences, with a representative improvement from 1.883 m to 0.152 m on R3live₅ (91.92% reduction).
Against the learning-based I2D-Loc++, our method achieves accuracy gains of up to 76.6% and maintains robust tracking in extreme cases (e.g., R3live₃/₄) where I2D-Loc++ and ORB-SLAM3 fail due to visual degradation. Furthermore, the system maintains near real-time efficiency, reducing total processing time by up to 47.7% (e.g., from 121.11 s to 63.34 s on Whu₁). Even under severe geometric degeneracy, our tracker remains robust with an ATE of 0.076 m, whereas geometry-only LiDAR–IMU localization degrades to 9.23 m.
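The percentage figures above follow the usual relative-reduction formula, (baseline - proposed) / baseline. The short sketch below recomputes two of them from the values quoted in the abstract. The function names `ate_rmse` and `relative_reduction` are illustrative and not taken from the paper, and `ate_rmse` assumes the per-pose translational errors have already been computed after trajectory alignment.

```python
import math


def ate_rmse(errors_m):
    """Root-mean-square of per-pose translational errors, in metres.

    Assumes the estimated and ground-truth trajectories are already aligned
    and that `errors_m` holds one non-negative error per pose.
    """
    return math.sqrt(sum(e * e for e in errors_m) / len(errors_m))


def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` is lower than `baseline`."""
    return 100.0 * (baseline - proposed) / baseline


# Values quoted in the abstract.
ate_drop = relative_reduction(1.883, 0.152)    # ATE (RMSE) on R3live_5, metres
time_drop = relative_reduction(121.11, 63.34)  # total processing time on Whu_1, seconds

print(f"ATE reduction:  {ate_drop:.2f} %")
print(f"Time reduction: {time_drop:.1f} %")
```

Recomputed from the rounded values, the ATE reduction comes out at about 91.9% and the time reduction at about 47.7%. Any difference in the last digit from the quoted 91.92% is presumably due to rounding of the published ATE values.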
Zhang et al. (Sun,) studied this question.
www.synapsesocial.com/papers/69e71467cb99343efc98dbef — DOI: https://doi.org/10.1186/s43020-026-00196-x
Xuanxuan Zhang
You Li
Tianxiang Zhang
Satellite Navigation