What question did this study set out to answer?

The research aims to improve spatial positioning accuracy by integrating uncalibrated video and low-cost GNSS data through a novel deep learning framework.

April 15, 2026 | Open Access

Self-Supervised Cascade Denoising Auto-Encoder for Accurate Spatial Positioning of Target by Fusing Uncalibrated Video and Low-Cost GNSS

Key Points

Developed a self-supervised cascade denoising auto-encoder (SCDAE) for noise robustness.
Introduced a Bayesian self-supervised multi-modal fusion positioning method (SCDAE-MFP).
Implemented a visual position denoising module based on dual unsupervised learning.
Conducted experiments on public datasets to validate the method's effectiveness.
Achieved a 56.79% average reduction in positioning errors compared to five baseline methods.
Demonstrated improved accuracy and stability in spatial positioning measurements.

Abstract

Accurate measurement of the spatial position of targets observed by a fixed camera is critical in remote sensing applications. Visual spatial positioning methods that rely solely on images are susceptible to adverse factors such as inaccurate camera calibration, imprecise image target detection, and incorrect feature point selection. Complementary to images, ubiquitous Global Navigation Satellite System (GNSS) data can also provide the spatial positions of targets, but most GNSS receivers are low-cost devices with significant positioning noise. To fuse these two valuable but flawed positioning measurements and improve the accuracy and stability of spatial positioning, we propose a deep learning multi-modal spatial positioning method that fuses sequential uncalibrated video images with low-cost GNSS data. First, a self-supervised cascade denoising auto-encoder (SCDAE) architecture is built to make the auto-encoder robust to noise in the raw inputs. Then, based on the SCDAE and Bayesian optimal estimation, a Bayesian self-supervised multi-modal fusion positioning method, SCDAE-MFP, is presented to achieve accurate and stable spatial positioning through self-supervised manifold learning. Specifically, to provide visual self-supervision to SCDAE-MFP, a visual position denoising auto-encoder module based on dual unsupervised learning is proposed. Extensive experiments on public datasets show that SCDAE-MFP outperforms five classical and state-of-the-art baseline methods, reducing positioning errors by an average of 56.79%.
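
To make the idea of Bayesian optimal estimation over two noisy position sources concrete, here is a minimal sketch in Python. It is not the SCDAE-MFP method: it contains no auto-encoder and no learning, and it assumes both measurements are independent Gaussian estimates of the same 2D position with known covariances. The function name `fuse_gaussian` and all numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np


def fuse_gaussian(mu_a, cov_a, mu_b, cov_b):
    """Fuse two independent Gaussian estimates of the same position.

    Illustrative assumption: each source reports a mean and a known
    covariance. The fused estimate weights each source by its inverse
    covariance (its "information").
    """
    mu_a = np.asarray(mu_a, dtype=float)
    mu_b = np.asarray(mu_b, dtype=float)
    info_a = np.linalg.inv(np.asarray(cov_a, dtype=float))
    info_b = np.linalg.inv(np.asarray(cov_b, dtype=float))

    # Information adds for independent sources; invert to get the covariance.
    fused_cov = np.linalg.inv(info_a + info_b)
    # The fused mean is the information-weighted average of the two means.
    fused_mu = fused_cov @ (info_a @ mu_a + info_b @ mu_b)
    return fused_mu, fused_cov


# Hypothetical measurements in metres (not taken from the paper).
visual_xy = np.array([10.0, 20.0])   # position estimated from video
visual_cov = np.diag([1.0, 1.0])     # assumed visual noise variance
gnss_xy = np.array([12.0, 18.0])     # position from a low-cost GNSS receiver
gnss_cov = np.diag([4.0, 4.0])       # assumed (larger) GNSS noise variance

fused_xy, fused_cov = fuse_gaussian(visual_xy, visual_cov, gnss_xy, gnss_cov)
print(fused_xy)    # fused position, pulled toward the less noisy source
print(fused_cov)   # fused covariance, smaller than either input
```

Under these assumptions the fused variance is lower than that of either input, which is the basic reason for combining two flawed measurements. A learned approach such as the one summarized in the abstract would not need the noise covariances to be supplied by hand.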

Authors

Xiaofei Zeng

Ruliang He

Seunghyo Han

Journals

Remote Sensing

Institutions

Sichuan University
