Accurate endoscope pose estimation and 3D tissue reconstruction are essential for enhancing navigation and spatial awareness in monocular minimally invasive surgery. However, these tasks remain challenging due to depth ambiguity, physiological tissue deformation, inconsistent endoscope motion, limited texture fidelity, and the restricted field of view. To address these limitations, a unified monocular reconstruction framework is proposed that integrates scale-aware depth prediction with temporally constrained perceptual refinement. The proposed MAPIS-Depth module combines Depth Pro for robust scale initialisation with Depth Anything for efficient per-frame prediction, followed by L-BFGS-B optimisation to obtain pseudo-metric depth. Temporal consistency is further improved using RAFT-based pixel correspondences and LPIPS-guided adaptive blending, reducing artefacts caused by motion and deformation. For reliable registration of the synthesised pseudo-RGBD frames, the WEMA-RTDL module is introduced, which jointly optimises rotation and translation. Finally, truncated signed distance fusion and marching cubes are used to extract coherent 3D tissue surfaces. Experiments on the HEVD and SCARED datasets, supported by ablation studies and comparisons with state-of-the-art methods, demonstrate the robustness and superior accuracy of the proposed approach.
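The final fusion step named in the abstract can be illustrated with a minimal sketch of truncated signed distance fusion along a single camera ray. This is not the authors' implementation: the function names, the truncation distance, the voxel spacing, and the depth values are all hypothetical, and a 1-D zero-crossing search stands in for marching cubes, which performs the analogous interpolation on a 3D voxel grid.

```python
def tsdf_value(voxel_depth, observed_depth, trunc):
    """Signed distance from a voxel to the observed surface, scaled and clamped to [-1, 1]."""
    sdf = (observed_depth - voxel_depth) / trunc
    return max(-1.0, min(1.0, sdf))


def fuse_ray(voxel_depths, observations, trunc):
    """Fuse several depth observations along one ray with a running weighted mean."""
    tsdf = [0.0] * len(voxel_depths)
    weight = [0.0] * len(voxel_depths)
    for depth in observations:
        for i, z in enumerate(voxel_depths):
            # Voxels far behind the observed surface are occluded, so leave them untouched.
            if z - depth > trunc:
                continue
            d = tsdf_value(z, depth, trunc)
            tsdf[i] = (weight[i] * tsdf[i] + d) / (weight[i] + 1.0)
            weight[i] += 1.0
    return tsdf, weight


def zero_crossing(voxel_depths, tsdf, weight):
    """Locate the surface by linear interpolation between neighbours whose sign changes."""
    for i in range(len(tsdf) - 1):
        if weight[i] == 0 or weight[i + 1] == 0:
            continue  # at least one voxel was never observed
        a, b = tsdf[i], tsdf[i + 1]
        if a > 0 >= b:
            t = a / (a - b)
            return voxel_depths[i] + t * (voxel_depths[i + 1] - voxel_depths[i])
    return None


# Assumed values for illustration only (metres).
voxel_depths = [0.01 * k for k in range(20)]   # voxel centres along the ray
observations = [0.098, 0.101, 0.103, 0.099]    # hypothetical pseudo-metric depths per frame
tsdf, weight = fuse_ray(voxel_depths, observations, trunc=0.03)
surface = zero_crossing(voxel_depths, tsdf, weight)
```

Inside the truncation band the fused value is linear in depth, so the recovered surface sits at the mean of the observations. This averaging is what lets the fusion smooth out per-frame depth noise before surface extraction.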
Muzammil Khan
Enzo Kerkhof
Matteo Fusaglia
IEEE Access
SHILAP Revista de lepidopterología
The Netherlands Cancer Institute
University of Twente
Khan et al. studied this question.
www.synapsesocial.com/papers/69a75d0fc6e9836116a267db — DOI: https://doi.org/10.1109/access.2026.3658399