Truly practical dynamic 3D reconstruction requires online operation, globally consistent poses and maps, detailed appearance modeling, and the flexibility to handle both RGB and RGB-D inputs. However, existing SLAM methods typically either simply remove the dynamic parts or require RGB-D input, offline methods do not scale to long video sequences, and current transformer-based feedforward methods lack global consistency and appearance detail. To this end, we achieve online dynamic scene reconstruction by disentangling the static and dynamic parts within a SLAM system. Poses are tracked robustly with a novel motion masking strategy, and dynamic parts are reconstructed through progressive adaptation of a Motion Scaffolds graph. Our method yields novel view renderings competitive with offline methods and achieves tracking on par with state-of-the-art dynamic SLAM methods.
Shi et al. studied this question.
www.synapsesocial.com/papers/68e0450fa99c246f578b40be — DOI: https://doi.org/10.48550/arxiv.2509.17864
Chen Shi
Erik Sandström
Sandro Lombardi