September 10, 2025

Advancing 3D Object Detection with Depth-aware Spatial Knowledge Distillation

Key Points

DK3D improves image-based 3D object detection by reducing depth ambiguity and strengthening feature representations.
Giving the teacher privileged ground-truth depth during training lets DK3D outperform state-of-the-art methods on the KITTI and nuScenes benchmarks.
The framework uses specialized modules to transfer knowledge from a depth-privileged teacher to a camera-based student, avoiding the cross-sensor domain gap of LiDAR-based distillation.
DK3D is a plug-and-play framework that requires no additional training data and adds no computational cost at inference.

Abstract

Accurate 3D object detection from images is hindered by inherent depth ambiguity. While knowledge distillation (KD) from privileged sensors such as LiDAR offers a promising direction, it often suffers from a critical cross-sensor domain gap. To address this, we introduce DK3D, a novel depth-aware knowledge distillation framework for 3D detection. Our core strategy is to provide the teacher with privileged ground-truth depth during training. This directly avoids the feature representation mismatch, and the resulting inefficient knowledge transfer, that arise when distilling from a LiDAR teacher (sparse, geometric) to a camera-based student (dense, semantic). DK3D introduces specialized modules tailored to two primary student paradigms. For depth-assisted models, we employ a channel-wise projection layer (CPL) and an adversarial scoring block (ASB) to align intermediate features at both the pixel and distribution levels. For depth-independent models, a novel vision-depth association module allows the student to implicitly reason about geometry by fusing depth cues with visual features. Both approaches are further enhanced by target-aware spatial response distillation, which captures complex inter-object spatial relationships. Extensive experiments on the KITTI and nuScenes benchmarks demonstrate that DK3D significantly improves performance for both monocular and multi-view 3D detection, outperforming state-of-the-art methods. As a versatile, plug-and-play framework, DK3D boosts existing models without requiring additional training data or increasing the computational cost at inference.
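
To make the idea of pixel-level feature alignment more concrete, the following is a minimal sketch of one common way to distill intermediate features. It is not DK3D's implementation. It assumes the channel-wise projection can be modeled as a 1x1 convolution that maps student channels to teacher channels, and that alignment is measured with a mean squared error. The function names (`project_channels`, `feature_distillation_loss`), the tensor shapes, and the optional foreground mask are all hypothetical; the adversarial scoring block and the vision-depth association module are not modeled here.

```python
import numpy as np

def project_channels(student_feat, weight, bias):
    """Hypothetical channel-wise projection: a 1x1 convolution that maps
    student channels (C_s) to teacher channels (C_t) at every pixel."""
    # student_feat: (C_s, H, W); weight: (C_t, C_s); bias: (C_t,)
    return np.einsum("tc,chw->thw", weight, student_feat) + bias[:, None, None]

def feature_distillation_loss(student_feat, teacher_feat, weight, bias, mask=None):
    """Mean squared error between projected student and teacher features.
    `mask` (H, W) optionally restricts the loss to selected pixels,
    for example object regions (an assumption, not the paper's definition)."""
    projected = project_channels(student_feat, weight, bias)
    sq_err = (projected - teacher_feat) ** 2  # shape (C_t, H, W)
    if mask is None:
        return float(sq_err.mean())
    mask = mask.astype(sq_err.dtype)
    # Normalize by the number of selected elements; guard against an empty mask.
    denom = max(float(mask.sum()) * sq_err.shape[0], 1.0)
    return float((sq_err * mask[None]).sum() / denom)

# Toy example with random features (illustrative shapes only).
rng = np.random.default_rng(0)
student = rng.standard_normal((4, 8, 8))   # student: 4 channels
teacher = rng.standard_normal((6, 8, 8))   # teacher: 6 channels
W = rng.standard_normal((6, 4)) * 0.1      # projection weights (learned in practice)
b = np.zeros(6)
loss = feature_distillation_loss(student, teacher, W, b)
```

In a training loop, a term like this would typically be added to the student's detection loss with a weighting factor, and the projection would be discarded at inference, which is consistent with the claim that distillation adds no inference cost.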
