What question did this study set out to answer?

To investigate sensor fusion techniques for improving the reliability of pedestrian detection and human pose estimation in automotive environments.

March 18, 2026 · Open Access

Fusion in Object Detection and Human Pose Estimation for Automotive Scene Understanding

Key Points

Explored geometric fusion using camera and lidar for 3D detection.
Introduced new loss functions for monocular 3D detection.
Developed a calibration-free learned fusion approach using self-attention.
Implemented temporal fusion to improve object detection across frames.
Achieved a significant reduction in center depth error for long-range pedestrian detection.
Demonstrated robustness against random translation and rotation with calibration-free methods.
Enhanced object permanence in detection, reducing misses due to occlusion.
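
To illustrate why calibration-based geometric fusion depends on sensor alignment, the sketch below projects a lidar point into a camera image with a pinhole model. It is not the thesis's implementation: the function `project_to_image`, the yaw-only extrinsic rotation, and all numeric values (focal length, image size, pedestrian width) are assumptions chosen for illustration.

```python
import math

def project_to_image(point, yaw_rad, translation, fx, fy, cx, cy):
    """Project a 3D sensor point into pixel coordinates with a pinhole model.

    Axes follow the common camera convention (x right, y down, z forward).
    The extrinsic calibration is simplified to a yaw rotation plus a translation.
    Returns (u, v, depth), or None if the point lies behind the camera.
    """
    x, y, z = point
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    # Extrinsics: rotate about the vertical axis, then translate into the camera frame.
    xc = c * x + s * z + translation[0]
    yc = y + translation[1]
    zc = -s * x + c * z + translation[2]
    if zc <= 0:
        return None
    # Intrinsics: perspective division, focal scaling, principal point offset.
    return fx * xc / zc + cx, fy * yc / zc + cy, zc

# Hypothetical camera: 1000 px focal length, 1280x720 image.
CAM = dict(fx=1000.0, fy=1000.0, cx=640.0, cy=360.0)
pedestrian = (0.0, 0.0, 30.0)  # a lidar point 30 m straight ahead

u_ok, v_ok, depth = project_to_image(pedestrian, 0.0, (0.0, 0.0, 0.0), **CAM)
u_bad, _, _ = project_to_image(pedestrian, math.radians(1.0), (0.0, 0.0, 0.0), **CAM)

pixel_shift = u_bad - u_ok                    # effect of 1 degree of yaw miscalibration
pedestrian_width_px = CAM["fx"] * 0.5 / 30.0  # an assumed 0.5 m wide pedestrian at 30 m
print(round(pixel_shift, 1), round(pedestrian_width_px, 1))  # prints 17.5 16.7
```

Under these assumed values, a one-degree yaw error moves the projected point by about the full image width of a pedestrian at 30 m, which is one way to see why a fusion approach that does not rely on exact alignment can be attractive.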

Abstract

The automotive sector has seen a surge in demand for increased safety, comfort, and flexibility, driven by the growing popularity of advanced driver assistance systems (ADAS). Detecting other traffic participants in 2D and 3D is essential to avoid accidents and ensure safety. Despite its importance, there has been limited research on reliable pedestrian detection for automotive applications, particularly when pedestrians are at a distance, e.g., 20 to 50 meters. This thesis explores different sensor fusion approaches to achieve optimal performance in object detection and human pose estimation, considering the specific strengths and weaknesses of the sensors used. The goal is to determine when and why to use specific fusion methods to achieve reliable perception of the environment, which is critical for ensuring safety and preventing accidents. For monocular 3D detection, a 3D decoder and new loss functions are introduced to achieve state-of-the-art performance and to understand the limitations and advantages of RGB-only setups. However, monocular detection is limited by depth ambiguity, where objects at different distances appear similar in the image. Geometric fusion using camera and lidar sensors overcomes this limitation. An approach for long-range pedestrian detection (LRPD) focuses on maintaining high performance at long ranges. To showcase the robustness and versatility of geometric fusion, an approach for human pose estimation using RGB and lidar (HPERL) is developed. A detailed evaluation attributes the gains to depth perception, with a significant reduction in center depth error. To address the requirement for complex calibration, a novel calibration-free learned fusion approach is introduced, which learns to fuse features using self-attention. As a result, the approach is strongly robust against random translation and rotation since, unlike calibration-based approaches, it does not depend on exact sensor alignment.
Finally, temporal fusion is explored to overcome the lack of object permanence in current object detectors. The proposed integrated object permanence (IOP) uses predictions from previous frames as priors for the current frame, enabling more reliable detection even when objects are partially or briefly occluded. Highlighting the importance of sensor fusion in autonomous driving, this work shows which kind of fusion suits which use case: geometric fusion achieves optimal performance, learned fusion provides calibration-free solutions, and temporal fusion addresses the issue of missing object permanence.
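
The idea of using previous-frame predictions as priors can be sketched with a simple post-processing heuristic. This is not the IOP method, whose details the abstract does not give; the function names, the `(box, score)` representation, the confidence decay, and all thresholds are assumptions made for illustration only.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def fuse_with_prior(current, previous, decay=0.8, iou_thresh=0.3, keep_thresh=0.3):
    """Carry unmatched previous detections into the current frame.

    current, previous: lists of (box, score) tuples.
    A previous detection with no overlapping current detection is kept with a
    decayed score, so a briefly occluded object does not vanish immediately.
    All parameter values here are hypothetical defaults.
    """
    fused = list(current)
    for p_box, p_score in previous:
        matched = any(iou(p_box, c_box) >= iou_thresh for c_box, _ in current)
        carried = p_score * decay
        if not matched and carried >= keep_thresh:
            fused.append((p_box, carried))
    return fused

# A pedestrian seen in the previous frame is fully occluded in the current one.
previous = [((100.0, 50.0, 120.0, 100.0), 0.9)]
occluded_frame = fuse_with_prior([], previous)

# Count how many consecutive occluded frames the detection survives.
frames_survived, state = 0, previous
while state:
    state = fuse_with_prior([], state)
    if state:
        frames_survived += 1
```

With these assumed values, the detection is carried forward with a score of about 0.72 in the first occluded frame and dropped once its decayed score falls below `keep_thresh`. A learned approach would presumably replace the fixed decay and thresholds with behavior trained into the detector.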

Authors

David Michael Fürst
