The automotive sector has seen a surge in demand for safety, comfort, and flexibility, driven by the growing popularity of advanced driver assistance systems (ADAS). Detecting other traffic participants in 2D and 3D is essential to avoid accidents. Despite its importance, there has been limited research on reliable pedestrian detection for automotive applications, particularly for distant pedestrians, e.g., at 20-50 meters. This thesis explores different sensor fusion approaches to improve object detection and human pose estimation, taking into account the specific strengths and weaknesses of the sensors used. The goal is to determine when and why to use specific fusion methods to achieve reliable perception of the environment. For monocular 3D detection, a 3D decoder and new loss functions are introduced to achieve state-of-the-art performance and to understand the limitations and advantages of RGB-only setups. However, monocular detection is limited by depth ambiguity: objects at different distances can appear similar in the image. Geometric fusion of camera and lidar data overcomes this limitation. An approach for long-range pedestrian detection (LRPD) focuses on maintaining high performance at long ranges. To showcase the robustness and versatility of geometric fusion, an approach for human pose estimation using RGB and lidar (HPERL) is developed. A detailed evaluation attributes the gains to improved depth perception, with a significant reduction in center depth error. To remove the need for complex calibration, a novel calibration-free learned fusion approach is introduced that learns to fuse features using self-attention. As a result, it is robust against random translation and rotation, since, unlike calibration-based approaches, it does not depend on exact sensor alignment.
Finally, temporal fusion is explored to address the lack of object permanence in current object detectors. The proposed integrated object permanence (IOP) approach uses predictions from previous frames as priors for the current frame, enabling more reliable detection even when objects are partially or briefly occluded. Highlighting the importance of sensor fusion in autonomous driving, this work shows which fusion type suits which use case: geometric fusion achieves the best performance, learned fusion provides calibration-free solutions, and temporal fusion addresses missing object permanence.
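The depth ambiguity mentioned in the abstract can be illustrated with a minimal pinhole-camera sketch. This is not the thesis's method. The focal length, object heights, and function names below are hypothetical values chosen for illustration. The sketch shows that two objects of different size at different distances can project to the same image height, and that a single range measurement (such as one a lidar could provide) is enough to recover the metric height.

```python
# Illustrative sketch only: a pinhole camera model with assumed values,
# not the detection pipeline described in the thesis.

FOCAL_PX = 1000.0  # hypothetical focal length in pixels


def projected_height_px(height_m: float, depth_m: float,
                        focal_px: float = FOCAL_PX) -> float:
    """Pinhole model: image height = focal length * object height / depth."""
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_px * height_m / depth_m


def metric_height_m(height_px: float, depth_m: float,
                    focal_px: float = FOCAL_PX) -> float:
    """Invert the pinhole model once a depth measurement is available."""
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return height_px * depth_m / focal_px


# A 1.7 m pedestrian at 20 m and a 3.4 m object at 40 m (assumed numbers)
# cover the same number of pixels, so the image alone cannot separate them.
near_px = projected_height_px(1.7, 20.0)
far_px = projected_height_px(3.4, 40.0)

# With a range measurement per object, the metric heights differ again.
near_m = metric_height_m(near_px, 20.0)
far_m = metric_height_m(far_px, 40.0)

if __name__ == "__main__":
    print(f"image heights: {near_px:.1f} px vs {far_px:.1f} px")
    print(f"recovered heights: {near_m:.2f} m vs {far_m:.2f} m")
```

In a real camera and lidar setup the range would come from lidar points projected into the image using the sensor calibration; that step is omitted here to keep the sketch self-contained.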
David Michael Fürst
David Michael Fürst studied this question.
www.synapsesocial.com/papers/69ba424e4e9516ffd37a2610 — DOI: https://doi.org/10.26204/kluedo/9716