This research introduces a novel method for zero-shot object navigation, enabling agents to navigate unexplored environments. Traditional methods often fail in new settings because they depend on large navigation datasets for training. Our approach instead uses Large Vision Language Models (LVLMs) to help agents understand and move through unfamiliar visual environments without prior experience. A pretrained LVLM first performs object detection to build a semantic map; the LVLM is then queried again to predict the likely location of the target object. Our experiments on the RoboTHOR benchmark show improved performance, with a 1.8% increase in both Success Rate and Success Weighted by Path Length (SPL) compared to the existing best method, ESC.
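For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how Success Rate and SPL are commonly computed over a set of evaluation episodes. It assumes the standard definition of SPL (the mean over episodes of the success indicator times shortest-path length divided by the larger of the agent's path length and the shortest-path length). The function names and the sample episode values are illustrative assumptions, not taken from the paper or from the RoboTHOR tooling.

```python
from typing import Sequence


def success_rate(successes: Sequence[bool]) -> float:
    """Fraction of episodes in which the agent reached the target."""
    if not successes:
        raise ValueError("at least one episode is required")
    return sum(successes) / len(successes)


def spl(
    successes: Sequence[bool],
    shortest_lengths: Sequence[float],
    path_lengths: Sequence[float],
) -> float:
    """Success weighted by Path Length, under the standard definition.

    For each episode i: S_i * l_i / max(p_i, l_i), where S_i is the success
    indicator, l_i the shortest-path length, and p_i the agent's path length.
    """
    n = len(successes)
    if n == 0:
        raise ValueError("at least one episode is required")
    if not (n == len(shortest_lengths) == len(path_lengths)):
        raise ValueError("all inputs must have the same length")

    total = 0.0
    for succeeded, shortest, taken in zip(successes, shortest_lengths, path_lengths):
        if shortest <= 0:
            # Assumption: every episode has a strictly positive shortest path.
            raise ValueError("shortest-path lengths must be positive")
        if succeeded:
            total += shortest / max(taken, shortest)
    return total / n


# Hypothetical episodes: (success, shortest path in metres, path taken in metres)
episode_success = [True, False, True]
episode_shortest = [2.0, 3.0, 4.0]
episode_taken = [4.0, 5.0, 4.0]

sr_value = success_rate(episode_success)
spl_value = spl(episode_success, episode_shortest, episode_taken)
```

Because failed episodes contribute zero and inefficient successes contribute less than one, SPL can never exceed Success Rate on the same set of episodes.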
Yuan et al. studied this question.
www.synapsesocial.com/papers/68e781e8b6db6435876f4b8d — DOI: https://doi.org/10.1109/icara60736.2024.10553173
Shuaihang Yuan
Muhammad Shafique
Mohamed Baghdadi
New York University
Centre for Artificial Intelligence and Robotics