This work presents an integrated mobile solution that lets users detect objects in their environment, measure the distance to them, and understand the spatial relationships between them. The system combines YOLOv11-based real-time object detection, LiDAR-assisted distance measurement, and GPT-4o expression generation, so that users can locate a desired object and learn about the objects near it. Users thus learn not only that an object is present but also where it is and how it relates spatially to its surroundings. Images are captured by the mobile application while object detection is running, which ensures that the object is always within the frame. This avoids problems such as blurring and incorrect framing, which frequently occur in photos taken by visually impaired users. Experimental results show that the YOLOv11 model performs effectively, with an F1 score of 0.77 and a mAP of 0.806. Furthermore, the fine-tuned GPT-4o model identifies object locations in images and generates expressions that include other surrounding objects. By integrating object detection, LiDAR-based distance measurement, and expression generation from a large language model, the system provides a reference for implementing more advanced solutions in the future.
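To make the pipeline described in the abstract more concrete, the following is a minimal sketch of how a detected bounding box, a depth map, and a spatial description might fit together. It is not the authors' implementation: the `Detection` class, the helper names, the median-depth rule, the left/right test, and the fixed sentence template (standing in for the fine-tuned GPT-4o model) are all assumptions made for illustration, and the depth values are toy data.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Detection:
    """Hypothetical container for one detector output (not from the paper)."""
    label: str
    box: tuple  # (x_min, y_min, x_max, y_max) in pixels, max edges exclusive


def box_distance(depth_map, box):
    """Assumed rule: take the median depth (metres) of the pixels in the box."""
    x0, y0, x1, y1 = box
    values = [depth_map[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return median(values)


def horizontal_relation(target, other):
    """Assumed rule: compare box centre x-coordinates to decide left or right."""
    target_cx = (target.box[0] + target.box[2]) / 2
    other_cx = (other.box[0] + other.box[2]) / 2
    return "left of" if target_cx < other_cx else "right of"


def describe(target, other, depth_map):
    """Fixed template used here in place of a language model."""
    distance = box_distance(depth_map, target.box)
    relation = horizontal_relation(target, other)
    return (f"The {target.label} is about {distance:.1f} m away, "
            f"{relation} the {other.label}.")


# Toy 4x6 depth map in metres: left half at 1.0 m, right half at 2.0 m.
depth_map = [[1.0, 1.0, 1.0, 2.0, 2.0, 2.0] for _ in range(4)]

cup = Detection("cup", (0, 0, 3, 4))
laptop = Detection("laptop", (3, 0, 6, 4))

sentence = describe(cup, laptop, depth_map)
print(sentence)
```

The median is used in this sketch because it is less sensitive than the mean to background pixels that fall inside a bounding box; the abstract does not say which aggregation the system actually uses.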
Nurcihan DERE
Kazım Yıldız
Önder Demir
Journal of Naval Sciences and Engineering
Marmara University
Daiseung Medics (South Korea)
Dere et al. studied this question.
www.synapsesocial.com/papers/69d895ea6c1944d70ce0714e — DOI: https://doi.org/10.56850/jnse.1828189