What question did this study set out to answer?

The aim is to develop a mobile solution for detecting objects, measuring distances, and generating contextual expressions.

April 10, 2026

Deep Learning-Based Object Detection With Mobile Application and Expression Generation Using a Large Language Model

Key Points

Developed a mobile application for real-time object detection using YOLOv11.
Utilized LiDAR for accurate distance measurement of objects.
Implemented GPT-4o for generating contextual expressions based on detected objects and their surroundings.
Captured images inside the mobile application during detection, keeping the object in frame and avoiding blur.
Achieved an F1 score of 0.77 and a mAP value of 0.806 with the YOLOv11 model.
Improved users' understanding of spatial relationships and object locations through integrated technologies.
Used the fine-tuned GPT-4o model to generate expressions that describe an object's location and the objects around it.
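
The F1 score reported above is the harmonic mean of precision and recall. The following is a minimal sketch of that standard definition; the function name and the input values are hypothetical and chosen only for illustration, since this summary does not report the study's precision or recall.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical inputs for illustration only; not values reported by the study.
example = f1_score(0.80, 0.75)
print(f"F1 = {example:.3f}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a detector cannot reach a high F1 score by trading away recall for precision or the reverse.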

Abstract

This work presents an integrated mobile solution that lets users detect objects in their environment, measure the distance to them, and understand the spatial relationships between them. The system combines YOLOv11-based real-time object detection, LiDAR-assisted distance measurement, and GPT-4o expression generation, so that users can locate a desired object and learn what is near it. Users therefore learn not only that an object is present but also where it is and how it relates to its surroundings. Images are captured by the mobile application during object detection, which ensures that the object is always within the frame. This prevents problems such as blurring and incorrect framing, which are common in photos taken by visually impaired users. Experimental results show that the YOLOv11 model achieves an F1 score of 0.77 and a mAP of 0.806. Furthermore, the fine-tuned GPT-4o model identifies object locations in images and generates expressions that include other surrounding objects. By integrating object detection, LiDAR-based distance measurement, and expression generation from a large language model, the system provides a reference for implementing more advanced solutions in the future.
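
The way detection boxes and depth readings might be combined into text for a language model can be sketched as follows. This is not the authors' implementation: the `Detection` type, the function names, the median-over-box-centre heuristic, and the left/ahead/right thirds are all assumptions made for illustration. In the described system the boxes would come from YOLOv11 and the depth values from the device's LiDAR sensor; here both are plain Python values.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Detection:
    label: str
    box: tuple  # (x_min, y_min, x_max, y_max) in depth-map pixels, max exclusive


def object_distance(depth_map, box):
    """Median depth in metres over the central half of the box, or None if no valid depth."""
    x0, y0, x1, y1 = box
    dx, dy = (x1 - x0) // 4, (y1 - y0) // 4
    samples = [
        depth_map[y][x]
        for y in range(y0 + dy, max(y1 - dy, y0 + dy + 1))
        for x in range(x0 + dx, max(x1 - dx, x0 + dx + 1))
        if depth_map[y][x] > 0  # assumption: 0 marks a missing depth reading
    ]
    return median(samples) if samples else None


def horizontal_position(box, image_width):
    """Classify the box centre into the left, middle, or right third of the image."""
    center_x = (box[0] + box[2]) / 2
    if center_x < image_width / 3:
        return "left"
    if center_x > 2 * image_width / 3:
        return "right"
    return "ahead"


def describe_scene(detections, depth_map):
    """Build one text line per detection, suitable as context for a language model."""
    image_width = len(depth_map[0])
    lines = []
    for det in detections:
        distance = object_distance(depth_map, det.box)
        where = horizontal_position(det.box, image_width)
        if distance is None:
            lines.append(f"{det.label}: {where}, distance unknown")
        else:
            lines.append(f"{det.label}: {where}, {distance:.1f} m")
    return "\n".join(lines)


# Toy 6x6 depth map: left half at 2.0 m, right half at 0.5 m (made-up values).
depth_map = [[2.0] * 3 + [0.5] * 3 for _ in range(6)]
scene = describe_scene(
    [Detection("chair", (0, 0, 2, 6)), Detection("cup", (4, 0, 6, 6))],
    depth_map,
)
print(scene)
```

Taking the median over the centre of the box, instead of a single pixel, is one way to reduce the influence of background pixels and missing depth readings that fall inside the bounding box.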

Authors

Nurcihan DERE

Kazım Yıldız

Önder Demir

Journals

Journal of Naval Sciences and Engineering

Institutions

Marmara University

Daiseung Medics (South Korea)
