Embodied artificial intelligence (AI), which integrates AI and robotics, has made significant progress, particularly in human–robot interaction, task-assisting robots, and the integration of multimodal AI models. Experimental studies have demonstrated strong performance in complex tasks such as providing human assistance, performing household chores, and manipulating objects through pick-and-place operations. Despite these capabilities, real-world applicability remains limited: although such tasks offer significant practical utility, users often struggle to provide effective instructions, and execution remains prohibitively slow for real-world deployment. This study introduces an approach that enhances usability through spoken human instructions and reduces operation time by streamlining intermediate steps through our Module Handler. The proposed approach leverages a large language model to accurately extract information from spoken human instructions. In our experiments, the system extracted relevant information from spoken human instructions with an object identification accuracy of approximately 92.47%. In addition, our method reduced task completion times by an average of 33 s across four different experimental environments compared with existing modular robotics systems, a reduction that is significant for robotic task execution efficiency.
MinHyuk Kim
J. Park
Kwanyong Park
Sensors
Korea University
Electronics and Telecommunications Research Institute
Kim et al. (Sat,) studied this question.
synapsesocial.com/papers/69c37bf3b34aaaeb1a67ee69 — DOI: https://doi.org/10.3390/s26061978