What type of study is this?

This is a quantitative study.

October 20, 2025 | Open Access

Robust Instant Policy: Leveraging Student's t-Regression Model for Robust In-context Imitation Learning of Robot Manipulation

Key Points

Robust instant policy (RIP) significantly improves trajectory generation, enhancing task success rates by at least 26% in low-data scenarios.
The approach aggregates multiple candidate trajectories using a Student's t-distribution to mitigate hallucinated trajectories from large language models.
RIP outperforms state-of-the-art imitation learning methods, showing effective adaptability in both simulated and real-world environments.
Reliability in robotics applications is improved, addressing hallucination issues associated with LLM-based instant policies.

Abstract

Imitation learning (IL) aims to enable robots to perform tasks autonomously by observing a few human demonstrations. Recently, a variant of IL called In-Context IL has used off-the-shelf large language models (LLMs) as instant policies that understand the context from a few given demonstrations to perform a new task, rather than explicitly updating network models with large-scale demonstrations. However, its reliability in the robotics domain is undermined by hallucination: an LLM-based instant policy occasionally generates poor trajectories that deviate from the given demonstrations. To alleviate this problem, we propose a new robust in-context imitation learning algorithm called the robust instant policy (RIP), which uses a Student's t-regression model to be robust against the hallucinated trajectories of instant policies and thus allow reliable trajectory generation. Specifically, RIP generates several candidate robot trajectories for a given task from an LLM and aggregates them using the Student's t-distribution, which is beneficial for ignoring outliers (i.e., hallucinations), thereby generating a trajectory that is robust against hallucinations. Our experiments, conducted in both simulated and real-world environments, show that RIP significantly outperforms state-of-the-art IL methods, with at least a 26% improvement in task success rates, particularly in low-data scenarios for everyday tasks. Video results are available at https://sites.google.com/view/robustinstantpolicy.
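
The aggregation idea in the abstract can be illustrated with a minimal sketch. This is not the paper's Student's t-regression model: it assumes K candidate trajectories already resampled to the same T waypoints, and fits an independent Student's t location estimate per waypoint with fixed degrees of freedom, using a standard EM update. The function name `student_t_aggregate`, the parameter `nu`, and the toy data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def student_t_aggregate(candidates, nu=3.0, n_iter=100, tol=1e-9, eps=1e-12):
    """Robustly average candidate trajectories of shape (K, T, D).

    Illustrative assumption: each waypoint is modelled by an isotropic
    Student's t-distribution with fixed degrees of freedom `nu`, fitted by EM.
    Returns the robust trajectory (T, D) and per-candidate weights (K, T).
    """
    x = np.asarray(candidates, dtype=float)
    k, _, d = x.shape

    # Initialise from the per-waypoint median, which is already outlier-resistant.
    mu = np.median(x, axis=0)
    resid2 = np.sum((x - mu) ** 2, axis=2)
    sigma2 = np.maximum(np.median(resid2, axis=0) / d, eps)
    w = np.ones_like(resid2)

    for _ in range(n_iter):
        # E-step: candidates far from the current estimate get small weights.
        resid2 = np.sum((x - mu) ** 2, axis=2)
        w = (nu + d) / (nu + resid2 / sigma2)

        # M-step: weighted mean and weighted isotropic scale per waypoint.
        new_mu = np.sum(w[:, :, None] * x, axis=0) / np.sum(w, axis=0)[:, None]
        new_resid2 = np.sum((x - new_mu) ** 2, axis=2)
        sigma2 = np.maximum(np.sum(w * new_resid2, axis=0) / (k * d), eps)

        converged = np.max(np.abs(new_mu - mu)) < tol
        mu = new_mu
        if converged:
            break

    return mu, w


# Toy data: four plausible candidates around a straight line, one "hallucinated".
truth = np.linspace(0.0, 1.0, 10)[:, None] * np.ones(3)
offsets = [-0.02, -0.01, 0.01, 0.02, 5.0]  # the last candidate is the outlier
candidates = np.stack([truth + o for o in offsets])

robust, weights = student_t_aggregate(candidates)
naive = candidates.mean(axis=0)

print("max error, plain mean:    ", np.max(np.abs(naive - truth)))
print("max error, Student's t EM:", np.max(np.abs(robust - truth)))
```

In this toy setting the plain mean is dragged toward the outlying candidate, while the t-based estimate assigns it a near-zero weight. Smaller `nu` gives heavier tails and stronger down-weighting; as `nu` grows, the estimate approaches the ordinary mean.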

Authors

Hanbit Oh

Andrea M. Salcedo-Vázquez

Ixchel G. Ramirez-Alpizar
