Tactile feedback is generally recognized to be crucial for effective interaction with the physical world. However, state-of-the-art Vision-Language-Action (VLA) models lack the ability to interpret and use tactile signals, limiting their effectiveness in contact-rich tasks. Incorporating tactile feedback into these systems is challenging due to the absence of large multi-modal datasets. We present VLA-Touch, an approach that enhances generalist robot policies with tactile sensing without fine-tuning the base VLA. Our method introduces two key innovations: (1) a pipeline that leverages a pretrained tactile-language model to provide semantic tactile feedback for high-level task planning, and (2) a diffusion-based controller that refines VLA-generated actions with tactile signals for contact-rich manipulation. Through real-world experiments, we demonstrate that our dual-level integration of tactile feedback improves task planning efficiency while enhancing execution precision. Code is open-sourced at https://github.com/jxbi1010/VLA-Touch.
Jianxin Bi
Kevin Ma
Ce Hao
Bi et al. (Wed,) studied this question.
www.synapsesocial.com/papers/68e6679587ecc93a24d1757e — DOI: https://doi.org/10.48550/arxiv.2507.17294