What type of study is this?

This is an experimental study.

October 2, 2025 | Open Access

Tactile-VLA: Unlocking Vision-Language-Action Model's Physical Knowledge for Tactile Generalization

Key Points

Tactile-VLA enables robots to follow tactile-aware instructions effectively, enhancing interaction precision.
The framework utilizes commonsense knowledge related to tactile feedback, improving generalization in contact-rich tasks.
Tactile-VLA incorporates a hybrid controller for precision in physical actions based on tactile sensor inputs.
The model achieves zero-shot generalization by activating prior knowledge through minimal demonstrations.

Abstract

Vision-Language-Action (VLA) models have shown remarkable achievements, driven by the rich implicit knowledge of their vision-language components. However, achieving generalist robotic agents demands precise grounding into physical interactions, especially in contact-rich scenarios where fine-grained force control is essential. We advance VLAs' implicit knowledge beyond identifying what to do, towards guiding how to physically interact with the real world. This paper introduces Tactile-VLA, a novel framework that deeply fuses vision, language, action, and tactile sensing. This framework incorporates a hybrid position-force controller to translate the model's intentions into precise physical actions and a reasoning module that allows the robot to adapt its strategy based on tactile feedback. Experiments demonstrate Tactile-VLA's effectiveness and generalizability in three key aspects: (1) enabling tactile-aware instruction following, (2) utilizing tactile-relevant commonsense, and (3) facilitating adaptive tactile-involved reasoning. A key finding is that the VLM's prior knowledge already contains semantic understanding of physical interaction; by connecting it to the robot's tactile sensors with only a few demonstrations, we can activate this prior knowledge to achieve zero-shot generalization in contact-rich tasks.
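To make the idea of a hybrid position-force controller concrete, the following is a minimal one-dimensional sketch of the general technique, not the paper's implementation. The abstract does not give the control law, so everything here is an assumption made for illustration: the function name `hybrid_position_force_step`, the `gain` and `deadband` parameters, the units, and the sign convention (a larger position value presses harder on the object).

```python
from dataclasses import dataclass


@dataclass
class HybridCommand:
    position: float     # position command sent to the actuator (hypothetical units: m)
    force_error: float  # target force minus measured force (hypothetical units: N)


def hybrid_position_force_step(
    target_position: float,
    target_force: float,
    measured_force: float,
    gain: float = 0.001,
    deadband: float = 0.1,
) -> HybridCommand:
    """Run one control step of an illustrative hybrid position-force law.

    The position target is tracked as-is while the force error stays inside
    the deadband. Outside the deadband, the position command is shifted in
    proportion to the force error. All parameters are assumptions, not
    values taken from the paper.
    """
    force_error = target_force - measured_force
    if abs(force_error) <= deadband:
        # Force is close enough to the target: pure position tracking.
        return HybridCommand(target_position, force_error)
    # Too little force gives a positive error and pushes further in;
    # too much force gives a negative error and backs off.
    return HybridCommand(target_position + gain * force_error, force_error)


if __name__ == "__main__":
    # Measured force is 1 N below the target, so the command moves inward.
    print(hybrid_position_force_step(0.02, 2.0, 1.0))
```

In this sketch the tactile reading only changes the command when the force error is large, which keeps the position target authoritative during free motion and lets force take over in contact. A real controller would typically add integral or derivative terms and safety limits.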

Authors

Jialei Huang

Shuo Wang

Fanqi Lin
