What question did this study set out to answer?

The research aims to address challenges in precise grasping for humanoid robots by utilizing a guided tactile multimodal framework.

April 17, 2026 · Open Access

Adaptive Dexterous Manipulation for Humanoid Robots via a Guided Tactile Multimodal Foundation Model

Key Points

Developed the Guided Tactile Multimodal Framework (GTMF) to address perception blind spots and control instabilities.
Utilized vision and tactile sensory data for real-time grasp adjustment in humanoid robots.
Implemented a wearable data collection module to integrate tactile and visual data for analysis.
Applied a guided mask attention network for pose calibration and cross-modal feature alignment.
Using only visual input, the framework achieved higher grasping accuracy than mainstream methods.
Demonstrated the ability to generate tactile feedback from RGB images for real-time adjustments.
Validated the effectiveness of the method through various experimental tests.
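
The masking idea behind an attention network like the one named above can be illustrated with a minimal sketch. This is not the paper's guided mask attention network, whose architecture is not described on this page; it is plain scaled dot-product attention for a single query, with a boolean mask that excludes some keys. The function name `masked_attention` and the toy vectors are hypothetical.

```python
import math

def masked_attention(query, keys, values, mask):
    """Scaled dot-product attention for one query vector.

    mask[i] == False removes key i from consideration (weight 0).
    Illustrative only; not the network used in the paper.
    """
    d = len(query)
    neg_inf = float("-inf")
    # Score each key against the query; masked keys get -inf.
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d) if keep else neg_inf
        for key, keep in zip(keys, mask)
    ]
    finite = [s for s in scores if s != neg_inf]
    if not finite:
        raise ValueError("mask excludes every key")
    # Numerically stable softmax over the unmasked scores.
    peak = max(finite)
    exps = [math.exp(s - peak) if s != neg_inf else 0.0 for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the value vectors.
    out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
    return weights, out

# Toy example: the third key is masked out, so its value never contributes.
weights, out = masked_attention(
    query=[1.0, 0.0],
    keys=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    values=[[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]],
    mask=[True, True, False],
)
```

In this toy case the masked key receives zero weight, and the first key, which is most similar to the query, receives the largest weight.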

Abstract

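Dexterous manipulation models for humanoid robots built on the Vision-Language-Action (VLA) framework can generate stable action trajectories, but fine grasps still tend to fail because of pose and force-control errors in the dexterous hand. To address the dual challenges of perception blind spots in vision-only sensing and unstable contact-force control, this paper proposes the Guided Tactile Multimodal Framework (GTMF), which decodes tactile thresholds from joint visual-tactile features and, based on fingertip tactile sensor signals, makes fine grasp adjustments on objects. The method builds on the heterogeneous generalization capability of VLA, uses a guided mask attention network to calibrate pose, and generates scene-adaptive tactile thresholds through cross-modal alignment and decoupling of visual and tactile features; these thresholds drive the dexterous hand to adjust its grasp angle in real time, achieving precise closed-loop control. For data collection, the paper designs a wearable visual-tactile isomorphic data collection module that provides integrated cross-embodiment collection and efficient fusion of human-hand tactile data, end-effector vision, and pose data. Experiments validate the method's effectiveness: with visual input alone, it generates tactile signals and achieves grasp accuracy better than mainstream methods. To the authors' knowledge, this is the first method that decodes tactile signals solely from RGB images and uses them for real-time grasp pose adjustment, offering a new technical route to stable dexterous grasping under the VLA framework.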
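
The closed-loop idea in the abstract (compare a fingertip tactile reading with a tactile threshold, then adjust the grasp angle) can be sketched as a simple proportional loop. This is a toy illustration under stated assumptions, not the GTMF controller: the threshold is passed in as a plain number rather than decoded by a model, and `adjust_grasp`, `toy_sensor`, the gain, and the contact model are all hypothetical.

```python
def adjust_grasp(read_force, threshold, angle=0.0, gain=5.0, tolerance=0.05, max_steps=100):
    """Nudge a finger joint angle until the fingertip force is near the threshold.

    read_force: callable mapping a joint angle to a measured contact force (assumed).
    threshold:  target contact force; in the paper this would come from the model.
    Returns the final (angle, force) pair.
    """
    for _ in range(max_steps):
        force = read_force(angle)
        error = threshold - force
        if abs(error) <= tolerance:
            return angle, force
        # Proportional step: close further if force is too low, open if too high.
        angle += gain * error
    return angle, read_force(angle)

def toy_sensor(angle):
    """Hypothetical contact model: no force before 10 degrees, then 0.1 N per degree."""
    return max(0.0, 0.1 * (angle - 10.0))

final_angle, final_force = adjust_grasp(toy_sensor, threshold=1.0)
```

With this toy sensor the loop closes the finger until contact begins and then converges on the target force; a real controller would also need to handle sensor noise, latency, and joint limits.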


Cite this study

Li et al. studied this question.

www.synapsesocial.com/papers/69e1ce065cdc762e9d85725d — DOI: https://doi.org/10.1360/sst-2025-0312

Authors

Xuetao Li

Nengyuan Pan

Jifeng Xuan

Journals

Scientia Sinica Technologica
