What type of study is this?

This is a quantitative study.

October 20, 2025 · Open Access

AudioRole: An Audio Dataset for Character Role-Playing in Large Language Models

Key Points

AudioRole provides over 1M character-grounded dialogues from 13 TV series (1K+ hours of audio), as synchronized audio-text pairs annotated with speaker identities and contextual metadata.
The ARP-Model (GLM-4-Voice trained on AudioRole) achieves an average Acoustic Personalization score of 0.31, significantly outperforming both the original GLM-4-Voice and MiniCPM-O-2.6.
ARP-Eval, a dual-aspect evaluation framework introduced alongside the dataset, assesses both response quality and role fidelity.
The release covers dialogues from over 115 main characters, 6 trained ARP-Models, and evaluation protocols for audio-grounded role-playing research.

Abstract

The creation of high-quality multimodal datasets remains fundamental for advancing role-playing capabilities in large language models (LLMs). While existing work focuses predominantly on text-based persona simulation, Audio Role-Playing (ARP) presents unique challenges because semantic content and vocal characteristics must be aligned in sync. To address this gap, we propose AudioRole, a meticulously curated dataset drawn from 13 TV series, spanning 1K+ hours and 1M+ character-grounded dialogues, that provides synchronized audio-text pairs annotated with speaker identities and contextual metadata. In addition, to demonstrate the effectiveness of the dataset, we introduce ARP-Eval, a dual-aspect evaluation framework that assesses both response quality and role fidelity. Empirical validation shows that GLM-4-Voice trained on AudioRole (which we call ARP-Model) achieves an average Acoustic Personalization score of 0.31, significantly outperforming both the original GLM-4-Voice and the more powerful MiniCPM-O-2.6, which specifically supports role-playing in one-shot scenarios. The ARP-Model also achieves a Content Personalization score of 0.36, surpassing the untrained original model by about 38% and matching MiniCPM-O-2.6. The AudioRole release comprises dialogues from over 115 main characters, 6 trained ARP-Models that role-play different characters, and evaluation protocols. Together, these provide an essential resource for advancing audio-grounded role-playing research.
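To make the "about 38%" figure concrete, the short sketch below works backward from the two numbers the abstract does report (a Content Personalization score of 0.36 and a relative gain of about 38%) to the baseline score they imply. The abstract does not state the baseline itself, so the value computed here is an inference that assumes the gain is a simple relative improvement over the untrained model. The function names are illustrative only.

```python
def implied_baseline(score: float, relative_gain: float) -> float:
    """Return the baseline b such that score = b * (1 + relative_gain)."""
    return score / (1.0 + relative_gain)


def relative_gain(score: float, baseline: float) -> float:
    """Return the relative improvement of score over baseline."""
    return (score - baseline) / baseline


# Values reported in the abstract.
arp_content_score = 0.36  # Content Personalization score of the ARP-Model
reported_gain = 0.38      # "about 38%" over the untrained original model

# Inferred value (NOT reported in the abstract): the untrained model's score,
# assuming the 38% is a simple relative improvement.
baseline = implied_baseline(arp_content_score, reported_gain)
print(f"implied baseline score: {baseline:.2f}")
```

Under that assumption, the untrained GLM-4-Voice would score roughly 0.26 on Content Personalization. Because the 38% is itself rounded, this is only an approximate figure.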

Authors

Wenyu Li

Xiaoqi Jiao

Yi Chang
