This research presents a novel Text-to-Speech (TTS) system for Sylheti Nagri, a historically significant but endangered script used to write Sylheti, a language spoken primarily in the Sylhet region of Bangladesh. Sylheti remains widely spoken in daily life, but its written form in the Nagri script has been largely overshadowed by the Bangla script. To address this, we introduce the Sylheti Nagri TTS Corpus, a meticulously curated dataset comprising over 15 hours of high-quality audio recordings. This corpus, the first of its kind for Sylheti Nagri, includes 8268 sentences spoken by a professional voice artist, providing a substantial resource for preserving the language’s oral heritage. We develop an end-to-end TTS model using the Variational Inference Text-to-Speech (VITS) framework, known for its efficiency and high speech quality. Our model achieves a Mean Opinion Score (MOS) of 3.74 (95% Confidence Interval (CI): 3.58, 3.90), reflecting good overall speech quality, and a Perceptual Evaluation of Speech Quality (PESQ) score of 3.12 (95% CI: 2.94, 3.30). This work not only contributes to the preservation of the Sylheti Nagri script but also lays a foundation for future research in TTS technology for under-resourced South Asian languages.
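To make the reported intervals concrete, the following minimal sketch shows one common way a mean score and its 95% CI can be computed from per-listener ratings, using a normal approximation. The abstract does not state how the intervals were obtained, so the method, the helper name `mean_with_ci`, and the ratings below are all illustrative assumptions rather than the authors' procedure or data.

```python
import math
import statistics


def mean_with_ci(scores, z=1.96):
    """Return (mean, lower, upper) for a normal-approximation 95% CI.

    Hypothetical helper for illustration; not taken from the paper.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores to estimate a CI")
    mean = statistics.fmean(scores)
    # Standard error of the mean from the sample standard deviation.
    sem = statistics.stdev(scores) / math.sqrt(n)
    half_width = z * sem
    return mean, mean - half_width, mean + half_width


# Made-up listener ratings on the 1-5 MOS scale (not data from the paper).
ratings = [4, 3, 4, 5, 3, 4, 4, 3, 5, 4]
mos, lo, hi = mean_with_ci(ratings)
print(f"MOS = {mos:.2f} (95% CI: {lo:.2f}, {hi:.2f})")
```

Under this approximation the interval is symmetric about the mean, which is consistent with the reported MOS interval (3.74 ± 0.16) and PESQ interval (3.12 ± 0.18). With few ratings, a Student's t critical value would be a more cautious choice than 1.96.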
Md. Ataullha
Soumik Paul Jisun
M. Shahidur Rahman
Systems and Soft Computing
Shahjalal University of Science and Technology
Ataullha et al. studied this question.
www.synapsesocial.com/papers/6a080acea487c87a6a40cbb0 — DOI: https://doi.org/10.1016/j.sasc.2026.200496