Multilingual translation faces challenges of computational redundancy and limited accuracy for low-resource languages, especially in speech translation. To address this, we propose a novel hierarchical Transformer Encoder Tree (TET) combined with non-autoregressive encoder-only models trained with Connectionist Temporal Classification for multilingual translation. By sharing intermediate representations among linguistically similar target languages, TET can improve accuracy on low-resource languages, reduce computational redundancy, and allow generating all target languages in a single forward pass, thus eliminating sequential bottlenecks and improving parallelism. For speech translation, combining TET with a non-autoregressive speech recognition backbone (wav2vec2) shows promising results in terms of translation quality compared to autoregressive systems while being 7-14 times faster.
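The idea of sharing intermediate representations can be illustrated with a minimal sketch. This is not the authors' implementation: the language grouping (`romance`, `germanic`), the node names, and the `toy_encode` function are hypothetical stand-ins, with `toy_encode` taking the place of a stack of Transformer encoder layers. The sketch only shows that each tree node is evaluated once per input and that every leaf (one per target language) receives an output from a single traversal.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class TreeNode:
    """One node of a hypothetical encoder tree; leaves carry a target language."""
    name: str
    language: Optional[str] = None  # set on leaf nodes only
    children: List["TreeNode"] = field(default_factory=list)


def forward(
    node: TreeNode,
    hidden: List[str],
    encode: Callable[[str, List[str]], List[str]],
    outputs: Dict[str, List[str]],
    calls: List[str],
) -> None:
    """Evaluate this node once, then pass its output to every child."""
    calls.append(node.name)
    hidden = encode(node.name, hidden)
    if node.language is not None:
        outputs[node.language] = hidden
    for child in node.children:
        forward(child, hidden, encode, outputs, calls)


def toy_encode(name: str, hidden: List[str]) -> List[str]:
    """Stand-in for an encoder block: records which blocks the input passed through."""
    return hidden + [name]


# Hypothetical grouping of target languages by linguistic similarity.
tree = TreeNode("root", children=[
    TreeNode("romance", children=[TreeNode("es_head", "es"), TreeNode("it_head", "it")]),
    TreeNode("germanic", children=[TreeNode("de_head", "de"), TreeNode("nl_head", "nl")]),
])

outputs: Dict[str, List[str]] = {}
calls: List[str] = []
forward(tree, [], toy_encode, outputs, calls)

print(outputs["es"])  # path of blocks used for Spanish
print(len(calls))     # number of block evaluations for all four languages
```

In this toy tree, all four target languages are produced with 7 block evaluations, whereas four independent three-block encoders would need 12. In a real system the independent branches could also be evaluated in parallel rather than by recursion.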
Yiwen Guan
Jacob Whitehill
Guan et al. studied this question.
www.synapsesocial.com/papers/68f3793258f37cefb60d340a — DOI: https://doi.org/10.48550/arxiv.2509.17930