In this paper, we propose a simple yet effective framework for multilingual end-to-end speech translation (ST), in which speech utterances in source languages are directly translated to the desired target languages with a universal sequence-to-sequence architecture. While multilingual models have been shown to be useful for automatic speech recognition (ASR) and machine translation (MT), this is the first time they have been applied to the end-to-end ST problem. We show the effectiveness of multilingual end-to-end ST in two scenarios, one-to-many and many-to-many translation, using publicly available data. We experimentally confirm that multilingual end-to-end ST models significantly outperform bilingual ones in both scenarios. The generalization of multilingual training is also evaluated in a transfer learning scenario to a very low-resource language pair. All of our code and the database are publicly available at https://github.com/espnet/espnet to encourage further research on this emerging multilingual ST topic.
Inaguma et al. studied this question.
www.synapsesocial.com/papers/69fd1b85dc84976daa27004a — DOI: https://doi.org/10.1109/asru46091.2019.9003832
Hirofumi Inaguma
Kevin Duh
Tatsuya Kawahara
2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
Johns Hopkins University
Kyoto University
Kyoto College of Graduate Studies for Informatics