February 27, 2024 · Open Access

Tower: An Open Multilingual Large Language Model for Translation-Related Tasks

Abstract

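General-purpose large language models (LLMs) are proficient at multiple tasks within the domain of translation, but approaches based on open LLMs are competitive only when specialized for a single task. In this paper, we propose a recipe for tailoring LLMs to the multiple tasks present in translation workflows. We perform continued pretraining on a multilingual mixture of monolingual and parallel data to create TowerBase, then finetune on instructions relevant to translation processes to create TowerInstruct. The final model surpasses open alternatives on several tasks relevant to translation workflows and is competitive with general-purpose closed LLMs. To support future research, we release the Tower models, our specialization dataset, an evaluation framework for LLMs focused on the translation ecosystem, and a collection of model generations on our benchmark, including those of our own models.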

Authors

Pierre Colombo

Duarte Alves

José P. Pombal

Institutions

Mathématiques et Informatique pour la Complexité et les Systèmes
