July 13, 2024 · Open Access

Bilingual Adaptation of Monolingual Foundation Models

Abstract

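We present an efficient method for adapting a monolingual large language model (LLM) to another language, addressing catastrophic forgetting and tokenizer limitations. This study focuses on adapting Llama 2 to Arabic. Our two-stage approach begins by expanding the vocabulary and training only the embedding matrix, followed by full-model continual pre-training on a bilingual corpus. By continually pre-training on a mix of Arabic and English corpora, the model retains its English proficiency while acquiring Arabic capabilities. Our method yields significant improvements in Arabic and slight improvements in English, demonstrating cost-effective cross-lingual transfer. We perform ablation studies on embedding initialization techniques, data mix ratios, and learning rates, and release a detailed training recipe. To demonstrate the generalizability of this approach, we also adapted Llama 3 8B to Arabic and Llama 2 13B to Hindi.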
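The two-stage recipe described in the abstract can be illustrated with a small, framework-free sketch. This is not the authors' implementation: the mean-of-existing-rows initialization is only one plausible choice among the embedding initialization techniques the paper ablates, and the function names, parameter names, and the `arabic_fraction` argument are all hypothetical, introduced here for illustration.

```python
import random


def extend_embeddings(embeddings, num_new_tokens):
    """Return a copy of the embedding table with rows appended for new tokens.

    ASSUMPTION: each new row is initialized to the mean of the existing rows.
    The abstract does not say which initialization the authors settled on.
    """
    dim = len(embeddings[0])
    count = len(embeddings)
    mean_row = [sum(row[j] for row in embeddings) / count for j in range(dim)]
    copied = [list(row) for row in embeddings]
    return copied + [list(mean_row) for _ in range(num_new_tokens)]


def trainable_parameters(stage, parameter_names, embedding_names):
    """Select which parameters receive gradient updates in each stage.

    Stage 1: only the embedding matrices (hypothetical names passed in
    `embedding_names`). Stage 2: every parameter in the model.
    """
    if stage == 1:
        return [name for name in parameter_names if name in embedding_names]
    if stage == 2:
        return list(parameter_names)
    raise ValueError("stage must be 1 or 2")


def sample_mixed_batch(arabic_docs, english_docs, arabic_fraction, batch_size, rng):
    """Draw a batch in which each document is Arabic with probability
    `arabic_fraction`, otherwise English (hypothetical mixing scheme)."""
    batch = []
    for _ in range(batch_size):
        source = arabic_docs if rng.random() < arabic_fraction else english_docs
        batch.append(rng.choice(source))
    return batch


if __name__ == "__main__":
    table = extend_embeddings([[1.0, 2.0], [3.0, 4.0]], num_new_tokens=2)
    names = ["embed_tokens.weight", "layers.0.attn.weight", "lm_head.weight"]
    stage_one = trainable_parameters(
        1, names, {"embed_tokens.weight", "lm_head.weight"}
    )
    batch = sample_mixed_batch(["ar-doc"], ["en-doc"], 0.5, 4, random.Random(0))
    print(len(table), stage_one, batch)
```

In a real training setup the same idea would be expressed through the framework's own mechanisms (for example, resizing the model's token embeddings and freezing all other weights during the first stage); the sketch only shows the order of operations and where the data mix ratio enters.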

Authors

Gurpreet Gosal

Yishi Xu

Gokul Ramakrishnan
