August 8, 2024 · Open Access

Understanding the Performance and Estimating the Cost of LLM Fine-Tuning

Abstract

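Because training large language models (LLMs) is so costly, fine-tuning has emerged as an alternative for specializing them for specific tasks cost-effectively with limited compute resources. In this paper, we characterize sparse Mixture of Experts (MoE) based LLM fine-tuning to understand its accuracy and runtime performance on a single GPU. Our evaluation provides unique insights into the training efficiency of sparse and dense MoE models, as well as their runtime characteristics, including maximum batch size, execution time breakdown, end-to-end throughput, GPU hardware utilization, and load distribution. Our study shows that optimizing the MoE layer is crucial for improving LLM fine-tuning performance. Using our profiling results, we develop and validate an analytical model, based on GPU architecture and model parameters, to estimate the cost of LLM fine-tuning in the cloud. The model predicts throughput and training cost, helping practitioners in industry and academia budget the cost of fine-tuning a specific model.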
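
The abstract does not spell out the analytical model, so the following is only a minimal sketch of the final budgeting arithmetic, assuming a throughput figure (queries per second) is already available from profiling or from a throughput model. The function name, its parameters, and the numbers in the usage line are hypothetical illustrations, not values or formulas from the paper.

```python
def estimate_finetuning_cost(num_queries, epochs, throughput_qps, gpu_price_per_hour):
    """Estimate wall-clock hours and rental cost of a fine-tuning run.

    Assumption (not from the paper): throughput is constant over the run,
    so time = total queries processed / throughput.
    """
    if throughput_qps <= 0:
        raise ValueError("throughput_qps must be positive")
    # Total queries processed across all epochs, divided by queries per second.
    total_seconds = num_queries * epochs / throughput_qps
    hours = total_seconds / 3600.0
    # Cloud GPUs are typically billed per hour of use.
    cost = hours * gpu_price_per_hour
    return hours, cost


# Hypothetical inputs: 36,000 queries, 10 epochs, 2 queries/s, $1.50 per GPU-hour.
hours, cost = estimate_finetuning_cost(36000, 10, 2.0, 1.5)
print(f"{hours:.1f} GPU-hours, ${cost:.2f}")
```

In this simplified form, the cost scales linearly with dataset size and epoch count and inversely with throughput, which is why a throughput estimate is the key input to any such budget.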

Authors

Yuchen Xia

J. Kim

Yuhan Chen
