February 9, 2024, Open Access

Large Language Models: A Survey

Abstract

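Since the release of ChatGPT in November 2022, large language models (LLMs) have drawn considerable attention for their strong performance on a wide range of natural language tasks. LLMs acquire general-purpose language understanding and generation abilities by training billions of parameters on massive amounts of text data, consistent with the predictions of scaling laws [kaplan2020scaling, hoffmann2022training]. Although the research area of LLMs is very recent, it is evolving rapidly in many different directions. This paper reviews some of the most prominent LLMs, including three popular LLM families (GPT, LLaMA, PaLM), and discusses their characteristics, contributions, and limitations. We also give an overview of techniques for building and augmenting LLMs, survey popular datasets used for LLM training, fine-tuning, and evaluation, review widely used LLM evaluation metrics, and compare the performance of several popular LLMs on a set of representative benchmarks. Finally, we conclude by discussing open challenges and future research directions.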

Authors

Shervin Minaee

Tomas Mikolov

Narjes Nikzad-Khasmakhi
