March 14, 2024 · Open Access

Dial-insight: Fine-tuning Large Language Models with High-Quality Domain-Specific Data Preventing Capability Collapse

Abstract

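The efficacy of large language models (LLMs) depends heavily on the quality of the underlying data, particularly in specialized domains. A common challenge when fine-tuning LLMs for domain-specific applications is that the model's generalization ability may degrade. To address this, we propose a two-stage approach for constructing production prompts designed to yield high-quality data. The approach generates a set of prompts that covers a broad range of tasks and exhibits rich, varied expression. In addition, we introduce a cost-effective, multi-dimensional quality assessment framework to ensure the integrity of the generated labeled data. Using a dataset of interactions between service providers and customers in the real estate sector, we demonstrate a positive correlation between data quality and model performance. Notably, our findings indicate that fine-tuning on data produced with the proposed method can improve the domain-specific capability of general-purpose LLMs without compromising their overall generalization ability, even when only domain-specific data is used for fine-tuning.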

Authors

Jianwei Sun

Chaoyang Mei

Linlin Wei
