August 25, 2024 · Open Access

Focused Large Language Models are Stable Many-Shot Learners

Abstract

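In-context learning (ICL) enables large language models (LLMs) to adapt rapidly to new tasks by learning from demonstrations. As the available context length of LLMs has grown, recent experiments show that ICL performance does not necessarily scale well in many-shot (demonstration) settings. We confirm, both theoretically and experimentally, that the cause is that additional demonstrations draw the model's attention away from the query, hindering its understanding of key content. Inspired by how humans learn from examples, we propose FocusICL, a training-free method that applies triviality filtering at the token level to keep attention from being diverted by unimportant content, and hierarchical attention at the demonstration level to further ensure sufficient attention to the current query. We also design an efficient hyperparameter search strategy based on the model perplexity of demonstrations. Comprehensive experiments validate that FocusICL improves average performance by 5.2% over vanilla ICL and scales well to many-shot settings.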
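The attention-dispersion argument in the abstract can be illustrated with a toy calculation. The sketch below is not the FocusICL implementation; it assumes a single softmax attention distribution in which every demonstration token shares one logit and every query token shares another, and all function names, logits, and token counts are hypothetical choices made for illustration. It shows only that, under these assumptions, the share of attention on the query shrinks as demonstrations are added, and that masking some demonstration tokens (a stand-in for token-level triviality filtering) recovers part of that share. Hierarchical attention is not modeled.

```python
import math


def query_attention_share(n_demos, tokens_per_demo=20, query_tokens=20,
                          demo_logit=0.0, query_logit=1.0):
    """Fraction of softmax attention mass that lands on the query tokens.

    Toy assumption: all demonstration tokens share `demo_logit` and all
    query tokens share `query_logit`.
    """
    demo_mass = n_demos * tokens_per_demo * math.exp(demo_logit)
    query_mass = query_tokens * math.exp(query_logit)
    return query_mass / (query_mass + demo_mass)


def share_after_filtering(n_demos, keep_ratio=0.5, tokens_per_demo=20, **kwargs):
    """Same toy model after masking a fraction of the demonstration tokens.

    `keep_ratio` is a hypothetical knob: the fraction of tokens in each
    demonstration that survives filtering.
    """
    kept = int(tokens_per_demo * keep_ratio)
    return query_attention_share(n_demos, tokens_per_demo=kept, **kwargs)


if __name__ == "__main__":
    # Print the query's attention share with and without filtering.
    for n in (1, 8, 64):
        plain = query_attention_share(n)
        filtered = share_after_filtering(n)
        print(f"{n:>3} demos: plain={plain:.3f} filtered={filtered:.3f}")
```

In this toy model the query's share falls monotonically with the number of demonstrations, which matches the qualitative claim in the abstract; the actual method and its theoretical analysis are given in the paper itself.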

Authors

Peiwen Yuan

Shaoxiong Feng

Yiwei Li
