February 2, 2019 · Open Access

Parameter-Efficient Transfer Learning for NLP

Abstract

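Fine-tuning large pre-trained models is an effective transfer mechanism in NLP. However, when there are many downstream tasks, fine-tuning is parameter inefficient: every task requires an entirely new model. As an alternative, we propose transfer with adapter modules. Adapter modules yield a compact and extensible model; they add only a few trainable parameters per task, and new tasks can be added without retraining previous ones. The parameters of the original network remain fixed, yielding a high degree of parameter sharing. To demonstrate the effectiveness of adapters, we transfer the recently proposed BERT Transformer model to 26 diverse text classification tasks, including the GLUE benchmark. Adapters attain near state-of-the-art performance while adding only a few parameters per task. On GLUE, we attain within 0.4% of the performance of full fine-tuning while adding only 3.6% parameters per task. By contrast, fine-tuning trains 100% of the parameters per task.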
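To make the idea of an adapter module more concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It assumes a bottleneck design (down-projection, nonlinearity, up-projection, skip connection) with near-zero initialization so the module starts close to the identity; the abstract itself does not spell out these details. The class name `Adapter`, the helper `added_fraction`, the `tanh` nonlinearity, and all sizes are illustrative assumptions.

```python
import math
import random


class Adapter:
    """Hypothetical bottleneck adapter: down-project, nonlinearity, up-project, skip."""

    def __init__(self, d_model, bottleneck, init_scale=1e-3, seed=0):
        rng = random.Random(seed)
        # Assumption: small random weights keep the module near the identity
        # at the start, so the frozen base network's behavior is preserved.
        self.w_down = [[rng.gauss(0.0, init_scale) for _ in range(bottleneck)]
                       for _ in range(d_model)]
        self.b_down = [0.0] * bottleneck
        self.w_up = [[rng.gauss(0.0, init_scale) for _ in range(d_model)]
                     for _ in range(bottleneck)]
        self.b_up = [0.0] * d_model

    def num_parameters(self):
        # Two weight matrices (d*m each) plus two bias vectors (m and d).
        d, m = len(self.b_up), len(self.b_down)
        return 2 * d * m + d + m

    def forward(self, h):
        d, m = len(self.b_up), len(self.b_down)
        # Down-projection to the bottleneck, followed by tanh (an assumption).
        z = [math.tanh(sum(h[i] * self.w_down[i][j] for i in range(d)) + self.b_down[j])
             for j in range(m)]
        # Up-projection back to the model dimension.
        up = [sum(z[j] * self.w_up[j][i] for j in range(m)) + self.b_up[i]
              for i in range(d)]
        # Skip connection: the adapter only adds a small correction to h.
        return [h_i + u_i for h_i, u_i in zip(h, up)]


def added_fraction(base_params, adapter_params, n_adapters):
    """Fraction of extra trainable parameters relative to the frozen base model."""
    return n_adapters * adapter_params / base_params


adapter = Adapter(d_model=8, bottleneck=2)
hidden = [0.5] * 8
output = adapter.forward(hidden)
# Hypothetical sizes: a 1,000,000-parameter base model with 4 such adapters.
fraction = added_fraction(1_000_000, adapter.num_parameters(), 4)
```

In this sketch only the adapter's weights would be trained for a new task, while the base model stays frozen and shared; that is the kind of accounting behind a per-task parameter overhead figure such as the one quoted in the abstract.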

Cite this study

Houlsby et al. (Saturday) studied this question.

www.synapsesocial.com/papers/6a0947ef0e219f8cdd33f325 — DOI: https://doi.org/10.48550/arxiv.1902.00751

Also consider

Synapse has enriched 5 closely related papers on similar questions. Consider them for comparative context:

In Defense of the Triplet Loss for Person Re-Identification · 2017 · 2,895 citations
Contributions to the study of SMS spam filtering · 2011 · 443 citations
BERT and PALs: Projected Attention Layers for Efficient Adaptation in Multi-Task Learning · 2019 · 113 citations
Multitask Learning · 1997 · 6,236 citations
NewsWeeder: Learning to Filter Netnews · 1995 · 2,046 citations

Authors

Neil Houlsby

Andrei Giurgiu

Stanisław Jastrzȩbski

Institutions

Université de Montréal

Google (United States)
