February 2, 2019 · Open Access

Parameter-Efficient Transfer Learning for NLP

Abstract

Fine-tuning large pre-trained models is an effective transfer mechanism in NLP. However, in the presence of many downstream tasks, fine-tuning is parameter inefficient: an entire new model is required for every task. As an alternative, we propose transfer with adapter modules. Adapter modules yield a compact and extensible model; they add only a few trainable parameters per task, and new tasks can be added without revisiting previous ones. The parameters of the original network remain fixed, yielding a high degree of parameter sharing. To demonstrate the effectiveness of adapters, we transfer the recently proposed BERT Transformer model to 26 diverse text classification tasks, including the GLUE benchmark. Adapters attain near state-of-the-art performance while adding only a few parameters per task. On GLUE, we attain within 0.4% of the performance of full fine-tuning, adding only 3.6% parameters per task. By contrast, fine-tuning trains 100% of the parameters per task.
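To make the idea of a small trainable module sitting on top of frozen weights concrete, here is a minimal pure-Python sketch. The abstract does not describe the internals of an adapter, so the bottleneck layout used here (down-projection, ReLU, up-projection, skip connection, near-zero initialization) is an assumption, and the function names, model width, adapter count, and base parameter count are all hypothetical values chosen for illustration only.

```python
import random


def adapter_param_count(d_model: int, bottleneck: int) -> int:
    """Weights plus biases of one assumed bottleneck adapter."""
    down = d_model * bottleneck + bottleneck  # down-projection matrix and bias
    up = bottleneck * d_model + d_model       # up-projection matrix and bias
    return down + up


def make_adapter(d_model: int, bottleneck: int, seed: int = 0, scale: float = 1e-3):
    """Create adapter parameters with near-zero weights (assumed initialization)."""
    rng = random.Random(seed)
    return {
        "w_down": [[rng.gauss(0.0, scale) for _ in range(d_model)]
                   for _ in range(bottleneck)],
        "b_down": [0.0] * bottleneck,
        "w_up": [[rng.gauss(0.0, scale) for _ in range(bottleneck)]
                 for _ in range(d_model)],
        "b_up": [0.0] * d_model,
    }


def adapter_forward(h, params):
    """Apply the adapter to one hidden vector h (a list of floats)."""
    # Project down to the bottleneck and apply ReLU.
    z = [max(0.0, sum(w * x for w, x in zip(row, h)) + b)
         for row, b in zip(params["w_down"], params["b_down"])]
    # Project back up to the model width.
    out = [sum(w * x for w, x in zip(row, z)) + b
           for row, b in zip(params["w_up"], params["b_up"])]
    # Skip connection: with near-zero weights the adapter starts close to identity.
    return [x + o for x, o in zip(h, out)]


def added_fraction(base_params: int, d_model: int, bottleneck: int,
                   n_adapters: int) -> float:
    """Adapter parameters added per task, relative to the frozen base model."""
    return n_adapters * adapter_param_count(d_model, bottleneck) / base_params


if __name__ == "__main__":
    h = [1.0, -2.0, 0.5, 3.0]
    print(adapter_forward(h, make_adapter(d_model=4, bottleneck=2)))
    # Hypothetical sizes, not figures taken from the abstract.
    print(added_fraction(base_params=110_000_000, d_model=768,
                         bottleneck=64, n_adapters=24))
```

Only the adapter parameters would be trained for each new task in this sketch; the base network is shared and untouched, which is why the per-task cost is a small fraction rather than a full copy of the model.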

Authors

Neil Houlsby

Andrei Giurgiu

Stanisław Jastrzȩbski

Institutions

Université de Montréal

Google (United States)

Cite this study

Houlsby et al. (Sat,) studied this question.

www.synapsesocial.com/papers/6a0947ef0e219f8cdd33f325 — DOI: https://doi.org/10.48550/arxiv.1902.00751

Also consider

Synapse has enriched 5 closely related papers on similar clinical questions. Consider them for comparative context:

In Defense of the Triplet Loss for Person Re-Identification · 2017 · 2,895 citations
Contributions to the study of SMS spam filtering · 2011 · 443 citations
BERT and PALs: Projected Attention Layers for Efficient Adaptation in Multi-Task Learning · 2019 · 113 citations
Multitask Learning · 1997 · 6,236 citations
NewsWeeder: Learning to Filter Netnews · 1995 · 2,046 citations
