February 23, 2024 · Open Access

MemoryPrompt: A Light Wrapper to Improve Context Tracking in Pre-trained Language Models

Abstract

Transformer-based language models (LMs) track contextual information through large, hard-coded input windows. We introduce MemoryPrompt, a leaner approach in which the LM is complemented by a small auxiliary recurrent network that passes information to the LM by prefixing its regular input with a sequence of vectors, akin to soft prompts, without requiring LM finetuning. Tested on a task designed to probe an LM's ability to keep track of multiple fact updates, a MemoryPrompt-augmented LM outperforms much larger LMs that have access to the full input history. We also test MemoryPrompt on a long-distance dialogue dataset, where its performance is comparable to that of a model conditioned on the entire conversation history. In both experiments we also observe that, unlike full-finetuning approaches, MemoryPrompt does not suffer from catastrophic forgetting when adapted to new tasks, thus not disrupting the generalist capabilities of the underlying LM.
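The data flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the frozen LM is replaced by a hypothetical pooling function (`frozen_lm`), the recurrent module is a plain Elman-style cell with random, untrained weights, and the names `DIM`, `N_MEMORY`, `RecurrentMemory`, and `run_segments` are all assumptions made for illustration. It shows only how a recurrent state carried across input segments could be turned into prefix vectors for the next segment while the LM itself stays untouched.

```python
import math
import random

DIM = 4        # toy embedding width (assumption, not from the paper)
N_MEMORY = 2   # number of prefix vectors per segment (assumption)


def dot(row, vec):
    """Dot product of two equal-length lists of floats."""
    return sum(r * v for r, v in zip(row, vec))


def frozen_lm(input_vectors):
    """Hypothetical stand-in for a frozen LM: pools its input into one vector.

    A real LM would return hidden states per position; its weights are
    never updated in this sketch.
    """
    n = len(input_vectors)
    return [math.tanh(sum(v[i] for v in input_vectors) / n) for i in range(DIM)]


class RecurrentMemory:
    """Small Elman-style recurrent cell; the only part that would be trained."""

    def __init__(self, seed=0):
        rng = random.Random(seed)

        def matrix():
            return [[rng.uniform(-0.5, 0.5) for _ in range(DIM)] for _ in range(DIM)]

        self.w_in = matrix()                              # input-to-state weights
        self.w_rec = matrix()                             # state-to-state weights
        self.w_out = [matrix() for _ in range(N_MEMORY)]  # one projection per prefix vector
        self.state = [0.0] * DIM

    def step(self, summary):
        """Update the recurrent state and emit N_MEMORY prefix vectors."""
        self.state = [
            math.tanh(dot(self.w_in[i], summary) + dot(self.w_rec[i], self.state))
            for i in range(DIM)
        ]
        return [[dot(row, self.state) for row in w] for w in self.w_out]


def run_segments(segments, seed=0):
    """Feed segments in order, prefixing each with the current memory vectors."""
    memory = RecurrentMemory(seed)
    prefix = [[0.0] * DIM for _ in range(N_MEMORY)]  # empty memory before the first segment
    input_lengths = []
    for segment in segments:
        lm_input = prefix + segment        # soft-prompt-style prefix, then regular input
        input_lengths.append(len(lm_input))
        summary = frozen_lm(lm_input)      # the LM is only read, never modified
        prefix = memory.step(summary)      # memory for the next segment
    return prefix, input_lengths
```

Calling `run_segments` with two segments of 3 and 5 embedding vectors gives the stand-in LM inputs of length 5 and 7, since each segment is extended by `N_MEMORY` prefix vectors. Training the recurrent module, which this sketch omits, would adjust only `w_in`, `w_rec`, and `w_out`.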

Cite this study

Rakotonirina and Baroni (2024) studied this question.

www.synapsesocial.com/papers/68e77f50b6db6435876f2dda — DOI: https://doi.org/10.48550/arxiv.2402.15268

Authors

Nathanaël Carraz Rakotonirina

Marco Baroni
