July 1, 2024 · Open Access

Memory³: Language Modeling with Explicit Memory

Key Points

The model outperforms much larger LLMs, which indicates that explicit memory is effective in language models.
A 2.4B-parameter language model trained from scratch also outperforms retrieval-augmented generation (RAG) models and decodes faster than RAG.
A memory circuitry theory supports externalizing knowledge: with most knowledge held in explicit memory, parameter size, training cost, and inference cost can all shrink in proportion to the remaining abstract knowledge.

Abstract

The training and inference of large language models (LLMs) are together a costly process that transports knowledge from raw data to meaningful computation. Inspired by the memory hierarchy of the human brain, we reduce this cost by equipping LLMs with explicit memory, a memory format cheaper than model parameters and text retrieval-augmented generation (RAG). Conceptually, with most of its knowledge externalized to explicit memories, the LLM can enjoy a smaller parameter size, training cost, and inference cost, all proportional to the amount of remaining "abstract knowledge". As a preliminary proof of concept, we train from scratch a 2.4B LLM, which achieves better performance than much larger LLMs as well as RAG models, and maintains higher decoding speed than RAG. The model is named Memory³, since explicit memory is the third form of memory in LLMs after implicit memory (model parameters) and working memory (context key-values). We introduce a memory circuitry theory to support the externalization of knowledge, and present novel techniques including a memory sparsification mechanism that makes storage tractable and a two-stage pretraining scheme that facilitates memory formation.
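
The distinction between working memory (context key-values) and explicit memory can be illustrated with a toy attention step. The sketch below is not the authors' implementation: it assumes, for illustration only, that explicit memories are stored as precomputed key-value pairs that are placed alongside the context key-values when a query attends. The function names `attend` and `attend_with_explicit_memory` and the two-dimensional vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


def attend_with_explicit_memory(query, context_kv, memory_kv):
    """Attend over retrieved memory key-values plus the context key-values.

    Assumption (not from the source): explicit memory is a list of
    precomputed (key, value) pairs that can be prepended to the context
    key-values without re-encoding any reference text.
    """
    pairs = list(memory_kv) + list(context_kv)
    keys = [k for k, _ in pairs]
    values = [v for _, v in pairs]
    return attend(query, keys, values)


# Working memory: key-values computed from the current context.
context_kv = [([1.0, 0.0], [1.0, 0.0])]
# Explicit memory: key-values assumed to be computed offline and retrieved.
memory_kv = [([0.0, 1.0], [0.0, 1.0])]

query = [0.0, 10.0]  # a query that matches the memory key far better
with_memory = attend_with_explicit_memory(query, context_kv, memory_kv)
without_memory = attend_with_explicit_memory(query, context_kv, [])
```

In this toy setting the output with memory is dominated by the retrieved value, while the output without memory can only return the context value. The point of the sketch is the data flow, in which retrieved key-values are consumed directly by attention rather than being re-read as text as in RAG; it says nothing about the paper's sparsification or retrieval details.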

Cite this study

Yang et al. (2024) studied this question.

www.synapsesocial.com/papers/68e61f51b6db6435875b1bc3 — DOI: https://doi.org/10.48550/arxiv.2407.01178

Also consider

Synapse has enriched 5 closely related papers on similar questions. Consider them for comparative context:

MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory · 2024 · 2 citations
Cognitive Memory in Large Language Models · 2025
Needle in the Haystack for Memory Based Large Language Models · 2024 · 5 citations
Theoretical Foundations for Memory-Hierarchical Local Inference of Large Language Models · 2026
MemVault: A Three-Layer Hierarchical Memory Management System for Cost-Optimized LLM Applications

Authors

Hongkang Yang

Zehao Lin

Wenjin Wang
