We present a two-phase sleep architecture for memory consolidation in language models: slow-wave sleep (SWS) consolidates individual facts via per-fact LoRA training, and REM sleep integrates knowledge via synthetic multi-fact conversations. We introduce per-fact staged consolidation, in which each fact advances independently through stages 0-3 based on its own chat recall test, replacing all-or-nothing per-edit gating. Key findings: (1) MEMIT achieves near-zero perplexity cost for fact injection; (2) REM reduces SWS-induced perplexity damage by 88% at 3B; (3) per-fact gating achieves 95% consolidation success at 8B; and (4) we discover a pathway separation in which MEMIT edits the raw completion pathway while LoRA edits the chat pathway. We validate across 3B, 8B, and 70B models, demonstrating that the graduated MEMIT dissolution schedule (scale 1.0 -> 0.5 -> 0.1 -> 0.0) successfully transfers knowledge from MEMIT to LoRA.
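The per-fact gating described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `FactState`, `advance_facts`, and `MEMIT_SCALE_BY_STAGE` are hypothetical, the one-to-one mapping of stages 0-3 onto the scales 1.0, 0.5, 0.1, 0.0 is an assumption based on the four values in the schedule, and holding a fact at its current stage after a failed recall test is also an assumption.

```python
from dataclasses import dataclass

# Assumption: stage i uses the i-th value of the dissolution schedule
# quoted in the abstract (1.0 -> 0.5 -> 0.1 -> 0.0).
MEMIT_SCALE_BY_STAGE = (1.0, 0.5, 0.1, 0.0)
MAX_STAGE = len(MEMIT_SCALE_BY_STAGE) - 1  # stage 3


@dataclass
class FactState:
    """Hypothetical per-fact bookkeeping record."""
    fact_id: str
    stage: int = 0

    @property
    def memit_scale(self) -> float:
        # Scale applied to this fact's MEMIT edit at its current stage.
        return MEMIT_SCALE_BY_STAGE[self.stage]


def advance_facts(facts, recall_passed):
    """Advance each fact one stage only if its own recall test passed.

    `recall_passed` maps fact_id -> bool. A fact that fails (or has no
    result) stays at its current stage, so one failing fact does not
    block the others, unlike all-or-nothing per-edit gating.
    """
    for fact in facts:
        if recall_passed.get(fact.fact_id, False) and fact.stage < MAX_STAGE:
            fact.stage += 1
    return facts


facts = [FactState("capital_of_x"), FactState("author_of_y")]
advance_facts(facts, {"capital_of_x": True, "author_of_y": False})
```

After this call the first fact sits at stage 1 with a MEMIT scale of 0.5, while the second remains at stage 0 with scale 1.0. Keeping the state per fact is what lets the MEMIT contribution for well-recalled facts be reduced without waiting for the rest.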
Vladimir Baranov (Wed,) studied this question.
www.synapsesocial.com/papers/69a286c90a974eb0d3c02042 — DOI: https://doi.org/10.5281/zenodo.18778765