Language models that learn from conversation via direct weight editing (MEMIT) face a hard capacity ceiling: the 8B Llama model sustains reliable recall for only ~13 unconstrained edits before cascading interference collapses performance. Prior attempts to offload knowledge into LoRA adapters failed: the alignment tax (37% recall degradation on 8B) blocks the transfer pathway, and per-edit gating produced 0% advancement. We resolve both failures with per-fact graduated consolidation: each fact independently tracks its consolidation stage, a graduated dissolution schedule (1.0 -> 0.5 -> 0.1 -> 0.0) progressively reduces MEMIT influence, and cumulative fusing -- training each cycle on an already-fused model -- overcomes the alignment tax through incremental prior erosion. In a capacity sweep on Llama 3.1 8B (4-bit, 2xH100) with 5, 10, 15, 20 facts across 3 sleep cycles, every condition achieves 100% advancement rate and 1.00 chat recall. MEMIT edits dissolve as designed, making the buffer renewable: effective lifetime capacity becomes unbounded. This is Paper 6 in the Sleeping LLM series, superseding the MEMIT-only architecture of Paper 5.
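The per-fact bookkeeping described above can be illustrated with a minimal sketch. Only the schedule values (1.0, 0.5, 0.1, 0.0) come from the abstract; the names `FactState`, `sleep_cycle`, and the caller-supplied `recalled` check are hypothetical, and the rule "advance one stage per cycle when the recall check passes" is an assumption about how advancement is gated, not the paper's confirmed implementation. The LoRA training, fusing, and MEMIT rescaling steps are omitted entirely.

```python
from dataclasses import dataclass
from typing import Callable, List

# Schedule from the abstract: MEMIT scale applied at each consolidation stage.
DISSOLUTION_SCHEDULE = (1.0, 0.5, 0.1, 0.0)


@dataclass
class FactState:
    """Hypothetical per-fact record; each fact tracks its own stage."""
    fact_id: str
    stage: int = 0  # index into DISSOLUTION_SCHEDULE

    @property
    def memit_scale(self) -> float:
        # Fraction of the original MEMIT edit still applied for this fact.
        return DISSOLUTION_SCHEDULE[self.stage]

    @property
    def consolidated(self) -> bool:
        # True once the MEMIT edit has fully dissolved (scale 0.0).
        return self.stage == len(DISSOLUTION_SCHEDULE) - 1


def sleep_cycle(facts: List[FactState], recalled: Callable[[str], bool]) -> float:
    """Run one cycle of stage bookkeeping and return the advancement rate.

    `recalled` is an assumed stand-in for a recall test on the fused model.
    A fact that fails the check keeps its current stage; others are unaffected.
    """
    pending = [f for f in facts if not f.consolidated]
    advanced = 0
    for fact in pending:
        if recalled(fact.fact_id):
            fact.stage += 1
            advanced += 1
    # With nothing left to advance, report 1.0 by convention (an assumption).
    return advanced / len(pending) if pending else 1.0


# Usage: 5 facts, 3 sleep cycles, every recall check passing.
facts = [FactState(f"fact-{i}") for i in range(5)]
rates = [sleep_cycle(facts, lambda _fact_id: True) for _ in range(3)]
```

Under these assumptions, three cycles walk every fact through the full schedule, so each `memit_scale` ends at 0.0 and the buffer slot could be reused. Keeping the stage on the fact rather than on the whole batch is what lets one fact lag without blocking the rest.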
Vladimir Baranov (Wed,) studied this question.
www.synapsesocial.com/papers/69a286b80a974eb0d3c01dcb — DOI: https://doi.org/10.5281/zenodo.18779159