April 23, 2026 | Open Access

Agent Brain: A Biologically Inspired Memory System for Autonomous AI Agents — LongMemEval-M Evaluation

Key Points

The report presents and evaluates Agent Brain, a memory system for autonomous AI agents modeled on human cognition.
It describes the system's design as eleven successive layers covering perception, storage, retrieval, consolidation, and forgetting.
The system is evaluated on the LongMemEval-M benchmark (weaviate/longmemeval-m-cleaned, 500 QA pairs, GPT-4o judge), with results reported against the authors' own control system.
Components include Named Entity Recognition, a Knowledge Graph, Hybrid Search via Reciprocal Rank Fusion, Cross-Encoder Re-Ranking, and a nightly five-phase Dream Cycle for consolidation.
Agent Brain achieved 71.7% accuracy without consolidation and 69.8% with the Dream Cycle enabled.
The authors' own pgvector-only control reached 72.2–73.9%, up to 2.2 pp above the hybrid pipeline without consolidation.
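The gap between the reported scores can be worked out directly from the figures above. The following is a minimal arithmetic sketch using only the accuracies stated in the key points; the variable and function names are illustrative and do not come from the report.

```python
# Accuracies (percent) as reported for LongMemEval-M.
agent_brain = {"no_consolidation": 71.7, "dream_cycle": 69.8}
control_range = (72.2, 73.9)  # pgvector-only control, low and high end


def gap_pp(system_acc: float, control_acc: float) -> float:
    """Percentage-point gap by which the control exceeds the system."""
    return round(control_acc - system_acc, 1)


# For each configuration: (gap vs. low end, gap vs. high end) of the control.
gaps = {
    name: (gap_pp(acc, control_range[0]), gap_pp(acc, control_range[1]))
    for name, acc in agent_brain.items()
}

for name, (low, high) in gaps.items():
    print(f"{name}: control leads by {low} to {high} pp")
```

Under these figures the 2.2 pp gap corresponds to the high end of the control range compared with the configuration without consolidation.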

Abstract

Version 3: corrects cross-system comparisons from v1/v2. See §0 Changelog in the PDF.

This technical report describes Agent Brain, a biologically inspired memory system for autonomous AI agents. In contrast to stateless Large Language Model interactions, Agent Brain provides persistent, weighted, and self-organizing memory that emulates human cognitive processes: perception, storage, retrieval, consolidation, and forgetting. The system integrates eleven successive layers: a Perception Gate, a Deduplication Guard, typed memory storage (episodic/semantic/procedural), Named Entity Recognition (flair/ner-german-large, F1 92.31%), a Knowledge Graph, LLM-based Query Expansion, Hybrid Search via Reciprocal Rank Fusion, Cross-Encoder Re-Ranking, an implicit Feedback Loop based on the Free Spaced Repetition Scheduler (FSRS), a nightly five-phase Dream Cycle, and complete Workspace Isolation with Row-Level Security.

Evaluation on LongMemEval-M. On the public weaviate/longmemeval-m-cleaned benchmark (500 QA pairs across 510 multi-turn workspaces, GPT-4o judge), Agent Brain achieves 71.7% accuracy without consolidation and 69.8% with the Dream Cycle enabled. Our own pgvector-only control reaches 72.2–73.9%, which we report transparently as a gap of up to 2.2 pp versus our hybrid pipeline on quiz-style questions. To our knowledge these are the first published numbers on the m-cleaned variant; peer numbers from Zep, Mem0, LangMem, and OpenAI Memory exist only on the LongMemEval-S variant and are therefore not directly comparable. §15 discusses what is and is not known about cross-system ranking on this benchmark.

What changed vs v2: v2 contained two errors that are corrected here. First, the "Zep 63.8%" figure was the baseline row of Rasmussen et al. 2025 Table 2, not Zep itself; Zep's reported score on LongMemEval-S with gpt-4o-mini is 71.2%. Second, cross-system comparisons mixed LongMemEval-S peer numbers with our LongMemEval-M result. v3 removes the "state-of-the-art" claim, reports 71.7% as a single-system self-report on a clearly specified variant, and explicitly abstains from cross-system ranking until peers are re-evaluated on m-cleaned under identical judging.

The system has been in production use since early 2026 for Swiss property management (Immobilienbewirtschaftung) with over 5,000 memories, 10,000 entities, and eight specialized agents.

Reproducibility: All evaluation scripts, ingestion code, and judge configurations are released under the MIT license at github.com/AgentBrainHQ/agentbrain-benchmarks.
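To illustrate the Hybrid Search via Reciprocal Rank Fusion layer named in the abstract, here is a minimal sketch of the standard RRF formula, in which each document scores the sum of 1 / (k + rank) over the rankings that contain it. This is not the Agent Brain implementation: the function name, the memory IDs, and the constant k = 60 (a commonly used default) are assumptions, since the abstract gives no parameters.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked ID lists into one list using standard RRF.

    k=60 is an assumed default; the report's abstract does not state a value.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top hit in a list contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a deterministic order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical memory IDs from a vector search and a keyword search.
vector_hits = ["m3", "m1", "m7"]
keyword_hits = ["m1", "m9", "m3"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Because RRF uses only rank positions, it can combine retrievers whose raw scores are on different scales, such as cosine similarity and a keyword relevance score, without normalizing them first.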

Authors

Theshoth Sritharan

Institutions

Goldman Sachs (United States)