This paper addresses the challenge of providing operational access to current metadata in complex, ever-changing relational data warehouses. Traditional catalogs struggle to keep up with changes in schemas, code, and processes. The paper presents a methodological approach based on a dual-loop architecture with ReAct agents and retrieval-augmented generation. The first loop, managed by an Ingestion Agent, continuously updates the semantic layer by automatically analyzing changes. The second loop uses an Assistant Agent to give analysts, developers, and support engineers an intelligent interface. This interface combines semantic search over a vector database with direct execution of diagnostic queries through an extensible set of tools. The main goal is to create a self-updating metadata ecosystem that provides operational access to contextual information for different user groups. The approach’s practical effectiveness is demonstrated through end-to-end scenarios, such as creating complex queries based on business terms or diagnosing extract-transform-load processes.
Martynov et al. studied this question.
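To make the dual-loop idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes an in-memory dictionary in place of the vector database, a bag-of-words vector in place of a real embedding model, and keyword matching in place of an LLM-driven ReAct policy. All names (`SemanticLayer`, `ingestion_step`, `assistant_step`, the table names, and the tool trigger) are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words counts (assumption: stands in for a real model)."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


@dataclass
class SemanticLayer:
    """In-memory stand-in for the vector database (hypothetical)."""
    docs: Dict[str, Tuple[str, Counter]] = field(default_factory=dict)

    def upsert(self, key: str, description: str) -> None:
        # Re-embedding on every upsert keeps the entry in sync with the warehouse.
        self.docs[key] = (description, embed(description))

    def search(self, query: str, k: int = 1) -> List[Tuple[str, str]]:
        q = embed(query)
        ranked = sorted(
            self.docs.items(), key=lambda kv: cosine(q, kv[1][1]), reverse=True
        )
        return [(key, desc) for key, (desc, _) in ranked[:k]]


def ingestion_step(layer: SemanticLayer, change: Dict[str, str]) -> None:
    """Loop 1: turn one detected change into an updated semantic-layer entry."""
    layer.upsert(change["object"], change["description"])


def assistant_step(
    layer: SemanticLayer,
    question: str,
    tools: Dict[str, Callable[[str], str]],
) -> Dict[str, Optional[str]]:
    """Loop 2: retrieve context, then pick at most one tool (one ReAct-style step)."""
    key, desc = layer.search(question, k=1)[0]
    words = embed(question)
    action = None
    for trigger, tool in tools.items():
        if trigger in words:
            # The tool only builds the diagnostic query text; nothing is executed here.
            action = tool(key)
            break
    return {"object": key, "context": desc, "action": action}
```

A possible usage, again with invented metadata: register two warehouse objects through `ingestion_step`, then call `assistant_step(layer, "why has the etl load failed", tools)` with `tools = {"failed": lambda obj: f"SELECT * FROM {obj} WHERE status = 'FAILED'"}`. The retrieval step selects the closest object description, and the tool produces a diagnostic query for that object. In a real system the keyword trigger would be replaced by the agent's reasoning step and the returned query would be run against the warehouse.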