Abstract: With the growing prevalence of data-driven decision-making, Text-to-SQL has emerged as a promising way to lower the barrier to data access: it translates natural language queries into executable SQL statements, making databases easier for users to interact with. Despite notable progress driven by deep learning and large language models, significant challenges persist in handling complex queries. This paper presents a comprehensive review of the Text-to-SQL task, structured around two core stages: natural language understanding and natural language translation. Methods are grouped by their technical evolution into four types: rule-based, machine learning-based, pre-trained language model-based, and large language model-based approaches. Unlike previous surveys, which focus on specific techniques or partial aspects of Text-to-SQL, our work offers a two-stage analytical framework, highlights the impact of large models, and provides a comparative analysis of limitations and trade-offs. By examining accuracy, generalization, expressiveness, and computational cost in detail, this survey identifies the advantages and disadvantages of each paradigm. The paper also summarizes key benchmark datasets and evaluation metrics, and discusses directions for improving the robustness, security, and effectiveness of existing Text-to-SQL systems.
Mengyi Liu
Xieyang Wang
Jianqiu Xu
Frontiers of Computer Science
Liu et al. studied this question.
www.synapsesocial.com/papers/69ada885bc08abd80d5bb8bb — DOI: https://doi.org/10.1007/s11704-025-50592-w