What question did this study set out to answer?

This review aims to analyze the Text-to-SQL task, focusing on its methodologies and challenges in natural language processing.

March 8, 2026 · Open Access

A systematic review of natural language interfaces for databases

Key Points

Conducted a comprehensive review of Text-to-SQL approaches.
Categorized methods into rule-based, machine learning, pre-trained, and large language model-based.
Analyzed frameworks emphasizing natural language understanding and translation.
Examined key benchmark datasets and evaluation metrics.
Identified advantages and limitations of various Text-to-SQL methodologies.
Highlighted the impact of large language models on system effectiveness.
Discussed challenges in handling complex queries and suggested directions for improvement.

Abstract

With the growing prevalence of data-driven decision-making, Text-to-SQL has emerged as a promising solution to lower the barrier to data access by translating natural language queries into executable SQL statements, thereby enhancing user interaction with databases. Despite notable progress driven by deep learning and large language models, significant challenges persist in handling complex queries. This paper presents a comprehensive review of the Text-to-SQL task, structured around two core stages: natural language understanding and natural language translation. Methods are categorized along the technical evolution trajectory into four types: rule-based, machine learning-based, pre-trained language model-based, and large language model-based approaches. Unlike previous surveys, which focus on specific techniques or partial aspects of Text-to-SQL, our work offers a two-stage analytical framework, highlights the impact of large models, and provides a comparative analysis of limitations and trade-offs. Through detailed examination of accuracy, generalization, expressiveness, and computational cost, this survey presents insights into the advantages and disadvantages of each paradigm. Furthermore, the paper summarizes key benchmark datasets and evaluation metrics, and discusses directions to improve the robustness, security, and effectiveness of existing Text-to-SQL systems.

Authors

Mengyi Liu

Xieyang Wang

Jianqiu Xu

Journals

Frontiers of Computer Science
