This thesis develops a SQL-to-MongoDB translation algorithm for a two-stage Text-to-NoSQL pipeline. The translator, embedded in the UnityJDBC middleware, converts SQL produced by Text-to-SQL models into MongoDB Query Language (MQL) that is executable against a MongoDB instance. This work extends the baseline translator, which handled only simple SELECT–WHERE–LIMIT queries, with native support for GROUP BY, HAVING, DISTINCT combined with ORDER BY, set operations (UNION, INTERSECT, EXCEPT), IN/NOT IN with correlated subqueries, scalar subqueries, arithmetic and string-concatenation expressions in projections, and JOINs with correct field resolution. Cross-cutting fixes address case-insensitive schema lookup, cross-branch table resolution for set operations, and a NullPointerException (NPE) in the planner. Two SQL preprocessing passes additionally normalise double-quoted string literals and quote the reserved identifier Rank before the SQL reaches UnityJDBC. On the TEND benchmark of 2775 SQL–MQL pairs derived from Spider and BIRD [LSQ⁺25], using SQL predictions from DAIL-SQL [GWL⁺23], these changes reduced translation failures from 1021 to 32, a 96.9% reduction, and improved the translation success rate from 63.2% to 98.9% with zero regressions. Execution accuracy on the same benchmark rose from 18.0% to 40.8%, nearly four times the 10.8% reported for the rule-based Grammar Converter in Lu et al. [LSQ⁺25] and within 4.0 percentage points of their LLM-based SQL-to-MQL converter, at a fraction of the inference cost.
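To make the GROUP BY and HAVING case concrete, the following minimal sketch shows one common way such a query can be expressed as a MongoDB aggregation pipeline. It is an illustration only, not the thesis's translator: the function name `group_by_having_pipeline`, its parameters, and the example field names are hypothetical, and the stage layout ($group, then $match, then $project) is a standard MongoDB pattern assumed here rather than taken from the source.

```python
from typing import Any, Dict, List


def group_by_having_pipeline(
    group_field: str, count_alias: str, min_count: int
) -> List[Dict[str, Any]]:
    """Hypothetical helper: build an aggregation pipeline for

        SELECT <group_field>, COUNT(*) AS <count_alias>
        FROM t
        GROUP BY <group_field>
        HAVING COUNT(*) > <min_count>

    The table name is not needed here because the pipeline is run
    against a specific collection.
    """
    return [
        # GROUP BY: one output document per distinct value of group_field,
        # counting the documents in each group.
        {"$group": {"_id": f"${group_field}", count_alias: {"$sum": 1}}},
        # HAVING: filter on the aggregate, so it must come after $group.
        {"$match": {count_alias: {"$gt": min_count}}},
        # SELECT list: rename _id back to the grouped column name.
        {"$project": {"_id": 0, group_field: "$_id", count_alias: 1}},
    ]


# Example with assumed field names: countries with more than 5 singers.
pipeline = group_by_having_pipeline("country", "n", 5)
```

With a driver such as PyMongo, a list like `pipeline` would typically be passed to `collection.aggregate(pipeline)`. The sketch covers only a single COUNT(*) aggregate; the expressions, subqueries, set operations, and JOINs listed in the abstract need considerably more machinery.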
Tahsin Jawwad
Tahsin Jawwad (Thu,) studied this question.
www.synapsesocial.com/papers/69fd7fa1bfa21ec5bbf0833a — DOI: https://doi.org/10.14288/1.0452105