A Survey of Large Language Model-Based Generative AI for Text-to-SQL: Benchmarks, Applications, Use Cases, and Challenges

Subjects: cs.AI, cs.DB
Published: December 6, 2024
Authors: Aditi Singh1, Akash Shetty1, Abul Ehtesham2, Saket Kumar3, Tala Talaei Khoei4
Affiliations: 1Cleveland State University; 2The Davey Tree Expert Company; 3The MathWorks; 4Khoury College of Computer Sciences, Roux Institute at Northeastern University

| Model Name | Dataset | Training Method | Reported Accuracy |
|---|---|---|---|
| Seq2SQL | WikiSQL | Seq-to-Seq with Reinforcement Learning | 59.4% |
| SQLNet | WikiSQL | Sketch-Based with Column Attention | 63.2% |
| TypeSQL | WikiSQL | Type-Aware Neural Network | 82.6% |
| IRNet | Spider | Graph Encoder + Intermediate Representation | 61.9% |
| T5-3B | Spider, CoSQL | Fine-Tuned Transformer | 70.0% |
| PICARD + T5-3B | CoSQL | Constrained Decoding for Dialogue-Based SQL Generation | High |
| RASAT+PICARD | CoSQL | Relation-Aware Self-Attention-augmented T5 with Incremental Parsing | 37.4% (interaction execution accuracy) |
| MedT5SQL | MIMICSQL | BERT-based Encoder with LSTM Decoder for SQL Translation | High Accuracy in Medical Query Translation |
| EDU-T5 | Custom Educational Dataset | Fine-tuned T5 Model with Cross-Attention for SQL Query Generation | Optimized |
| RAT-SQL | WikiSQL, Spider | Relation-Aware Transformer | 69.7% |
| SQLova | WikiSQL | BERT + Column Attention | 95% |
| X-SQL | WikiSQL | BERT-style pre-training with context | 91.8% |
| EHRSQL | EHRSQL Benchmark | Benchmark Model for EHRs | N/A |
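Of the training methods listed above, PICARD's constrained decoding is the easiest to illustrate in isolation: at each generation step, candidate tokens that would make the partial SQL output unparseable are rejected before the highest-scoring survivor is kept. The sketch below is a toy stand-in, not PICARD itself — the validity check is a few hand-written rules rather than a real incremental SQL parser, and `mock_scores` is a hypothetical scoring function standing in for a fine-tuned T5.

```python
KEYWORDS = ["SELECT", "FROM", "WHERE"]

def is_valid_prefix(tokens):
    """Tiny stand-in for PICARD's incremental parser: clause keywords must
    appear in canonical order, at most once each, and never back-to-back."""
    seen = [t for t in tokens if t in KEYWORDS]
    if seen != KEYWORDS[: len(seen)]:
        return False
    return all(not (a in KEYWORDS and b in KEYWORDS)
               for a, b in zip(tokens, tokens[1:]))

def constrained_greedy_decode(score_fn, vocab, max_len=10):
    """Greedy decoding that vetoes any token whose extended prefix fails
    the incremental validity check, then takes the best remaining token."""
    out = []
    for _ in range(max_len):
        for tok in sorted(vocab, key=lambda t: score_fn(out, t), reverse=True):
            if is_valid_prefix(out + [tok]):
                out.append(tok)
                break
        if out and out[-1] == "<eos>":
            break
    return " ".join(out[:-1]) if out and out[-1] == "<eos>" else " ".join(out)

# Hypothetical model scores: "WHERE" is deliberately over-rated so the
# grammar filter has something to veto; otherwise the next token of a
# target query scores highest.
TARGET = ["SELECT", "name", "FROM", "users", "WHERE", "age", "<eos>"]
VOCAB = sorted(set(TARGET))

def mock_scores(prefix, token):
    if token == "WHERE":
        return 1.5
    if len(prefix) < len(TARGET) and token == TARGET[len(prefix)]:
        return 1.0
    return 0.0

print(constrained_greedy_decode(mock_scores, VOCAB))
# → SELECT name FROM users WHERE age
```

Even though the raw model prefers "WHERE" at every step, the validity filter only lets it through at the one position where a WHERE clause is grammatical — the same mechanism, with a real parser and a real language model, that lets PICARD guarantee well-formed SQL in dialogue-based generation.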