TQA-Bench: Evaluating LLMs for Multi-Table Question Answering with Scalable Context and Symbolic Extension
cs.AI
cs.CL
cs.IR
Released Date: November 29, 2024
Authors: Zipeng Qiu¹, You Peng¹, Guangxin He¹, Binhang Yuan¹, Chen Wang²
Affiliations: ¹HKUST; ²Tsinghua University

| Database Name | Source | Table Count | Average Columns | Average Rows | Total Cells |
|---|---|---|---|---|---|
| airline | BIRD | 3 | 10.67 | | |
| food_inspection | BIRD | 3 | 8.33 | | |
| movie | BIRD | 3 | 9.00 | | |
| music_tracker | BIRD | 2 | 5.00 | | |
| restaurant | BIRD | 3 | 4.00 | | |
| university | BIRD | 6 | 3.33 | | |
| cookbook | BIRD | 4 | 9.75 | | |
| food_facility_inspections | DataGov | 3 | 13.67 | | |
| water_quality | DataGov | 4 | 9.75 | | |
| global_biodiversity | WorldBank | 2 | 15.50 | | |
| Overall Average | - | 3.3 | 8.36 | | |
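
The page does not state how the "Overall Average" row is derived. The following is a minimal sketch, assuming that the table count is a plain mean over the ten databases and that the column figure is a mean over all 33 individual tables (each database's average weighted by its table count). Under that assumption the listed values 3.3 and 8.36 are reproduced, whereas an unweighted mean of the ten per-database column averages would give 8.90. The function names are illustrative and do not come from the paper.

```python
# (database name, table count, average columns per table), copied from the table above.
DATABASES = [
    ("airline", 3, 10.67),
    ("food_inspection", 3, 8.33),
    ("movie", 3, 9.00),
    ("music_tracker", 2, 5.00),
    ("restaurant", 3, 4.00),
    ("university", 6, 3.33),
    ("cookbook", 4, 9.75),
    ("food_facility_inspections", 3, 13.67),
    ("water_quality", 4, 9.75),
    ("global_biodiversity", 2, 15.50),
]


def mean_table_count(dbs):
    """Plain mean of the table counts across databases."""
    return sum(count for _, count, _ in dbs) / len(dbs)


def weighted_avg_columns(dbs):
    """Mean columns per table over all tables (assumed weighting by table count)."""
    total_tables = sum(count for _, count, _ in dbs)
    total_columns = sum(count * avg for _, count, avg in dbs)
    return total_columns / total_tables


def unweighted_avg_columns(dbs):
    """Plain mean of the per-database column averages, shown for comparison."""
    return sum(avg for _, _, avg in dbs) / len(dbs)


if __name__ == "__main__":
    print(f"mean table count:        {mean_table_count(DATABASES):.1f}")
    print(f"weighted avg columns:    {weighted_avg_columns(DATABASES):.2f}")
    print(f"unweighted avg columns:  {unweighted_avg_columns(DATABASES):.2f}")
```

The Average Rows and Total Cells columns are empty in this copy of the table, so the sketch does not attempt to reproduce them.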