Published on October 30

BIS: NL2SQL Service Evaluation Benchmark for Business Intelligence Scenarios

cs.AI

Release Date: October 30, 2024

Authors: Bora Caglayan¹, Mingxue Wang¹, John D. Kelleher², Shen Fei³, Gui Tong³, Jiandong Ding³, Puchao Zhang¹

Affiliations: ¹Huawei Ireland Research Centre, Dublin, Ireland; ²ADAPT Research Centre, School of Computer Science and Statistics, Trinity College Dublin, Dublin, Ireland; ³Huawei Technologies Co., Ltd.

arXiv: http://arxiv.org/abs/2410.22925v1

Benchmark	Definition	BI Question Categories	SQL Query Evaluation Metrics	SQL Result Evaluation Metrics	Number of Questions
WikiSQL [13]	Generated questions from Wikipedia	✗	SQL logical form exact match	Result exact match	80645
Spider [12]	Student-generated database covering a wide range of domains	✗	SQL component exact match	Result exact match	10181
Yelp & IMDB [11]	Yelp website and the Internet Movie Database	✗	✗	✗	259
MAS [6]	Microsoft academic search database	✗	✗	✗	196
Advising [3]	Course information	✗	SQL component exact match	✗	4579
CSpider [8]	Spider benchmark for Chinese	✗	SQL component exact match	✗	9691
BIS (our benchmark)	Benchmark with temporal information for BI applications	✓	SQL semantic similarity	Result partial similarity	239