M$^{3}$D: A Multimodal, Multilingual and Multitask Dataset for Grounded Document-level Information Extraction
cs.CL
Released Date: December 5, 2024
Authors: Jiang Liu1, Bobo Li1, Xinran Yang1, Na Yang1, Hao Fei2, Mingyao Zhang1, Fei Li3, Donghong Ji1
Aff.: 1Wuhan University; 2National University of Singapore; 3Wuxi, China

| Lang. | Statistic | Train | Dev | Test |
|---|---|---|---|---|
| EN | # Doc. | 1,644 | 205 | 207 |
| | # Ent. | 27,843 | 3,457 | 3,587 |
| | # Rel. | 11,410 | 1,388 | 1,364 |
| | # Cha. | 15,031 | 1,885 | 1,882 |
| | # Gro. | 9,812 | 1,171 | 1,198 |
| ZH | # Doc. | 1,629 | 203 | 205 |
| | # Ent. | 22,043 | 2,685 | 2,728 |
| | # Rel. | 8,372 | 985 | 1,049 |
| | # Cha. | 11,481 | 1,397 | 1,419 |
| | # Gro. | 3,726 | 505 | 482 |
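
As a quick sanity check on the split sizes, the document counts in the table can be turned into per-split proportions. The sketch below is illustrative only: the `doc_counts` dictionary and the `split_shares` helper are hypothetical names introduced here, not part of any released M$^{3}$D tooling, and the numbers are copied by hand from the `# Doc.` rows above.

```python
# Document counts per split, copied from the "# Doc." rows of the table.
# The dictionary layout is an assumption made for this example.
doc_counts = {
    "EN": {"train": 1644, "dev": 205, "test": 207},
    "ZH": {"train": 1629, "dev": 203, "test": 205},
}


def split_shares(counts):
    """Return the total count and each split's fraction of that total."""
    total = sum(counts.values())
    return total, {split: n / total for split, n in counts.items()}


for lang, counts in doc_counts.items():
    total, shares = split_shares(counts)
    summary = ", ".join(f"{split} {share:.1%}" for split, share in shares.items())
    print(f"{lang}: {total} documents ({summary})")
```

For both languages the document counts work out to roughly an 80/10/10 train/dev/test split. The same helper could be applied to the other rows of the table.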