notesum.ai
Published on December 10
OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations
cs.CV
cs.AI
cs.IR
Release Date: December 10, 2024
Authors: Linke Ouyang¹, Yuan Qu, Hongbin Zhou, Jiawei Zhu, Rui Zhang, Qunshu Lin, Bin Wang, Zhiyuan Zhao, Man Jiang, Xiaomeng Zhao, Jin Shi, Fan Wu, Pei Chu, Minghao Liu, Zhenxiang Li, Chao Xu, Bo Zhang, Botian Shi, Zhongying Tu, Conghui He
Affiliation: ¹Shanghai AI Laboratory

| Method Type | Methods | Text Edit (EN) | Text Edit (ZH) | Formula Edit (EN) | Formula Edit (ZH) | Formula CDM (EN) | Formula CDM (ZH) | Table TEDS (EN) | Table TEDS (ZH) | Table Edit (EN) | Table Edit (ZH) | Read Order Edit (EN) | Read Order Edit (ZH) | Overall Edit (EN) | Overall Edit (ZH) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Pipeline Tools | MinerU | 0.058 | 0.211 | 0.278 | 0.577 | 66.9 | 49.5 | 79.4 | 62.7 | 0.305 | 0.461 | 0.079 | 0.288 | 0.180 | 0.384 |
| Pipeline Tools | Marker | 0.141 | 0.303 | 0.667 | 0.868 | 18.4 | 12.7 | 54.0 | 45.8 | 0.718 | 0.763 | 0.138 | 0.306 | 0.416 | 0.560 |
| Pipeline Tools | Mathpix | 0.101 | 0.358 | 0.306 | 0.454 | 71.4 | 72.7 | 77.9 | 68.2 | 0.322 | 0.416 | 0.105 | 0.275 | 0.209 | 0.376 |
| Expert VLMs | GOT-OCR | 0.187 | 0.315 | 0.360 | 0.528 | 81.8 | 51.4 | 53.5 | 48.0 | 0.521 | 0.594 | 0.141 | 0.28 | 0.302 | 0.429 |
| Expert VLMs | Nougat | 0.365 | 0.998 | 0.488 | 0.941 | 17.4 | 16.9 | 40.3 | 0.0 | 0.622 | 1.000 | 0.382 | 0.954 | 0.464 | 0.973 |
| General VLMs | GPT4o | 0.144 | 0.409 | 0.425 | 0.606 | 76.4 | 48.2 | 72.8 | 63.7 | 0.363 | 0.474 | 0.128 | 0.251 | 0.265 | 0.435 |
| General VLMs | Qwen2-VL | 0.252 | 0.251 | 0.468 | 0.572 | 54.9 | 60.9 | 59.9 | 66.8 | 0.591 | 0.587 | 0.255 | 0.223 | 0.392 | 0.408 |
| General VLMs | InternVL2 | 0.353 | 0.290 | 0.543 | 0.701 | 69.8 | 49.6 | 63.8 | 61.1 | 0.616 | 0.638 | 0.317 | 0.228 | 0.457 | 0.464 |
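
The Edit columns in the table above appear to be normalized edit distances between a tool's output and the ground truth, where 0 is a perfect match and 1 means nothing matches (Nougat's ZH table score of 1.000 alongside a TEDS of 0.0 fits this reading). The sketch below is one plausible way to compute such a score, assuming a character-level Levenshtein distance divided by the length of the longer string. The benchmark's actual normalization and text preprocessing are not stated here, and the function names `levenshtein` and `normalized_edit_distance` are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insertions, deletions, substitutions)."""
    # Keep the inner loop over the shorter string to limit memory use.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def normalized_edit_distance(prediction: str, reference: str) -> float:
    """Edit distance scaled to [0, 1]; assumed normalization, lower is better."""
    longest = max(len(prediction), len(reference))
    if longest == 0:
        return 0.0  # two empty strings are treated as a perfect match
    return levenshtein(prediction, reference) / longest
```

Under this assumed definition, an empty prediction against a non-empty reference scores 1.0, and identical strings score 0.0.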