VCBench: A Controllable Benchmark for Symbolic and Abstract Challenges in Video Cognition
cs.CV
cs.AI
Release Date: November 14, 2024
Authors: Chenglin Li, Qianglong Chen, Zhi Li, Feng Tao, Yin Zhang

Task categories: OP, AP, TR, SR, GP, FP (spanning scenarios S1 to S10).

| Method | S1 | S2 | S3 | S4 | S5 | S6 | S7 | S8 | S9 | S10 | Avg. |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Random | 33.2 | 34.0 | 37.1 | 30.3 | 32.7 | 23.9 | 25.0 | 28.2 | 37.6 | 33.9 | 31.6 |
| MiniCPM-V | 28.2 | 49.5 | 39.3 | 47.8 | 32.2 | 34.7 | 28.9 | 26.7 | 46.0 | 54.4 | 38.8 |
| Video-LLaMA2 | 31.3 | 50.5 | 33.5 | 48.3 | 36.4 | 18.7 | 26.7 | 27.8 | 52.0 | 52.2 | 37.7 |
| InternVideo2 | 31.3 | 50.5 | 33.5 | 48.3 | 36.4 | 18.7 | 26.7 | 27.8 | 52.0 | 52.2 | 37.7 |
| Video-LLaVA | 40.4 | 21.0 | 40.8 | 23.2 | 37.5 | 21.3 | 16.7 | 25.5 | 38.0 | 60.0 | 32.4 |
| LLaVA-NEXT-Video-7B | 20.4 | 22.5 | 30.7 | 21.0 | 33.8 | 18.7 | 12.2 | 14.4 | 15.3 | 46.7 | 23.6 |
| LLaVA-NEXT-Video-34B | 28.4 | 42.0 | 42.7 | 39.0 | 22.9 | 58.0 | 37.8 | 12.2 | 33.3 | 58.9 | 37.5 |
| InternLM-XComposer-2.5 | 36.0 | 38.2 | 45.5 | 44.5 | 43.1 | 20.0 | 8.9 | 25.6 | 35.3 | 61.1 | 35.8 |
| Qwen2-VL-2B | 32.7 | 39.8 | 29.8 | 37.8 | 30.7 | 28.0 | 32.2 | 20.0 | 34.7 | 33.3 | 31.9 |
| Qwen2-VL-7B | 41.6 | 45.7 | 42.8 | 50.5 | 38.2 | 36.0 | 36.7 | 34.4 | 46.7 | 52.2 | 42.5 |
| Qwen2-VL-72B | 51.8 | 58.2 | 60.0 | 56.8 | 60.7 | 42.0 | 32.2 | 37.8 | 62.0 | 76.7 | 53.7 |
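The Avg. column looks like the unweighted mean of the ten scenario scores S1 to S10. The sketch below recomputes it for a few rows copied from the table as a sanity check. The helper names (`unweighted_mean`, `compare`) are illustrative and do not come from the source, and the assumption that Avg. is a plain mean is an inference from the numbers, not something the source states. For most rows the recomputed value matches after rounding to one decimal; for Qwen2-VL-72B it differs from the reported 53.7 by roughly 0.1, which this sketch cannot explain.

```python
# Scores copied from the table above: ten scenario columns (S1..S10),
# followed by the reported Avg. value for that method.
ROWS = {
    "Random": ([33.2, 34.0, 37.1, 30.3, 32.7, 23.9, 25.0, 28.2, 37.6, 33.9], 31.6),
    "MiniCPM-V": ([28.2, 49.5, 39.3, 47.8, 32.2, 34.7, 28.9, 26.7, 46.0, 54.4], 38.8),
    "Qwen2-VL-7B": ([41.6, 45.7, 42.8, 50.5, 38.2, 36.0, 36.7, 34.4, 46.7, 52.2], 42.5),
    "Qwen2-VL-72B": ([51.8, 58.2, 60.0, 56.8, 60.7, 42.0, 32.2, 37.8, 62.0, 76.7], 53.7),
}


def unweighted_mean(scores):
    """Plain arithmetic mean; assumes every scenario carries equal weight."""
    return sum(scores) / len(scores)


def compare(rows):
    """Map each method to (recomputed mean, reported Avg., absolute difference)."""
    result = {}
    for method, (scores, reported) in rows.items():
        mean = unweighted_mean(scores)
        result[method] = (mean, reported, abs(mean - reported))
    return result


report = compare(ROWS)
for method, (mean, reported, diff) in report.items():
    print(f"{method}: recomputed {mean:.2f}, reported {reported}, difference {diff:.2f}")
```

The same check extends to the remaining rows by adding them to `ROWS`. A weighted average (for example, by the number of samples per scenario) would need per-scenario counts, which the table does not provide.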