Benchmarking Foundation Models on Exceptional Cases: Dataset Creation and Validation
quant-ph
cs.CC
cs.DS
Released Date: October 23, 2024
Authors: Suho Kang¹, Jungyang Park¹, Joonseo Ha¹, SoMin Kim¹, JinHyeong Kim¹, Subeen Park¹, Kyungwoo Song¹
Aff.: ¹Yonsei University

Accuracy by length of article (Q1 to Q5):

| Dataset | Metric | Model | Q1 | Q2 | Q3 | Q4 | Q5 |
|---|---|---|---|---|---|---|---|
| Not The Onion | Acc. (%) | Gemini-1.5-Pro | 56.15 | 80.69 | 86.63 | 91.58 | 92.07 |
| Not The Onion | Acc. (%) | GPT-4o | 84.23 | 90.42 | 91.25 | 96.25 | 96.68 |
| Onion | Acc. (%) | Gemini-1.5-Pro | 96.51 | 96.14 | 100.00 | 100.00 | 100.00 |
| Onion | Acc. (%) | GPT-4o | 91.20 | 96.80 | 100.00 | 100.00 | 100.00 |