IdentifyMe: A Challenging Long-Context Mention Resolution Benchmark
Categories: cs.CL, cs.AI, cs.LG; ACM class: I.2.7
Release Date: November 12, 2024
Authors: Kawshik Manikantan¹, Makarand Tapaswi¹, Vineet Gandhi¹, Shubham Toshniwal²
Affiliations: ¹CVIT, IIIT Hyderabad; ²NVIDIA

| Model | w/o CoT | w/ CoT |
|---|---|---|
| Mistral-7B | 55.3 | 53.8 |
| Llama-3.1-8B | 50.2 | 59.7 |
| GPT-4o-mini | 63.3 | 67.0 |