Published on November 12
Likelihood as a Performance Gauge for Retrieval-Augmented Generation
cs.CL
cs.AI
cs.LG
Released Date: November 12, 2024
Authors: Tianyu Liu¹, Jirui Qi², Paul He³, Arianna Bisazza², Mrinmaya Sachan¹, Ryan Cotterell¹
Affiliations: ¹ETH Zürich; ²University of Groningen; ³University of Toronto

| #Doc | | Mistral-7B-Inst | LLaMA-3-8B | LLaMA-3.1-8B | LLaMA-3-8B-Inst | LLaMA-3.1-8B-Inst | MPT-7B-8K-Inst |
|---|---|---|---|---|---|---|---|
| NQ-Open | | | | | | | |
| 10 | Highest | 68.69 (-2.52) | 54.04 (-1.84) | 56.72 (-2.41) | 71.58 (-1.80) | 66.13 (-2.16) | 48.93 (-2.80) |
| 10 | Lowest | 66.98 (-2.89) | 49.30 (-2.03) | 53.29 (-2.72) | 71.29 (-2.01) | 65.70 (-2.43) | 46.97 (-3.38) |
| 20 | Highest | 64.86 (-2.45) | 52.05 (-1.99) | 52.50 (-2.40) | 69.00 (-1.83) | 62.97 (-2.21) | 42.25 (-2.70) |
| 20 | Lowest | 62.60 (-2.83) | 46.91 (-2.03) | 48.51 (-2.72) | 67.68 (-2.01) | 61.05 (-2.43) | 42.09 (-3.23) |
| 30 | Highest | 57.70 (-2.52) | 50.30 (-1.88) | 50.00 (-2.60) | 64.36 (-1.84) | 60.95 (-2.41) | 39.31 (-2.56) |
| 30 | Lowest | 53.96 (-2.92) | 45.27 (-2.03) | 46.42 (-2.83) | 65.12 (-2.03) | 59.55 (-2.65) | 39.12 (-3.05) |