notesum.ai
'No' Matters: Out-of-Distribution Detection in Multimodality Long Dialogue
Categories: cs.CL, cs.AI, cs.LG, cs.MM
Released Date: October 31, 2024
Authors: Rena Gao1, Xuetong Wu1, Siwen Luo2, Caren Han1, Feng Liu1
Aff.: 1The University of Melbourne; 2The University of Western Australia

Each cell reports FPR95 / AUROC / AUPR.

| OOD Score | Aggregation | Baseline w/ OOD Scores (Image) | Baseline w/ OOD Scores (Dialogue) | DIAEF w/ OOD Scores |
| --- | --- | --- | --- | --- |
| MSP | Max | 84.4 / 64.8 / 49.0 | 76.9 / 66.5 / 48.8 | 73.4 / 73.2 / 53.5 |
| Prob | Max | 60.0 / 75.6 / 57.9 | 67.9 / 73.5 / 56.1 | 55.3 / 78.8 / 57.9 |
| Prob | Sum | 70.7 / 68.3 / 49.0 | 91.9 / 62.3 / 45.7 | 72.8 / 73.6 / 56.6 |
| Logits | Max | 60.0 / 75.6 / 57.9 | 67.9 / 73.5 / 56.1 | 57.2 / 82.6 / 72.7 |
| Logits | Sum | 91.2 / 59.2 / 43.6 | 98.6 / 44.1 / 36.0 | 97.2 / 49.9 / 37.4 |
| ODIN | Max | 59.1 / 75.4 / 57.6 | 72.1 / 73.2 / 55.5 | 59.6 / 78.9 / 58.8 |
| ODIN | Sum | 71.2 / 68.0 / 48.8 | 91.9 / 61.6 / 45.2 | 73.0 / 73.2 / 56.0 |
| Mahalanobis | Max | 49.2 / 81.3 / 62.9 | 66.0 / 75.8 / 56.8 | 49.7 / 83.2 / 67.1 |
| Mahalanobis | Sum | 88.5 / 75.5 / 57.5 | 78.6 / 68.6 / 50.0 | 75.0 / 76.2 / 60.2 |
| JointEnergy | Max | 60.0 / 75.6 / 57.9 | 67.9 / 73.5 / 56.1 | 57.6 / 82.5 / 72.6 |
| JointEnergy | Sum | 58.3 / 75.8 / 58.0 | 67.0 / 74.1 / 57.1 | 55.9 / 82.3 / 72.2 |
| Average | Max | 62.1 / 74.7 / 57.2 | 69.8 / 72.7 / 54.9 | 58.8 / 79.9 / 63.8 |
| Average | Sum | 76.0 / 69.4 / 51.4 | 85.6 / 62.1 / 46.8 | 74.8 / 71.0 / 56.5 |
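
For readers unfamiliar with the three metrics in the table above, the following is a minimal sketch of how FPR95, AUROC, and AUPR can be computed from per-sample detection scores. It is not the authors' evaluation code. The function names, the toy scores, and the choice to treat in-distribution samples as the positive class (higher score means "more in-distribution") are all assumptions made for illustration; the page does not state which convention the paper uses.

```python
import numpy as np

def auroc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = np.asarray(pos_scores, dtype=float)[:, None]
    neg = np.asarray(neg_scores, dtype=float)[None, :]
    return float(np.mean((pos > neg) + 0.5 * (pos == neg)))

def fpr_at_95_tpr(pos_scores, neg_scores):
    """False positive rate at a threshold that keeps at least 95% of positives."""
    pos = np.sort(np.asarray(pos_scores, dtype=float))
    neg = np.asarray(neg_scores, dtype=float)
    # Dropping the lowest len(pos) // 20 positives leaves >= 95% of them above the threshold.
    threshold = pos[len(pos) // 20]
    return float(np.mean(neg >= threshold))

def aupr(pos_scores, neg_scores):
    """Average precision: mean of the precision measured at the rank of each positive.

    Tied scores are ordered arbitrarily (stable sort), which is a simplification.
    """
    scores = np.concatenate([np.asarray(pos_scores, dtype=float),
                             np.asarray(neg_scores, dtype=float)])
    labels = np.concatenate([np.ones(len(pos_scores)), np.zeros(len(neg_scores))])
    labels = labels[np.argsort(-scores, kind="stable")]
    precision = np.cumsum(labels) / np.arange(1, len(labels) + 1)
    return float(np.sum(precision * labels) / labels.sum())

# Hypothetical toy scores, not taken from the paper.
id_scores = [0.9, 0.8, 0.7, 0.6]     # assumed in-distribution (positive) samples
ood_scores = [0.65, 0.3, 0.2, 0.1]   # assumed out-of-distribution (negative) samples

print("FPR95:", 100 * fpr_at_95_tpr(id_scores, ood_scores))
print("AUROC:", 100 * auroc(id_scores, ood_scores))
print("AUPR: ", 100 * aupr(id_scores, ood_scores))
```

Under this convention a lower FPR95 is better, while higher AUROC and AUPR are better. That is the direction in which the table should be read when comparing the baseline columns with DIAEF.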