In-Context Learning with Noisy Labels
cs.CL
Release Date: November 29, 2024
Authors: Junyong Kang¹, Donghyun Son, Hwanjun Song², Buru Chang
Affiliations: ¹KAIST, Seoul, South Korea; ²KAIST, Daejeon, South Korea

| Model | Retriever | Method | MRPC 0 | MRPC 0.1 | MRPC 0.2 | MRPC 0.3 | MRPC 0.4 | MRPC 0.5 | SST-5 0 | SST-5 0.1 | SST-5 0.2 | SST-5 0.3 | SST-5 0.4 | SST-5 0.5 | Tweet 0 | Tweet 0.1 | Tweet 0.2 | Tweet 0.3 | Tweet 0.4 | Tweet 0.5 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GPT2-Neo | EPR | w/o manipulation | 78.4 | 76.0 | 75.5 | 67.9 | 47.8 | 51.5 | 46.9 | 44.1 | 41.6 | 38.5 | 36.0 | 31.8 | 58.4 | 59.7 | 55.5 | 54.7 | 53.6 | 47.4 |
| GPT2-Neo | EPR | Correction | 72.1 | | | | | | 38.9 | | | | | | 57.3 | | | | | |
| GPT2-Neo | EPR | Weighting | 78.2 | 77.2 | 74.8 | 70.8 | 53.4 | 58.3 | 48.5 | 46.8 | 44.5 | 42.5 | 39.2 | 35.9 | 58.1 | 61.4 | 57.2 | 55.4 | 54.0 | 46.8 |
| GPT2-Neo | EPR | Reordering | 79.4 | 68.1 | 62.0 | 49.0 | 35.0 | 36.8 | 46.6 | 46.3 | 45.3 | 42.5 | 38.5 | 35.9 | 57.7 | 58.3 | 57.5 | 58.7 | 58.0 | 55.4 |
| GPT2-Neo | EPR | Selection | 56.4 | 49.3 | 43.6 | 40.7 | 35.0 | 36.3 | 45.6 | 44.7 | 44.4 | 43.2 | 43.2 | 38.5 | 57.3 | 57.3 | 57.3 | 57.3 | 57.3 | 56.4 |
| GPT2-Neo | EPR | Rectification | 78.4 | 77.7 | 77.2 | 77.2 | 75.5 | 76.2 | 44.7 | 44.8 | 44.4 | 45.3 | 45.5 | 44.5 | 58.3 | 58.3 | 58.3 | 58.4 | 58.5 | 58.7 |
| GPT2-Neo | TopK-BERT | w/o manipulation | 70.3 | 65.4 | 62.5 | 61.0 | 53.4 | 52.5 | 35.6 | 33.6 | 31.7 | 33.0 | 29.0 | 26.4 | 65.7 | 63.3 | 60.1 | 58.8 | 52.8 | 50.3 |
| GPT2-Neo | TopK-BERT | Correction | 67.0 | | | | | | 33.8 | | | | | | 57.3 | | | | | |
| GPT2-Neo | TopK-BERT | Weighting | 71.6 | 69.1 | 67.2 | 65.4 | 56.4 | 56.4 | 36.9 | 36.5 | 35.5 | 32.7 | 33.9 | 30.8 | 65.3 | 63.8 | 59.6 | 57.0 | 50.2 | 48.7 |
| GPT2-Neo | TopK-BERT | Reordering | 52.7 | 46.6 | 42.2 | 40.2 | 34.6 | 34.3 | 36.4 | 36.1 | 35.1 | 33.8 | 31.9 | 29.1 | 69.7 | 66.1 | 62.7 | 60.4 | 56.4 | 58.5 |
| GPT2-Neo | TopK-BERT | Selection | 44.6 | 41.4 | 40.0 | 35.5 | 33.8 | 32.8 | 34.3 | 33.8 | 34.9 | 33.2 | 32.4 | 32.9 | 61.5 | 60.6 | 59.9 | 59.5 | 56.3 | 57.7 |
| GPT2-Neo | TopK-BERT | Rectification | 71.1 | 71.1 | 70.3 | 71.3 | 68.2 | 67.4 | 36.6 | 36.8 | 36.0 | 35.6 | 36.0 | 36.2 | 64.4 | 64.2 | 63.7 | 63.1 | 63.5 | 64.0 |
| Llama2-7B | EPR | w/o manipulation | 77.7 | 75.5 | 76.5 | 74.3 | 62.3 | 62.3 | 50.2 | 49.3 | 51.0 | 47.6 | 45.3 | 45.4 | 59.5 | 61.5 | 58.6 | 60.0 | 57.5 | 52.9 |
| Llama2-7B | EPR | Correction | 72.5 | | | | | | 43.7 | | | | | | 57.3 | | | | | |
| Llama2-7B | EPR | Rectification | 78.2 | 78.4 | 78.4 | 78.4 | 76.7 | 75.8 | 50.3 | 50.5 | 50.3 | 50.2 | 50.6 | 49.4 | 59.9 | 59.8 | 59.9 | 59.6 | 60.3 | 59.4 |
| Llama2-7B | TopK-BERT | w/o manipulation | 70.3 | 70.6 | 69.1 | 66.9 | 59.1 | 60.0 | 49.8 | 47.2 | 49.0 | 47.3 | 43.5 | 40.4 | 71.4 | 69.7 | 67.5 | 64.7 | 56.9 | 55.8 |
| Llama2-7B | TopK-BERT | Correction | 69.6 | | | | | | 46.3 | | | | | | 57.4 | | | | | |
| Llama2-7B | TopK-BERT | Rectification | 73.5 | 71.6 | 72.3 | 70.6 | 71.3 | 71.1 | 51.2 | 50.8 | 51.1 | 50.2 | 50.6 | 49.4 | 71.7 | 71.7 | 71.3 | 69.7 | 71.3 | 70.3 |
| Mistral-7B | EPR | w/o manipulation | 77.7 | 77.0 | 76.5 | 76.7 | 61.3 | 64.0 | 51.9 | 50.1 | 48.5 | 45.5 | 44.3 | 40.0 | 66.3 | 68.2 | 63.3 | 65.1 | 61.8 | 55.8 |
| Mistral-7B | EPR | Correction | 74.5 | | | | | | 44.8 | | | | | | 57.3 | | | | | |
| Mistral-7B | EPR | Rectification | 78.4 | 77.9 | 77.9 | 78.9 | 77.0 | 77.0 | 49.4 | 49.9 | 49.5 | 49.9 | 49.8 | 49.8 | 66.0 | 65.8 | 65.8 | 65.8 | 65.8 | 65.8 |
| Mistral-7B | TopK-BERT | w/o manipulation | 72.3 | 72.1 | 70.3 | 67.9 | 62.3 | 62.3 | 47.7 | 46.4 | 44.8 | 42.7 | 40.3 | 39.2 | 74.6 | 71.5 | 67.5 | 65.9 | 57.5 | 57.5 |
| Mistral-7B | TopK-BERT | Correction | 71.6 | | | | | | 43.6 | | | | | | 57.2 | | | | | |
| Mistral-7B | TopK-BERT | Rectification | 73.0 | 72.3 | 72.5 | 71.8 | 71.8 | 71.6 | 49.0 | 48.7 | 49.7 | 49.1 | 50.2 | 48.4 | 72.1 | 71.5 | 71.4 | 70.2 | 70.3 | 69.1 |

Note: the number after each dataset name (0 to 0.5) is the column sub-header from the source, presumably the label-noise ratio. Correction rows report a single value per dataset rather than one per sub-column; it is shown in the first column of each dataset group.