Confidence Calibration of Classifiers with Many Classes
cs.LG
cs.AI
Release date: November 5, 2024
Authors: Adrien Le Coz, Stéphane Herbin (1), Faouzi Adjed (2)
Aff.: (1) ONERA - DTIS; (2) IRT SystemX

| | | | | | scaling methods | | | | | | binary methods | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Dataset | Models | Uncal. | IRM | I-Max | TS | TS_TvA | VS | VSreg_TvA | DC | DCreg_TvA | Iso | Iso_TvA | BBQ | BBQ_TvA | HB | HB_TvA |
| C10 | ConvNets | 1.77 | 0.67 | 0.61 | 1.20 | 1.25 | 1.17 | 1.17 | 1.16 | 1.17 | 1.12 | 0.68 | 1.17 | 0.77 | 1.06 | 0.43 |
| | CLIP | 5.03 | 1.03 | 0.94 | 0.78 | 0.73 | 2.56 | 1.85 | 2.56 | 1.84 | 1.05 | 0.86 | 1.39 | 0.86 | 1.82 | 0.73 |
| C100 | ConvNets | 6.04 | 1.30 | 1.15 | 4.65 | 2.96 | 4.91 | 2.35 | 4.89 | 2.35 | 5.33 | 1.38 | 9.63 | 1.35 | 9.56 | 1.02 |
| | CLIP | 10.37 | 2.90 | 2.57 | 2.54 | 2.51 | 7.78 | 2.86 | 7.55 | 1.84 | 2.53 | 1.61 | 7.48 | 1.48 | 7.23 | 1.39 |
| IN | ResNet | 15.26 | 1.31 | 1.07 | 2.65 | 1.89 | 2.77 | 1.67 | 3.59 | 2.23 | 3.05 | 0.79 | 8.41 | 0.76 | 7.49 | 0.55 |
| | EffNet | 15.72 | 0.68 | 0.48 | 3.48 | 2.59 | 3.67 | 1.26 | 3.65 | 1.23 | 2.83 | 0.68 | 6.55 | 0.64 | 4.39 | 0.43 |
| | ConvNeXt | 16.46 | 0.82 | 0.58 | 3.67 | 2.25 | 4.05 | 1.37 | 4.04 | 1.35 | 2.97 | 0.75 | 7.41 | 0.68 | 5.13 | 0.52 |
| | ViT | 4.40 | 0.81 | 0.61 | 4.09 | 2.96 | 4.31 | 2.02 | 4.31 | 1.99 | 3.60 | 0.77 | 6.64 | 0.73 | 6.59 | 0.52 |
| | Swin | 5.85 | 0.75 | 0.49 | 3.63 | 2.91 | 4.04 | 1.70 | 4.03 | 1.67 | 3.19 | 0.74 | 7.09 | 0.71 | 5.39 | 0.48 |
| | CLIP | 1.96 | 1.08 | 0.72 | 1.89 | 1.82 | 1.63 | 1.05 | 32.03 | 67.65 | 2.35 | 0.92 | 8.31 | 0.93 | 7.16 | 0.80 |
| IN21K | MN3 | 12.34 | err. | err. | 8.69 | 4.39 | 2.52 | 2.40 | 58.84 | 81.16 | 2.00 | 0.21 | err. | 0.20 | 5.50 | 0.17 |
| | ViT-B/16 | 6.27 | err. | err. | 8.92 | 6.55 | 2.38 | 1.54 | 8.22 | 3.20 | 2.14 | 0.22 | err. | 0.24 | 7.89 | 0.12 |
| AFF | T5 | 5.47 | 0.27 | 0.26 | 1.10 | 1.15 | 1.52 | 1.42 | 1.18 | 1.31 | 0.37 | 0.27 | 0.39 | 0.28 | 2.87 | 0.17 |
| | RoBERTa | 7.37 | 0.30 | 0.28 | 2.40 | 2.33 | 1.41 | 1.85 | 1.38 | 1.68 | 0.52 | 0.27 | 0.75 | 0.35 | 4.02 | 0.20 |
| DS | T5 | 8.86 | 1.39 | 1.38 | 2.19 | 2.17 | 6.13 | 2.00 | 5.91 | 2.02 | 1.50 | 1.55 | 1.38 | 1.58 | 1.90 | 1.12 |
| | RoBERTa | 16.12 | 1.56 | 1.50 | 12.07 | 12.07 | 14.66 | 6.80 | 13.90 | 5.57 | 1.71 | 1.53 | 1.64 | 1.14 | 1.05 | 0.91 |
| MNLI | T5 | 7.04 | 0.72 | 0.70 | 2.81 | 2.80 | 4.46 | 1.79 | 4.31 | 1.82 | 0.80 | 0.74 | 1.38 | 0.69 | 2.09 | 0.43 |
| | RoBERTa | 9.22 | 0.89 | 0.71 | 5.72 | 5.72 | 6.99 | 1.92 | 6.59 | 1.99 | 1.00 | 0.92 | 1.67 | 0.84 | 1.02 | 0.60 |
| YA | T5 | 7.84 | 0.80 | 0.81 | 1.07 | 1.35 | 3.70 | 1.16 | 3.75 | 1.15 | 1.73 | 0.82 | 2.81 | 0.96 | 3.65 | 0.69 |
| | RoBERTa | 19.59 | 0.97 | 0.79 | 12.39 | 12.38 | 16.47 | 2.52 | 16.07 | 2.21 | 1.92 | 0.99 | 5.00 | 0.75 | 3.41 | 0.58 |
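
One way to read the table above is to compare each base method with its TvA variant on a single row. The sketch below does this for the IN / ResNet row, using values copied from the table. It assumes that lower scores are better, as the large uncalibrated values suggest. The helper name `relative_reduction` and the dictionary layout are illustrative choices, not part of the source. Note that for VS and DC the paired columns are the regularized variants (VSreg_TvA and DCreg_TvA), so those two comparisons mix in the effect of regularization.

```python
# Values copied from the IN / ResNet row of the table above.
# Each entry maps a base method to (base score, score of its TvA variant).
# For VS and DC the variant columns are VSreg_TvA and DCreg_TvA.
resnet_imagenet = {
    "TS": (2.65, 1.89),
    "VS": (2.77, 1.67),
    "DC": (3.59, 2.23),
    "Iso": (3.05, 0.79),
    "BBQ": (8.41, 0.76),
    "HB": (7.49, 0.55),
}


def relative_reduction(base: float, tva: float) -> float:
    """Fraction by which the TvA variant lowers the base score.

    Assumes lower is better; a negative result means the variant is worse.
    """
    return (base - tva) / base


reductions = {
    method: relative_reduction(base, tva)
    for method, (base, tva) in resnet_imagenet.items()
}

# Print methods from largest to smallest relative reduction.
for method, r in sorted(reductions.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{method:>3}: {r:.1%} lower with the TvA variant")
```

On this row the binary methods (Iso, BBQ, HB) show much larger relative reductions than the scaling methods (TS, VS, DC). The same comparison can be repeated for any other row by swapping in its values.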