Published on November 4
Regress, Don't Guess -- A Regression-like Loss on Number Tokens for Language Models
Categories: cs.CL, cs.AI, cs.CE, cs.LG
Release Date: November 4, 2024
Authors: Jonas Zausinger¹, Lars Pennig¹, Kacper Chlodny¹, Vincent Limbach¹, Anna Ketteler¹, Thorben Prein¹, Vishwa Mohan Singh², Michael Morris Danziger³, Jannis Born³
Affiliations: ¹ TU Munich, Germany; TUM.AI, Germany; ² TUM.AI, Germany; LMU Munich, Germany; ³ IBM Research Europe, Switzerland

| Model | Accuracy | MAE | R² |
|---|---|---|---|
| Standard T5 | .6448 | .1303 | .9688 |
| Standard + NTL-MSE | .7189 | .1091 | .9739 |
| Standard + NTL-WAS | .7460 | .0980 | .9766 |
| RT | .7136 | .1135 | .9701 |
| RT + NTL-MSE | .6990 | .1291 | .9580 |
| xVal | .0000 | .2581 | .9735 |
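
The three columns above can be read more easily with a small worked example. The following is a minimal sketch of how such metrics might be computed for numeric predictions. It assumes accuracy means exact match between predicted and target numbers, which the table itself does not state; the function names and the toy data are hypothetical and are not taken from the paper.

```python
def accuracy(y_true, y_pred, tol=0.0):
    """Fraction of predictions within `tol` of the target (exact match by default).

    Assumption: exact-match accuracy; the source does not define the metric.
    """
    hits = sum(abs(t - p) <= tol for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data, for illustration only.
targets = [1.0, 2.0, 3.0, 4.0]
preds = [1.0, 2.5, 3.0, 3.5]          # two exact hits, two misses
near_miss = [1.1, 2.1, 2.9, 4.1]      # never exact, always close

print(accuracy(targets, preds), mae(targets, preds), r2(targets, preds))
print(accuracy(targets, near_miss), mae(targets, near_miss), r2(targets, near_miss))
```

Under these assumptions, the `near_miss` predictions score zero accuracy while keeping a high R², which shows that the two metrics can diverge sharply, as they do in the xVal row.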