Comateformer: Combined Attention Transformer for Semantic Sentence Matching
cs.CL
Released Date: December 10, 2024
Authors: Bo Li¹, Di Liang², Zixin Zhang¹
Affiliations: ¹School of Software, Tsinghua University, Beijing, China; ²Baidu Inc., Beijing, China

| Model | Pre-train | MRPC | QQP | STS-B | MNLI-m/mm | QNLI | RTE | SNLI | SciTail | SICK | TwitterURL | Avg |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BiMPM | ✗ | 79.6 | 85.0 | - | 72.3/72.1 | 81.4 | 56.4 | - | - | - | - | - |
| CAFE | ✗ | 82.4 | 88.0 | - | 78.7/77.9 | 81.5 | 56.8 | 88.5 | 83.3 | 72.3 | - | - |
| ESIM | ✗ | 80.3 | 88.2 | - | 75.8/75.6 | 80.5 | - | 88.0 | 70.6 | 71.8 | - | - |
| Transformer | ✗ | 81.7 | 84.4 | 73.6 | 72.3/71.4 | 80.3 | 58.0 | 84.6 | 72.9 | 70.3 | 68.8 | 74.4 |
| BiLSTM+ELMo+Attn | ✓ | 84.6 | 86.7 | 73.3 | 76.4/76.1 | 79.8 | 56.8 | 89.0 | 85.8 | 78.9 | 81.4 | 78.9 |
| OpenAI GPT | ✓ | 82.3 | 81.3 | 80.0 | 82.1/81.4 | 87.4 | 56.0 | 88.4 | 84.8 | 79.5 | 81.9 | 80.4 |
| UERBERT | ✓ | 88.3 | 90.5 | 85.1 | 84.2/83.5 | 90.6 | 67.1 | 90.8 | 92.2 | 87.8 | 86.2 | 86.0 |
| SemBERT | ✓ | 88.2 | 90.2 | 87.3 | 84.4/84.0 | 90.9 | 69.3 | 90.9 | 92.5 | 87.9 | 86.8 | 86.5 |
| SyntaxBERT | ✓ | 89.2 | 89.6 | 88.1 | 84.9/84.6 | 91.1 | 68.9 | 91.0 | 92.7 | 88.7 | 87.3 | 86.3 |
| DABERT | ✓ | 89.1 | 91.3 | 88.2 | 84.9/84.7 | 91.4 | 69.5 | 91.3 | 93.6 | 88.6 | 87.5 | 86.7 |
| BERT-Base | ✓ | 87.2 | 89.1 | 86.8 | 84.3/83.7 | 90.4 | 67.2 | 90.7 | 91.8 | 87.2 | 84.8 | 85.8 |
| BERT-Base-Comateformer | ✓ | 89.3 | 89.6 | 87.3 | 85.2/84.9 | 91.1 | 68.9 | 91.2 | 92.4 | 88.0 | 86.8 | 86.9 |
| BERT-Large | ✓ | 88.9 | 89.3 | 87.6 | 86.8/86.3 | 92.7 | 70.1 | 91.0 | 94.4 | 91.1 | 91.5 | 88.0 |
| BERT-Large-Comateformer | ✓ | 89.7 | 90.4 | 88.1 | 86.9/86.7 | 93.3 | 72.2 | 91.5 | 94.7 | 91.6 | 92.2 | 88.8 |
| RoBERTa-Base | ✓ | 89.3 | 89.6 | 87.4 | 86.3/86.2 | 92.2 | 73.6 | 90.8 | 92.3 | 87.9 | 85.9 | 87.6 |
| RoBERTa-Base-Comateformer | ✓ | 89.8 | 91.1 | 88.4 | 87.5/87.4 | 93.7 | 82.3 | 91.2 | 93.2 | 89.6 | 87.7 | 89.2 |
| RoBERTa-Large | ✓ | 89.4 | 89.7 | 90.2 | 89.5/89.3 | 92.7 | 83.8 | 91.2 | 94.3 | 91.2 | 91.9 | 90.3 |
| RoBERTa-Large-Comateformer | ✓ | 90.3 | 91.4 | 90.9 | 90.1/89.8 | 94.2 | 84.4 | 91.7 | 94.6 | 91.2 | 92.2 | 90.9 |