Deploying Multi-task Online Server with Large Language Model
Categories: cs.CL, cs.AI
Released Date: November 6, 2024
Authors: Yincen Qu¹, Chao Ma¹, Xiangying Dai¹, Hui Zhou¹, Yiting Wu¹, Hengyue Liu²
Affiliations: ¹Trip.com Group, Shanghai, China; ²Independent Researcher
| Models | Methods | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| LLaMA | Single-task | 70.22 | 58.71 | 87.06 | 73.98 | 58.39 | 79.23 | 71.26 | 6 | 100% |
| | Few-shot | 65.07 | 13.82 | 62.10 | 46.80 | 14.57 | 54.47 | 42.81 | 0 | - |
| | Instance-balanced | 68.75 | 56.20 | 85.02 | 73.52 | 59.13 | 80.05 | 70.44 | 3 | 33.3% |
| | Class-balanced | 69.12 | 57.39 | 83.34 | 74.05 | 59.33 | 80.66 | 70.64 | 3 | 33.3% |
| | UniMax | 68.01 | 56.55 | 84.65 | 74.75 | 57.56 | 82.42 | 70.65 | 2 | 50.0% |
| | ours | 70.06 | 57.31 | 87.51 | 74.68 | 58.79 | 80.83 | 71.53 | 5 | 20.0% |
| | ours (w/o 2-stage) | 70.22 | 56.32 | 87.03 | 73.03 | 60.11 | 81.91 | 71.76 | 4 | 25.0% |
| Qwen | Single-task | 71.69 | 60.16 | 83.54 | 74.12 | 58.31 | 86.52 | 72.39 | 6 | 100% |
| | Few-shot | 65.44 | 22.84 | 66.62 | 53.54 | 17.63 | 73.00 | 49.85 | 0 | - |
| | Instance-balanced | 68.75 | 59.51 | 82.81 | 74.44 | 59.56 | 82.76 | 71.30 | 3 | 33.3% |
| | Class-balanced | 71.69 | 58.20 | 85.56 | 74.12 | 58.54 | 80.01 | 71.35 | 4 | 25.0% |
| | UniMax | 70.59 | 59.63 | 82.74 | 74.14 | 59.68 | 83.37 | 71.69 | 4 | 25.0% |
| | ours | 71.32 | 59.59 | 86.03 | 74.56 | 58.86 | 83.67 | 72.33 | 5 | 20.0% |
| | ours (w/o 2-stage) | 71.32 | 58.32 | 86.60 | 74.18 | 59.36 | 82.99 | 72.13 | 4 | 25.0% |