ScribeAgent: Towards Specialized Web Agents Using Production-Scale Workflow Data
Categories: cs.CL, cs.AI
Released Date: November 22, 2024
Authors: Junhong Shen¹, Atishay Jain², Zedian Xiao², Ishan Amlekar², Mouad Hadji², Aaron Podolny², Ameet Talwalkar¹
Affiliations: ¹Carnegie Mellon University; ²Scribe
![[Uncaptioned image]](https://arxiv.org/html/2411.15004v1/extracted/6011366/figures/scs.png)
| Model | # Params | Before Fine-Tuning: EM (%) | Before Fine-Tuning: Calibrated EM (%) | After Fine-Tuning: EM (%) | After Fine-Tuning: Calibrated EM (%) |
|---|---|---|---|---|---|
| Mistral-7B-Instruct-v0.3 | 7B | 3.89 | 5.13 | 19.92 | 26.31 |
| Qwen2-7B-Instruct | 7B | 6.06 | 7.92 | 29.34 | 38.72 |
| Llama-3.1-Instruct-8B | 8B | 1.42 | 1.88 | 28.34 | 37.42 |
| Qwen2.5-14B-Instruct | 14B | 8.79 | 11.60 | 31.76 | 41.89 |
| Codestral-22B-v0.1 | 22B | 4.53 | 6.08 | 31.11 | 41.25 |
| Qwen2.5-32B-Instruct | 32B | 10.02 | 13.21 | 32.98 | 43.51 |
| Mixtral-8x7B-Instruct-v0.1 | 56B-A12B | 7.35 | 9.82 | 28.38 | 37.49 |
| Qwen2-57B-A14B-Instruct | 57B-A14B | 5.72 | 7.51 | 31.02 | 40.10 |
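
As a reading aid, the table's before/after EM columns can be turned into absolute gains with a few lines of Python. This is only an illustrative sketch: the EM values are copied from the table, while the names `em_scores`, `em_gain`, `gains`, `largest_gain`, and `best_after` are hypothetical helpers introduced here and are not part of the paper.

```python
# EM (%) before and after fine-tuning, copied from the table above.
em_scores = {
    "Mistral-7B-Instruct-v0.3": (3.89, 19.92),
    "Qwen2-7B-Instruct": (6.06, 29.34),
    "Llama-3.1-Instruct-8B": (1.42, 28.34),
    "Qwen2.5-14B-Instruct": (8.79, 31.76),
    "Codestral-22B-v0.1": (4.53, 31.11),
    "Qwen2.5-32B-Instruct": (10.02, 32.98),
    "Mixtral-8x7B-Instruct-v0.1": (7.35, 28.38),
    "Qwen2-57B-A14B-Instruct": (5.72, 31.02),
}


def em_gain(before: float, after: float) -> float:
    """Absolute EM improvement in percentage points (hypothetical helper)."""
    return round(after - before, 2)


# Gain per model, plus the two models that stand out in the table.
gains = {model: em_gain(b, a) for model, (b, a) in em_scores.items()}
largest_gain = max(gains, key=gains.get)
best_after = max(em_scores, key=lambda m: em_scores[m][1])

# Print models from largest to smallest gain.
for model, gain in sorted(gains.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{model}: +{gain:.2f} points")
```

On these numbers, Llama-3.1-Instruct-8B shows the largest absolute gain (26.92 points), while Qwen2.5-32B-Instruct reaches the highest EM after fine-tuning (32.98%).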