notesum.ai
Published on November 15
Legal Evaluations and Challenges of Large Language Models
Categories: cs.CL, cs.AI
Release Date: November 15, 2024
Authors: Jiaqi Wang, Huan Zhao, Zhenyuan Yang, Peng Shu, Junhao Chen, Haobo Sun, Ruixi Liang, Shixin Li, Pengcheng Shi, Longjun Ma, Zongjia Liu, Zhengliang Liu, Tianyang Zhong, Yutong Zhang, Chong Ma, Xin Zhang, Tuo Zhang, Tianli Ding, Yudan Ren, Tianming Liu, Xi Jiang, Shu Zhang
| Model | Chinese_ROUGE-1 | Chinese_ROUGE-2 | Chinese_ROUGE-L | Chinese_BLEU | Chinese_Evaluation |
|---|---|---|---|---|---|
| Gemma2-9B | 0.39 | 0.15 | 0.39 | 0.03 | 3.00 |
| GLM-4-9B-chat | 0.29 | 0.16 | 0.24 | 0.00 | 3.15 |
| GPT-4o | 0.13 | 0.01 | 0.10 | 0.00 | 3.85 |
| LawGPT_zh | 0.27 | 0.08 | 0.16 | 0.04 | 1.85 |
| lawyer-llama-13b-v2 | 0.32 | 0.19 | 0.32 | 0.05 | 2.92 |
| llama3.2-3B-instruct | 0.30 | 0.11 | 0.15 | 0.04 | 1.62 |
| Mistral-7B-instruct-v0.3 | 0.38 | 0.15 | 0.20 | 0.07 | 2.54 |
| O1-preview | 0.13 | 0.02 | 0.09 | 0.00 | 3.85 |
| Phi-3.5-mini-instruct | 0.38 | 0.13 | 0.38 | 0.03 | 2.15 |
| Qwen2-7B-Instruct | 0.27 | 0.16 | 0.23 | 0.00 | 3.85 |
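
For readers unfamiliar with the overlap metrics in the table, the following is a minimal sketch of how a ROUGE-1 score can be computed for Chinese text. It is not the evaluation code behind these numbers: the tokenization (one character per token), the choice of the F1 variant, and the function name `rouge_1_f1` are all assumptions made for illustration, and the example sentences are invented.

```python
from collections import Counter


def rouge_1_f1(candidate: str, reference: str) -> float:
    """Character-level ROUGE-1 F1 between a candidate and a reference.

    Assumption: each non-space character is one token. Real evaluations
    often run a Chinese word segmenter first, which changes the scores.
    """
    cand_counts = Counter(candidate.replace(" ", ""))
    ref_counts = Counter(reference.replace(" ", ""))

    # Clipped overlap: each character counts at most as often as it
    # appears in both texts (Counter & Counter takes the minimum).
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical model output and reference answer, for illustration only.
    model_answer = "被告应当赔偿损失"
    gold_answer = "被告应赔偿原告损失"
    print(f"ROUGE-1 F1: {rouge_1_f1(model_answer, gold_answer):.2f}")
```

Because a score like this rewards only surface overlap with the reference, an answer that is correct but worded differently can score low, which is one possible reading of rows where ROUGE is low and the Chinese_Evaluation column is high.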