
Published on November 4

Categories: cs.LG, cs.AI, cs.CY

Release Date: November 4, 2024

Authors: Yung-Chen Tang¹, Pin-Yu Chen², Tsung-Yi Ho¹

Affiliations: ¹Department of Computer Science and Engineering, The Chinese University of Hong Kong; ²IBM Research

Model	Self-Assurance	Avoid-Collision	Regulatory Compliance	Code Fidelity	Instruction Understanding	Utility
GPT-3.5-turbo	0.1250	0.1366	0.3125	0.9851	0.9893	0.9375
Gemini Pro	0.2500	0.3255	0.4375	0.9901	0.9813	0.8125
Llama 2 7B Chat	0.3284	0.9884	0.9062	0.6642	0.6941	0.1875
CodeLlama-7B-Instruct	0.5465	0.9912	0.7812	0.9019	0.6436	0.5625
Meta-Llama3-8B-Instruct	0.2732	0.6569	0.6562	0.9632	0.6702	0.7500
Mistral-7B-Instruct-v0.2	0.2645	0.5494	0.6875	0.8529	0.8191	0.5625
CodeQwen1.5-7B-Chat	0.2500	0.4447	0.4687	1.0000	0.8191	0.8125