notesum.ai

Published on November 4

Enhancing Multiple Dimensions of Trustworthiness in LLMs via Sparse Activation Control

cs.CL
cs.AI

Release Date: November 4, 2024

Authors: Yuxin Xiao (1), Chaoqun Wan (2), Yonggang Zhang (3), Wenxiao Wang (4), Binbin Lin (5), Xiaofei He (6), Xu Shen (2), Jieping Ye (2)

Affiliations: (1) State Key Lab of CAD&CG, Zhejiang University; (2) Alibaba Cloud; (3) Hong Kong Baptist University; (4) School of Software Technology, Zhejiang University; (5) Zhiyuan Research Institute; (6) Fabu Inc.

arXiv: http://arxiv.org/abs/2411.02461v1