notesum.ai

Published on November 27

FastSwitch: Optimizing Context Switching Efficiency in Fairness-aware Large Language Model Serving

cs.LG
cs.DC

Release Date: November 27, 2024

Authors: Ao Shen (1), Zhiyao Li (2), Mingyu Gao (3)

Affiliations: (1) Purdue University, West Lafayette, USA; (2) Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China; (3) Shanghai Qi Zhi Institute, Shanghai, China

arXiv: http://arxiv.org/abs/2411.18424v1