notesum.ai

Published on October 31

ALISE: Accelerating Large Language Model Serving with Speculative Scheduling

cs.PF
cs.AI

Released Date: October 31, 2024

Authors: Youpeng Zhao (1), Jun Wang (1)

Aff.: (1) University of Central Florida, Orlando, FL, USA

arXiv: http://arxiv.org/abs/2410.23537v1