Published on November 18

Topology-aware Preemptive Scheduling for Co-located LLM Workloads

cs.DC
cs.AI

Release Date: November 18, 2024

Authors: Ping Zhang¹, Lei Su¹, Jinjie Yang¹, Xin Chen¹

Affiliation: ¹Baichuan-Inc

arXiv: http://arxiv.org/abs/2411.11560v1