Published on November 4

Context Parallelism for Scalable Million-Token Inference

cs.DC
cs.AI
cs.LG

Release Date: November 4, 2024

Authors: Amy Yang, Jingyi Yang1, Aya Ibrahim1, Xinfeng Xie1, Bangsheng Tang1, Grigory Sizov1, Jongsoo Park1, Jianyu Huang1

Affiliation: 1Meta Platforms, Inc., Menlo Park, California, USA

arXiv: http://arxiv.org/abs/2411.01783v1