notesum.ai

Published on October 21

Pre-training Distillation for Large Language Models: A Design Space Exploration

cs.LG
cs.AI
cs.CL

Released Date: October 21, 2024

Authors: Hao Peng¹, Xin Lv², Yushi Bai¹, Zijun Yao¹, Jiajie Zhang¹, Lei Hou¹, Juanzi Li¹

Affiliations: ¹Tsinghua University; ²Zhipu AI

Arxiv: https://arxiv.org/abs/2410.16215v1