
Published on December 5

Distributed Inference with Minimal Off-Chip Traffic for Transformers on Low-Power MCUs

cs.AR

Release Date: December 5, 2024

Authors: Severin Bochem [1], Victor J. B. Jung, Arpan Prasad [2], Francesco Conti [3], Luca Benini [4]

Aff.: [1] D-ITET, ETH Zurich, Switzerland; [2] Integrated Systems Laboratory, ETH Zurich, Switzerland; [3] DEI, and Information Engineering, University of Bologna, Italy; [4] Integrated Systems Laboratory, ETH Zurich, Switzerland; DEI, and Information Engineering, University of Bologna, Italy

arXiv: http://arxiv.org/pdf/2412.04372v1