Published on December 5

FlashSloth: Lightning Multimodal Large Language Models via Embedded Visual Compression

cs.CV

Released Date: December 5, 2024

Authors: Bo Tong¹, Bokai Lai, Yiyi Zhou, Gen Luo, Yunhang Shen, Ke Li, Xiaoshuai Sun, Rongrong Ji

Aff.: ¹Xiamen University, Key Laboratory of Multimedia Trusted Perception and Efficient Computing, P.R. China

arXiv: http://arxiv.org/pdf/2412.04317v1