Published on November 21

cs.CV

cs.AI

Release Date: November 21, 2024

Authors: Lei Jiang¹, Weizhe Huang¹, Tongxuan Liu¹, Yuting Zeng¹, Jing Li¹, Lechao Cheng², Xiaohua Xu¹

Affiliations: ¹University of Science and Technology of China; ²Hefei University of Technology

Model	Ratio	Accuracy Performance							Inference Efficiency
		AI2D	GQA	MMMU	SQA	POPE	TextVQA	OCRBench	TTFT (ms)	TTFT $S_{p}$	TPOT (ms/tok.)	TPOT $S_{p}$	GPU (GB)
LLaVA-NeXT-8B	100%	71.66	65.38	40.22	79.44	87.84	65.43	54.90	94	-	25.61	-	17.88
	75%	70.69	65.21	39.78	79.91	87.87	64.14	53.20	88	1.18x	25.40	1.01x	17.33
	50%	70.02	64.82	39.67	79.39	87.13	62.86	49.50	57	1.66x	24.80	1.03x	16.98
	25%	68.01	63.00	39.22	79.27	86.88	61.24	45.90	52	1.83x	24.05	1.07x	16.98
LLaVA-1.6-7B	100%	66.58	64.24	35.10	73.21	87.61	64.90	52.20	88	-	23.70	-	16.15
	75%	65.54	64.13	37.00	73.19	87.93	63.00	51.30	81	1.09x	23.55	1.01x	15.44
	50%	64.83	63.83	37.33	72.91	87.93	63.01	47.70	60	1.47x	23.05	1.03x	14.81
	25%	64.35	62.26	36.67	72.41	86.83	60.81	44.60	50	1.78x	21.97	1.08x	14.57
LLaVA-1.6-13B	100%	70.30	65.37	35.90	75.85	87.56	67.10	55.10	198	-	38.30	-	29.53
	75%	69.56	65.43	36.44	75.95	87.78	66.03	53.90	153	1.29x	36.06	1.06x	28.53
	50%	68.98	65.15	37.11	76.23	87.84	64.33	50.10	116	1.70x	33.35	1.15x	27.53
	25%	67.81	63.41	37.56	76.00	86.71	62.58	46.30	79	2.52x	30.94	1.24x	26.54
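
The $S_{p}$ columns are not defined in this excerpt. A minimal sketch of how they might be reproduced, assuming $S_{p}$ is the speedup over the 100% row of the same model (baseline latency divided by pruned latency). The helper `speedup` and the dictionary `ttft_13b` are hypothetical names introduced only for illustration; the latencies are copied from the LLaVA-1.6-13B rows above.

```python
def speedup(baseline: float, pruned: float) -> float:
    """Return baseline / pruned: how many times faster the pruned run is.

    Assumption: this is what the table's S_p column reports; the source
    excerpt does not state the definition.
    """
    if baseline <= 0 or pruned <= 0:
        raise ValueError("latencies must be positive")
    return baseline / pruned


# TTFT in ms for LLaVA-1.6-13B, keyed by token retention ratio (from the table).
ttft_13b = {"100%": 198.0, "75%": 153.0, "50%": 116.0, "25%": 79.0}

# Speedup of each pruned configuration relative to the unpruned baseline.
ttft_speedups = {
    ratio: round(speedup(ttft_13b["100%"], ms), 2)
    for ratio, ms in ttft_13b.items()
    if ratio != "100%"
}

print(ttft_speedups)
```

Recomputed from the rounded latencies, these values land within about 0.02 of the reported 1.29x, 1.70x, and 2.52x. The small gaps would be consistent with the reported speedups having been computed from unrounded timings, though that is a guess rather than something the table confirms.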