Published on October 31

Parameter-Efficient Fine-Tuning Medical Multimodal Large Language Models for Medical Visual Grounding

Categories: cs.CV, cs.AI

Released Date: October 31, 2024

Authors: Jinlong He¹, Pengfei Li¹, Gang Liu¹, Shenjun Zhong²

Affiliations: ¹College of Computer Science and Technology, Harbin Engineering University, China; ²Monash Biomedical Imaging, Monash University, Australia

arXiv: http://arxiv.org/abs/2410.23822v1

Method	Pneumonia	Pneumothorax	Consolidation	Atelectasis	Edema	Cardiomegaly	Lung Opacity	Pleural Effusion	Mean	wIoU
MSLL [7]	0.425	0.106	0.386	0.388	0.294	0.33	0.325	0.368	0.328	0.308
MedKLIP [4]	0.297	0.091	0.265	0.323	0.327	0.395	0.197	0.216	0.264	0.267
Biovil [6]	0.328	0.137	0.297	0.275	0.213	0.406	0.188	0.224	0.259	0.281
Gloria [5]	0.29	0.116	0.304	0.303	0.201	0.408	0.197	0.33	0.269	0.282
GPT-4v [10]	-	-	-	-	-	-	-	-	0.0833	-
Ours	0.446	0.303	0.343	0.395	0.286	0.592	0.28	0.374	0.377	0.407
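
The mean column in the table above appears to be the unweighted average of the eight per-class scores. The short Python sketch below checks that reading against the reported values. The dictionary names, the helper functions, and the 0.001 rounding tolerance are illustrative assumptions, not part of the paper. The wIoU column is not reproduced, because it presumably depends on per-class weights that the table does not list. GPT-4v is omitted because no per-class scores are reported for it.

```python
# Per-class scores copied from the table, in column order:
# Pneumonia, Pneumothorax, Consolidation, Atelectasis, Edema,
# Cardiomegaly, Lung Opacity, Pleural Effusion.
SCORES = {
    "MSLL":    [0.425, 0.106, 0.386, 0.388, 0.294, 0.33, 0.325, 0.368],
    "MedKLIP": [0.297, 0.091, 0.265, 0.323, 0.327, 0.395, 0.197, 0.216],
    "Biovil":  [0.328, 0.137, 0.297, 0.275, 0.213, 0.406, 0.188, 0.224],
    "Gloria":  [0.29, 0.116, 0.304, 0.303, 0.201, 0.408, 0.197, 0.33],
    "Ours":    [0.446, 0.303, 0.343, 0.395, 0.286, 0.592, 0.28, 0.374],
}

# Values from the table's mean column.
REPORTED_MEAN = {
    "MSLL": 0.328,
    "MedKLIP": 0.264,
    "Biovil": 0.259,
    "Gloria": 0.269,
    "Ours": 0.377,
}


def unweighted_mean(values):
    """Plain arithmetic mean over the per-class scores."""
    return sum(values) / len(values)


def check_means(tol=1e-3):
    """Return, per method, whether the recomputed mean matches the table.

    tol is an assumed tolerance covering three-decimal rounding.
    """
    return {
        method: abs(unweighted_mean(scores) - REPORTED_MEAN[method]) <= tol
        for method, scores in SCORES.items()
    }


if __name__ == "__main__":
    for method, ok in check_means().items():
        status = "matches" if ok else "differs from"
        print(f"{method}: recomputed mean {status} the table")
```

Under this assumption, every recomputed average agrees with the reported mean to within rounding.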