Published on October 31
Parameter-Efficient Fine-Tuning Medical Multimodal Large Language Models for Medical Visual Grounding
cs.CV
cs.AI
Released Date: October 31, 2024
Authors: Jinlong He¹, Pengfei Li¹, Gang Liu¹, Shenjun Zhong²
Affiliations: ¹College of Computer Science and Technology, Harbin Engineering University, China; ²Monash Biomedical Imaging, Monash University, Australia

| Methods | Pneumonia | Pneumothorax | Consolidation | Atelectasis | Edema | Cardiomegaly | Lung Opacity | Pleural Effusion | mean | wIoU |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MSLL [7] | 0.425 | 0.106 | 0.386 | 0.388 | 0.294 | 0.330 | 0.325 | 0.368 | 0.328 | 0.308 |
| MedKLIP [4] | 0.297 | 0.091 | 0.265 | 0.323 | 0.327 | 0.395 | 0.197 | 0.216 | 0.264 | 0.267 |
| Biovil [6] | 0.328 | 0.137 | 0.297 | 0.275 | 0.213 | 0.406 | 0.188 | 0.224 | 0.259 | 0.281 |
| Gloria [5] | 0.290 | 0.116 | 0.304 | 0.303 | 0.201 | 0.408 | 0.197 | 0.330 | 0.269 | 0.282 |
| GPT-4v [10] | - | - | - | - | - | - | - | - | 0.0833 | - |
| Ours | 0.446 | 0.303 | 0.343 | 0.395 | 0.286 | 0.592 | 0.280 | 0.374 | 0.377 | 0.407 |
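
The "mean" column in the table above appears consistent with a plain unweighted average of the eight per-class scores. The sketch below checks that arithmetic for every row that reports per-class values. The names `ROWS`, `REPORTED_MEAN`, `unweighted_mean`, and `check_means` are illustrative, and the rounding tolerance is an assumption. The wIoU column is not recomputed, because the per-class weights it would need are not given here.

```python
# Per-class scores copied from the table, in column order:
# Pneumonia, Pneumothorax, Consolidation, Atelectasis, Edema,
# Cardiomegaly, Lung Opacity, Pleural Effusion.
ROWS = {
    "MSLL": [0.425, 0.106, 0.386, 0.388, 0.294, 0.330, 0.325, 0.368],
    "MedKLIP": [0.297, 0.091, 0.265, 0.323, 0.327, 0.395, 0.197, 0.216],
    "Biovil": [0.328, 0.137, 0.297, 0.275, 0.213, 0.406, 0.188, 0.224],
    "Gloria": [0.290, 0.116, 0.304, 0.303, 0.201, 0.408, 0.197, 0.330],
    "Ours": [0.446, 0.303, 0.343, 0.395, 0.286, 0.592, 0.280, 0.374],
}

# Values from the table's "mean" column (GPT-4v is omitted: no per-class scores).
REPORTED_MEAN = {
    "MSLL": 0.328,
    "MedKLIP": 0.264,
    "Biovil": 0.259,
    "Gloria": 0.269,
    "Ours": 0.377,
}


def unweighted_mean(scores):
    """Arithmetic mean of the per-class scores."""
    return sum(scores) / len(scores)


def check_means(rows, reported, tol=0.001):
    """Map each method to True if its recomputed mean is within tol of the reported one.

    tol is an assumed allowance for the table's three-decimal rounding.
    """
    return {
        name: abs(unweighted_mean(scores) - reported[name]) <= tol
        for name, scores in rows.items()
    }


if __name__ == "__main__":
    for name, ok in check_means(ROWS, REPORTED_MEAN).items():
        print(f"{name}: recomputed {unweighted_mean(ROWS[name]):.4f}, "
              f"reported {REPORTED_MEAN[name]:.3f}, consistent={ok}")
```

Under this check, all five rows with per-class values agree with their reported mean to within rounding.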