Published on November 15

VMID: A Multimodal Fusion LLM Framework for Detecting and Identifying Misinformation of Short Videos

cs.CV

cs.AI

Release Date: November 15, 2024

Authors: Weihao Zhong¹, Yinhao Xiao¹, Minghui Xu², Xiuzhen Cheng²

Aff.: ¹School of Information Science, Guangdong University of Finance and Economics, Guangdong Intelligent Business Engineering Technology Center, Key Laboratory of Collaborative Innovation in Digital Economy, Guangzhou, China; ²School of Computer Science, Shandong University, Qingdao, China

Arxiv: http://arxiv.org/abs/2411.10032v1

Method	Accuracy	F1 Score	Precision	Recall
Hou et al., 2019[19]	71.89%	71.29%	73.88%	71.89%
Medina et al., 2020[20]	75.58%	75.50%	75.92%	75.58%
Choi and Ko, 2021[21]	78.32%	78.31%	78.37%	78.32%
Shang et al., 2021[22]	74.45%	74.39%	74.67%	74.45%
SV-FEND[1]	81.05%	81.02%	81.24%	81.05%
SVRPM[17]	79.34%	78.55%	79.75%	78.16%
VMID$_{\text{Qwen2.5}}$ (Ours)	90.93%	90.89%	90.88%	90.93%
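
The table reports four metrics per method but does not state how precision, recall, and F1 were averaged across classes. In most rows recall equals accuracy, which is consistent with support-weighted averaging, because weighted recall is always identical to accuracy. The following is a minimal sketch of that computation under this assumption. The function name `weighted_metrics` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1.

    Assumption: per-class scores are averaged with weights proportional
    to each class's share of the true labels.
    """
    labels = sorted(set(y_true) | set(y_pred))
    n = len(y_true)
    support = Counter(y_true)      # true examples per class
    predicted = Counter(y_pred)    # predictions per class
    # Correct predictions per class (true positives).
    tp = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(tp.values()) / n
    precision = recall = f1 = 0.0
    for c in labels:
        p_c = tp[c] / predicted[c] if predicted[c] else 0.0
        r_c = tp[c] / support[c] if support[c] else 0.0
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        w = support[c] / n         # class weight
        precision += w * p_c
        recall += w * r_c
        f1 += w * f_c
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy labels (1 = misinformation, 0 = genuine); not paper data.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
m = weighted_metrics(y_true, y_pred)
print(m)
```

Under this weighting, `m["recall"]` and `m["accuracy"]` coincide for any input, which matches the pattern seen in most rows of the table. A row where the two differ, such as SVRPM, suggests a different averaging scheme or a different source for those figures.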