Published on November 21, 2024

Assessment of LLM Responses to End-user Security Questions

Categories: cs.CR, cs.AI, cs.CL, cs.HC

Released Date: November 21, 2024

Authors: Vijay Prakash¹, Kevin Lee², Arkaprabha Bhattacharya³, Danny Yuxing Huang¹, Jessica Staddon⁴

Affiliations: ¹NYU; ²JPMorganChase; ³Cornell Tech; ⁴Northeastern University

arXiv: http://arxiv.org/abs/2411.14571v1

Table: Fraction of LLM responses rated Correct, Thorough, Relevant, and Direct, overall and by category. The number of responses in each category is given in parentheses.

Categorization	Category	Accuracy (Correct)	Thoroughness (Thorough)	Relevancy (Relevant)	Directness (Direct)
Overall	All responses (900)	0.73	0.68	0.98	0.83
Security Category	Authentication (449)	0.69	0.69	0.98	0.83
	Scams (83)	0.82	0.70	1.00	0.83
	Safe browsing (50)	0.80	0.68	0.96	0.88
	Antivirus (163)	0.61	0.59	0.99	0.86
	Secure network/WiFi (46)	0.87	0.67	0.96	0.70
	Smart devices (19)	0.89	0.63	0.95	0.79
	Updates (90)	0.88	0.81	0.99	0.84
Knowledge Category	Factual (437)	0.76	0.72	0.98	0.84
	Conceptual (164)	0.72	0.77	0.98	0.89
	Procedural (299)	0.69	0.57	0.99	0.78
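
As a quick consistency check on the table above, the overall Accuracy value should be roughly the count-weighted mean of the per-category Accuracy values, since each grouping partitions the same 900 responses. The sketch below assumes the table entries are fractions of responses and that the numbers in parentheses are response counts. The names `SECURITY`, `KNOWLEDGE`, `total_count`, and `weighted_mean` are illustrative and do not come from the paper. Because the per-category values are rounded to two decimals, the agreement is only approximate.

```python
# Accuracy column from the table: (category, number of responses, fraction rated Correct).
# Assumption: table values are fractions and parenthesized numbers are response counts.
SECURITY = [
    ("Authentication", 449, 0.69),
    ("Scams", 83, 0.82),
    ("Safe browsing", 50, 0.80),
    ("Antivirus", 163, 0.61),
    ("Secure network/WiFi", 46, 0.87),
    ("Smart devices", 19, 0.89),
    ("Updates", 90, 0.88),
]

KNOWLEDGE = [
    ("Factual", 437, 0.76),
    ("Conceptual", 164, 0.72),
    ("Procedural", 299, 0.69),
]


def total_count(rows):
    """Total number of responses across all categories in a grouping."""
    return sum(count for _, count, _ in rows)


def weighted_mean(rows):
    """Count-weighted mean of the per-category fractions."""
    weighted_sum = sum(count * fraction for _, count, fraction in rows)
    return weighted_sum / total_count(rows)


if __name__ == "__main__":
    for name, rows in (("Security", SECURITY), ("Knowledge", KNOWLEDGE)):
        print(f"{name}: n={total_count(rows)}, "
              f"weighted accuracy={weighted_mean(rows):.3f}")
```

Both groupings sum to 900 responses, and both weighted means land within rounding distance of the reported overall accuracy of 0.73.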