Published on November 26

"Stupid robot, I want to speak to a human!" User Frustration Detection in Task-Oriented Dialog Systems

cs.CL

Released Date: November 26, 2024

Authors: Mireia Hernandez Caralt¹, Ivan Sekulić, Filip Carević, Nghia Khau¹, Diana Nicoleta Popa¹, Bruna Guedes¹, Victor Guimarães¹, Zeyu Yang¹, Andre Manso¹, Meghana Reddy¹, Paolo Rosso¹, Roland Mathis¹

Aff.: ¹Telepathy Labs GmbH, Zürich, Switzerland

arXiv: http://arxiv.org/abs/2411.17437v1

UF = 0: not frustrated; UF = 1: frustrated. P = precision, R = recall.

| Model | P (UF = 0) | R (UF = 0) | $F_{1}$ (UF = 0) | P (UF = 1) | R (UF = 1) | $F_{1}$ (UF = 1) | Macro-$F_{1}$ |
|---|---|---|---|---|---|---|---|
| Sentiment Analysis Hartmann et al. (2023) | | | | | | | |
| RoBERTa-Sent-FC | 0.73 | 0.11 | 0.18 | 0.34 | 0.92 | 0.50 | 0.34 |
| RoBERTa-Sent-LU | 0.77 | 0.72 | 0.75 | 0.51 | 0.58 | 0.54 | 0.65 |
| Emotion Detection | | | | | | | |
| DistilBERT-EmoWoZ-FC Huang (2024) | 0.70 | 0.77 | 0.73 | 0.42 | 0.34 | 0.38 | 0.56 |
| DistilBERT-EmoWoZ-LU | 0.74 | 0.68 | 0.71 | 0.45 | 0.52 | 0.48 | 0.60 |
| DistilRoBERTa-Emo-FC Hartmann (2022) | 0.67 | 1.00 | 0.80 | 0.00 | 0.00 | 0.00 | 0.40 |
| DistilRoBERTa-Emo-LU | 0.68 | 1.00 | 0.81 | 0.88 | 0.04 | 0.07 | 0.44 |
| Dialog Breakdown Detection | | | | | | | |
| DBD+LogReg Bodigutla et al. (2020) | 0.78 | 0.93 | 0.85 | 0.78 | 0.46 | 0.58 | 0.71 |
| Rule-Based Approach | | | | | | | |
| Keyword Matching | 0.67 | 1.00 | 0.80 | 1.00 | 0.01 | 0.01 | 0.41 |
| LLM-based ICL Approach | | | | | | | |
| GPT-4o-zero-shot | 0.98 | 0.83 | 0.90 | 0.74 | 0.96 | 0.83 | 0.86 |
| GPT-4o-two-shot | 0.85 | 0.97 | 0.91 | 0.92 | 0.66 | 0.77 | 0.84 |
| Llama-3.1-405B-zero-shot | 0.99 | 0.74 | 0.85 | 0.67 | 0.99 | 0.79 | 0.83 |
| Llama-3.1-405B-two-shot | 0.97 | 0.84 | 0.90 | 0.75 | 0.96 | 0.84 | 0.87 |
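
The columns in the table above are related by standard formulas. As a minimal sketch, the snippet below recomputes the per-class $F_{1}$ and the Macro-$F_{1}$ for the GPT-4o-two-shot row from its printed precision and recall. It assumes that Macro-$F_{1}$ is the unweighted mean of the two per-class $F_{1}$ scores, which is the usual definition but is not spelled out here. The helper names `f1` and `macro_f1` are illustrative, not taken from the paper.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; defined as 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(per_class_f1) -> float:
    """Unweighted mean of per-class F1 scores (assumed definition of Macro-F1)."""
    scores = list(per_class_f1)
    return sum(scores) / len(scores)


# GPT-4o-two-shot row: (P, R) for UF = 0 and UF = 1, as printed in the table.
f1_not_frustrated = f1(0.85, 0.97)
f1_frustrated = f1(0.92, 0.66)

print(round(f1_not_frustrated, 2))  # 0.91
print(round(f1_frustrated, 2))      # 0.77
print(round(macro_f1([f1_not_frustrated, f1_frustrated]), 2))  # 0.84
```

The zero-division guard covers rows such as DistilRoBERTa-Emo-FC, where precision and recall for UF = 1 are both 0.00. Because the table reports values rounded to two decimals, recomputing other rows from the printed P and R may differ from the printed $F_{1}$ by 0.01.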