notesum.ai

Published on December 5

CA-SSLR: Condition-Aware Self-Supervised Learning Representation for Generalized Speech Processing

Categories: eess.AS, cs.CL, cs.LG, cs.SD

Release Date: December 5, 2024

Authors: Yen-Ju Lu¹, Jing Liu, Thomas Thebaud, Laureano Moro-Velazquez, Ariya Rastrow, Najim Dehak, Jesus Villalba

Aff.: ¹Johns Hopkins University

Arxiv: http://arxiv.org/pdf/2412.04425v1

ASR Adapted.	Bottleneck Dims.	ASR CER $\downarrow$ (Normal)	ASR CER $\downarrow$ (Few-shots)	SV EER $\downarrow$	SV DCF $\downarrow$
XLSR	-	29.0	39.0	1.29	0.093
+ ASR-FT	-	17.1	32.2	1.29	0.095
+ ASR-Houlsby	256	20.3	34.6	1.37	0.097
+ ASR-CA-XLSR$^{L}$ (ours)	256	18.6	31.6	1.15	0.088