Published on December 6

Probing the contents of semantic representations from text, behavior, and brain data using the psychNorms metabase

Categories: cs.CL, cs.AI, cs.LG

Released Date: December 6, 2024

Authors: Zak Hussain¹, Rui Mata, Ben R. Newell, Dirk U. Wulff

Aff.: ¹Faculty of Psychology, University of Basel, Basel, Switzerland

arXiv: http://arxiv.org/pdf/2412.04936v1

Representation	Description
fastText CommonCrawl	fastText architecture (Mikolov et al., 2018), trained on CommonCrawl.
GloVe CommonCrawl	GloVe architecture (Pennington et al., 2014), trained on CommonCrawl.
LexVec CommonCrawl	LexVec architecture (Salle et al., 2016), trained on CommonCrawl.
fastText Wiki News	fastText architecture (Mikolov et al., 2018), trained on Wikipedia 2017, UMBC webbase, and statmt.org news.
CBOW GoogleNews	CBOW architecture (Mikolov et al., 2013) trained on Google News.
fastTextSub OpenSub	fastText subword architecture (Mikolov et al., 2018) trained on OpenSubtitles (Van Paridon and Thompson, 2021).
GloVe Wikipedia	GloVe architecture (Pennington et al., 2014) trained on Wikipedia 2014.
spherical text Wikipedia	Spherical text architecture (Meng et al., 2019) trained on Wikipedia 2019.
GloVe Twitter	GloVe architecture (Pennington et al., 2014) trained on Twitter.
morphoNLM	Recurrent neural network architecture fine-tuned on morphologically informative examples (Luong et al., 2013).
norms sensorimotor	Ratings of 6 perceptual modalities and 5 action effectors (Lynott et al., 2020).
SGSoftMax[In/Out]put SWOW*	[Cue/Response] vectors from Skip-gram softmax architecture (as in, e.g., Goldberg and Levy, 2014) trained on SWOW (De Deyne et al., 2019).
PPMI SVD SWOW*	Positive pointwise mutual information (PPMI) followed by singular value decomposition (SVD) of the SWOW cue-response matrix (following, e.g., Richie and Bhatia, 2021; Hussain et al., 2024b).
PPMI SVD EAT*	PPMI followed by SVD of the Edinburgh Associative Thesaurus cue-response matrix (EAT, Kiss et al., 1973).
SVD similarity relatedness*	SVD of a similarity matrix of aggregated and normalized similarity and relatedness judgment datasets.
feature overlap	Cosine similarity matrix of overlapping feature frequency percentages between cue pairs in a feature listing task (Buchanan et al., 2019).
THINGS	Neural network embedding trained to predict odd-one-out judgments of image triplets (Hebart et al., 2020).
experiential attributes	Human ratings on 65 attributes comprising sensory, motor, spatial, temporal, affective, social, and cognitive experiences (Binder et al., 2016).
eye tracking	Features extracted from gaze patterns while reading, aggregated by Hollenstein et al. (2019) from 7 datasets.
EEG text	Electrode measures while reading sentences (Hollenstein et al., 2018).
EEG speech	Electrode measures while listening to sentences (Broderick et al., 2018).
fMRI text hyper align	1000 randomly-sampled voxels while reading sentences (Wehbe et al., 2014), processed by Hollenstein et al. (2019) and hyper-aligned* across individuals (Heusser et al., 2017).
microarray	Neuron-level recordings while listening to sentences (Jamali et al., 2024).
fMRI speech hyper align	6 regions of interest while listening to sentences, collected and processed by Brennan et al. (2016), and hyper-aligned* across individuals.
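
Several behavioral representations in the table (PPMI SVD SWOW, PPMI SVD EAT) are described as positive pointwise mutual information followed by singular value decomposition of a cue-response matrix. The following is a minimal sketch of that general recipe on a toy count matrix, not the authors' pipeline. The cue and response words, the counts, the helper names `ppmi` and `svd_embed`, the base-2 logarithm, the scaling of the left singular vectors by the singular values, and the choice of `k = 2` dimensions are all assumptions made for illustration.

```python
import numpy as np


def ppmi(counts):
    """Positive pointwise mutual information of a cue-by-response count matrix."""
    counts = np.asarray(counts, dtype=float)
    total = counts.sum()
    row_sums = counts.sum(axis=1, keepdims=True)
    col_sums = counts.sum(axis=0, keepdims=True)
    # Counts expected if cues and responses were statistically independent.
    expected = row_sums * col_sums / total
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log2(counts / expected)
    # Zero counts give -inf (or nan for empty rows/columns); treat them as 0.
    pmi[~np.isfinite(pmi)] = 0.0
    # Keep only positive associations.
    return np.maximum(pmi, 0.0)


def svd_embed(matrix, k):
    """Rank-k cue vectors: left singular vectors scaled by singular values.

    This scaling is one common convention, assumed here for illustration.
    """
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] * s[:k]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical toy data: rows are cues, columns are responses.
cues = ["dog", "cat", "car"]
responses = ["dog", "cat", "wheel", "pet"]
counts = np.array([
    [0, 8, 1, 6],  # responses given to the cue "dog"
    [7, 0, 0, 5],  # responses given to the cue "cat"
    [1, 0, 9, 0],  # responses given to the cue "car"
])

weights = ppmi(counts)
vectors = svd_embed(weights, k=2)
sim_dog_cat = cosine(vectors[0], vectors[1])
sim_dog_car = cosine(vectors[0], vectors[2])
```

In practice the cue-response matrix would be large and sparse, so a truncated solver such as `scipy.sparse.linalg.svds` is a likely substitute for the dense `np.linalg.svd` call used in this sketch.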