MetaphorShare: A Dynamic Collaborative Repository of Open Metaphor Datasets
cs.CL
Released Date: November 27, 2024
Authors: Joanne Boisson, Arif Mehmood, Jose Camacho-Collados
Affiliation: Cardiff NLP, School of Computer Science and Informatics, Cardiff University, United Kingdom

| Dataset name | Reference | License | N | N dist. ctxt. | N dist. expr. | % met. | Expr. PoS | Domain/Source |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Psycholinguistics | | | | | | | | |
| JANK | Jankowiak (2020) | CC BY 4.0 | 240 [2] | 240 | 120 | 50 | N | constructed examples |
| CARD_V | Cardillo et al. (2010) | CC BY-NC | 280 | 280 | 140 | 50 | V | constructed examples |
| CARD_N | Cardillo et al. (2010, 2017) | CC BY-NC | 512 | 512 | 256 | 50 | N | constructed examples |
| NLP | | | | | | | | |
| MOH | Mohammad et al. (2016) | see data page [3] | 1632 [4] | 1632 | 439 | 25 | V | WordNet examples |
| NewsMet | Joseph et al. (2023) | Apache-2.0 | 1205 [5] | 1205 | 477 | 49 | V | Fake News Corpus [6] |
| TSV_A | Tsvetkov et al. (2014) | see data page [7] | 1945 | 1072 | 687 | 50 | A | various websites |
| GUT | Gutiérrez et al. (2016) | AFL-3.0 | 8591 | 3479 | 23 | 54 | A | Wikipedia, UKWaC [8] |
| Multi-Word Expressions | | | | | | | | |
| PVC | Tu and Roth (2012) | no license [9] | 1348 | 1348 | 23 | 65 | V-Prep | BNC |
| MAD | Tayyar Madabushi et al. (2022b) | GPL-3.0 | 4558 | 4554 | 251 | 48 | NC | Common Crawl |
| MAGPIE | Haagsma et al. (2020) | CC-BY-4.0 | 48395 | 47283 | 9307 | 75 | Various | BNC, PMB [10] |
| MIPVU | | | | | | | | |
| VUAC_BO | Boisson et al. (2023) | CC BY-SA 3.0 | 39223 | 11476 | 8674 | 52 | Various | VUAC |
| TONG | Tong et al. (2024) | CC-BY-4.0 | 1428 [11] | 739 | 861 | 46 | Various | VUAC & paraphrases |

[2] Anomalous sentences are not shown in the table.
[3] https://saifmohammad.com/WebPages/metaphor.html
[4] The original dataset contains 1639 instances. A few duplicated example sentences caused by orthographic variants of the target word, such as distil/distill, have been removed.
[5] We show the manually labelled sentences (named gold by the authors).
[6] https://github.com/several27/FakeNewsCorpus
[7] https://github.com/ytsvetko/metaphor/blob/master/LICENSE.md
[8] Ferraresi et al. (2008)…
[9] https://cogcomp.seas.upenn.edu/page/resource_view/26
[10] Abzianidze et al. (2017).
[11] Original VUAC sentences and apt (original dataset label) paraphrases are counted in the table.
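
As a rough illustration of how the count columns in the table above could be derived, the sketch below computes them for a toy list of labelled instances. The record fields (`context`, `expression`, `is_metaphor`), the `dataset_stats` helper, and the toy sentences are hypothetical and not taken from MetaphorShare. The column readings (distinct contexts, distinct expressions, percentage of metaphorical instances) are assumed from the header abbreviations.

```python
from typing import NamedTuple


class Instance(NamedTuple):
    # Hypothetical record layout, assumed for illustration only.
    context: str       # sentence containing the target expression
    expression: str    # target expression (e.g. a lemma)
    is_metaphor: bool  # True if the instance is labelled metaphorical


def dataset_stats(instances: list[Instance]) -> dict[str, int]:
    """Compute N, distinct contexts, distinct expressions, and % metaphorical."""
    n = len(instances)
    n_met = sum(inst.is_metaphor for inst in instances)
    return {
        "n": n,
        "n_distinct_contexts": len({inst.context for inst in instances}),
        "n_distinct_expressions": len({inst.expression for inst in instances}),
        # Rounded to a whole percentage; 0 for an empty dataset.
        "pct_metaphor": round(100 * n_met / n) if n else 0,
    }


# Toy data, invented for this example.
toy = [
    Instance("He devoured the novel.", "devour", True),
    Instance("He devoured the sandwich.", "devour", False),
    Instance("She grasped the idea.", "grasp", True),
    Instance("She grasped the rope.", "grasp", False),
]
stats = dataset_stats(toy)
print(stats)
```

Under this reading, a dataset whose N equals its number of distinct contexts (as in several rows of the table) has one labelled expression per sentence, while a smaller distinct-context count means some sentences are shared across instances.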