Using Word Embeddings for Unsupervised Acronym Disambiguation

Scientific papers from all disciplines contain many abbreviations and acronyms. In many cases these acronyms are ambiguous. We present a method to choose the contextually correct definition of an acronym that does not require training for each acronym and thus can be applied to a large number of different acronyms with only a few instances. From image captions in scientific papers from different domains, we constructed a set of 19,954 examples of 4,365 ambiguous acronyms, each with its contextually correct definition. We learn word embeddings for all words in the corpus and compare the averaged context vector of the words in the expansion of an acronym with the weighted average vector of the words in the context of the acronym. We show that this method clearly outperforms (classical) cosine similarity. Furthermore, we show that word embeddings learned from a 1 billion word corpus of scientific texts outperform word embeddings learned from much larger general corpora.
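
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The toy two-dimensional embeddings, the function names, and the optional word_weight callable are all hypothetical. The abstract does not say how context words are weighted or which measure compares the two vectors, so the sketch assumes uniform weights by default and cosine similarity between the averaged embedding vectors.

```python
import math

def average(vectors, weights=None):
    """Weighted mean of equal-length vectors (uniform weights by default)."""
    if weights is None:
        weights = [1.0] * len(vectors)
    total = sum(weights)  # assumed to be positive
    dim = len(vectors[0])
    return [sum(w * v[i] for v, w in zip(vectors, weights)) / total
            for i in range(dim)]

def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def disambiguate(context_words, candidates, embeddings, word_weight=None):
    """Return the candidate expansion whose averaged word vector is closest
    to the (optionally weighted) average vector of the context words.

    word_weight is a hypothetical hook for a weighting scheme; the abstract
    does not specify one.
    """
    ctx = [w for w in context_words if w in embeddings]
    if not ctx:
        return None  # no known context word, nothing to compare
    weights = [word_weight(w) for w in ctx] if word_weight else None
    ctx_vec = average([embeddings[w] for w in ctx], weights)

    best, best_score = None, -math.inf
    for expansion in candidates:
        words = [w for w in expansion.lower().split() if w in embeddings]
        if not words:
            continue  # expansion has no known word, skip it
        score = cosine(ctx_vec, average([embeddings[w] for w in words]))
        if score > best_score:
            best, best_score = expansion, score
    return best

# Toy embeddings, invented for illustration only.
EMBEDDINGS = {
    "computed": [0.9, 0.1], "tomography": [0.8, 0.2],
    "scan": [0.9, 0.2], "patient": [0.7, 0.1],
    "charge": [0.1, 0.9], "transfer": [0.2, 0.8],
    "electron": [0.1, 0.8],
}
CANDIDATES = ["computed tomography", "charge transfer"]

print(disambiguate(["scan", "of", "the", "patient"], CANDIDATES, EMBEDDINGS))
print(disambiguate(["electron", "donor"], CANDIDATES, EMBEDDINGS))
```

Because the candidate expansions are scored directly against the context, nothing has to be trained per acronym, which matches the property the abstract emphasizes.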

Metadata
Author: Jean Charbonnier, Christian Wartena
URN: urn:nbn:de:bsz:960-opus4-12653
Parent Title (English): Proceedings of the 27th International Conference on Computational Linguistics, Santa Fe, New Mexico, USA, August 20-26, 2018.
Document Type: Conference Proceeding
Language: English
Year of Completion: 2018
Creating Corporation: International Committee on Computational Linguistics (ICCL)
Release Date: 2018/10/22
Tag: Abbreviations; Acronyms; Disambiguation
GND Keyword: Abkürzung; Akronym; Ambiguität
First Page: 2610
Last Page: 2619
Institutes: Fakultät III - Medien, Information und Design
DDC classes: 020 Library and information sciences
Licence: Creative Commons - CC BY - Attribution 4.0 International