Unsupervised Methods for Domain Specific Ambiguity Detection. The Case of German Physics Language
- Many terms used in physics have a different meaning or usage pattern in general language, constituting a learning barrier in physics teaching. The systematic identification of such terms is considered to be useful for science education as well as for terminology extraction. This article compares three methods based on vector semantics and a simple frequency-based baseline for automatically identifying terms used in general language with domain-specific use in physics. For evaluation, we use ambiguity scores from a survey among physicists and data about the number of term senses from Wiktionary. We show that the so-called Vector Initialization method obtains the best results.
Author: | Vitor Fontanella, Christian Wartena, Gunnar Friege |
---|---|
URN: | urn:nbn:de:bsz:960-opus4-32098 |
URL: | https://aclanthology.org/2023.iwcs-1.26 |
DOI: | https://doi.org/10.25968/opus-3209 |
Parent Title (English): | Proceedings of the 15th International Conference on Computational Semantics |
Publisher: | Association for Computational Linguistics |
Editor: | Maxime Amblard, Ellen Breitholtz |
Document Type: | Conference Proceeding |
Language: | English |
Year of Completion: | 2023 |
Publishing Institution: | Hochschule Hannover |
Release Date: | 2024/08/07 |
GND Keyword: | Physics; Terminology; Ambiguity; Automatic identification |
First Page: | 252 |
Last Page: | 257 |
Institutes: | Fakultät III - Medien, Information und Design |
DDC classes: | 020 Library and information sciences |
Licence: | Creative Commons - CC BY - Attribution 4.0 International |