Refine
Year of publication
Document Type
- Conference Proceeding (29)
- Article (3)
- Report (3)
- Working Paper (2)
- Part of a Book (1)
- Preprint (1)
Has Fulltext
- yes (39)
Is part of the Bibliography
- no (39)
Keywords
- Semantik (5)
- Text Mining (5)
- Concreteness (4)
- Information Retrieval (4)
- Computerlinguistik (3)
- Distributional Semantics (3)
- German (3)
- Klassifikation (3)
- Machine Learning (3)
- Open Access (3)
- Automatische Klassifikation (2)
- Classification (2)
- Contract Analysis (2)
- Deutsch (2)
- Disambiguation (2)
- Informationsmanagement (2)
- Keyword Extraction (2)
- Konkretum <Linguistik> (2)
- Korpus <Linguistik> (2)
- Lemmatization (2)
- Maschinelles Lernen (2)
- Rechtswissenschaften (2)
- Sachtext (2)
- Sprachnorm (2)
- Thesaurus (2)
- Vergleich (2)
- Vertrag (2)
- Wikidata (2)
- Wikimedia Commons (2)
- Ähnlichkeit (2)
- Abbreviations (1)
- Abkürzung (1)
- Acronyms (1)
- Akronym (1)
- Algorithmus (1)
- Ambiguität (1)
- Automatische Identifikation (1)
- Automatische Lemmatisierung (1)
- Azyklischer gerichteter Graph (1)
- Benutzererlebnis (1)
- Bilderkennung (1)
- Bildersprache (1)
- Bildersuchmaschine (1)
- Clustering (1)
- Corpus construction (1)
- Deep Convolutional Networks (1)
- Dewey-Dezimalklassifikation (1)
- Disambiguierung (1)
- Distributionelle Semantik (1)
- Dokumentanalyse (1)
- Erschließung (1)
- Fassung (1)
- Feature and Text Extraction (1)
- Figurative Language (1)
- Formelhafte Textabschnitte (1)
- Graph-based Text Representations (1)
- Illustration (1)
- Image Recognition (1)
- Image Retrieval (1)
- Imagery (1)
- Indexierung <Inhaltserschließung> (1)
- Information Dissemination (1)
- Inhaltserschließung (1)
- Knowledge Maps (1)
- Krankenhaus (1)
- LCSH (1)
- LIG (1)
- Latent Semantic Analysis (1)
- Layout Detection (1)
- Legal Documents (1)
- Legal Writings (1)
- Legende <Bild> (1)
- Lexical Semantics (1)
- Library of Congress (1)
- Linear Indexed Grammars (1)
- Linguistics (1)
- Linguistische Informationswissenschaft (1)
- Markov Models (1)
- Medieninformatik (1)
- Medizinische Bibliothek (1)
- Morphemanalyse (1)
- Morphologie <Linguistik> (1)
- Morphology (1)
- Multimedia (1)
- Multimedia Information Retrieval (1)
- Multimedia Retrieval (1)
- Multimedien (1)
- Notation <Klassifikation> (1)
- Onomastik (1)
- Ortsnamen (1)
- PDF <Dateiformat> (1)
- PDF Document Analysis (1)
- POS Tagging (1)
- Paraphrase (1)
- Paraphrase Similarity (1)
- Part of Speech Tagging (1)
- Passage Retrieval (1)
- Phraseologie (1)
- Physics (1)
- Physik (1)
- Qualitätssicherung (1)
- Rechtsdokumente (1)
- Regional Development (1)
- Regional Innovation Systems (1)
- Regional Policy (1)
- Retrieval (1)
- Schlagwort (1)
- Schlagwortkatalog (1)
- Schlagwortnormdatei (1)
- Scientific Figures (1)
- Scientific image search (1)
- Segmentation (1)
- Segmentierung (1)
- Semantics (1)
- Similarity Measures (1)
- Speech Recognition (1)
- Spracherkennung (1)
- Standardised formulation (1)
- Standardisierung (1)
- Statistical Analysis (1)
- Statistical Methods (1)
- Statistische Analyse (1)
- Statistische Methoden (1)
- Structural Analysis (1)
- Synononym (1)
- Synonymie (1)
- Territorial Intelligence (1)
- Text Segmentation (1)
- Text Similarity (1)
- Text annotation (1)
- Textbooks (1)
- Title Matching (1)
- User Generated Content (1)
- Verbal Idioms (1)
- Versicherungsvertrag (1)
- Vertragsklausel (1)
- Video Segmentation (1)
- Wikipedia categories (1)
- Word Norms (1)
- Wort (1)
- XML (1)
- Zweiwortsatz (1)
- abstractness (1)
- concreteness (1)
- context vectors (1)
- distributional semantics (1)
- supervised machine learning (1)
- thesauri (1)
- word embedding space (1)
- Überwachtes Lernen (1)
Institute
To learn a subject, the acquisition of the associated technical language is important.
Despite this widely accepted importance of learning the technical language, hardly any studies are published that describe the characteristics of most technical languages that students are supposed to learn. This might largely be due to the absence of specialized text corpora to study such languages at lexical, syntactical and textual level. In the present paper we describe a corpus of German physics text that can be used to study the language used in physics. A large and a small variant are compiled. The small version of the corpus consists of 5.3 Million words and is available on request.