Document Type
- Conference Proceeding (29)
- Article (3)
- Report (3)
- Working Paper (2)
- Part of a Book (1)
- Preprint (1)
Has Fulltext
- yes (39)
Is part of the Bibliography
- no (39)
Keywords
- Semantik (5)
- Text Mining (5)
- Concreteness (4)
- Information Retrieval (4)
- Computerlinguistik (3)
- Distributional Semantics (3)
- German (3)
- Klassifikation (3)
- Machine Learning (3)
- Open Access (3)
- Automatische Klassifikation (2)
- Classification (2)
- Contract Analysis (2)
- Deutsch (2)
- Disambiguation (2)
- Informationsmanagement (2)
- Keyword Extraction (2)
- Konkretum <Linguistik> (2)
- Korpus <Linguistik> (2)
- Lemmatization (2)
- Maschinelles Lernen (2)
- Rechtswissenschaften (2)
- Sachtext (2)
- Sprachnorm (2)
- Thesaurus (2)
- Vergleich (2)
- Vertrag (2)
- Wikidata (2)
- Wikimedia Commons (2)
- Ähnlichkeit (2)
- Abbreviations (1)
- Abkürzung (1)
- Acronyms (1)
- Akronym (1)
- Algorithmus (1)
- Ambiguität (1)
- Automatische Identifikation (1)
- Automatische Lemmatisierung (1)
- Azyklischer gerichteter Graph (1)
- Benutzererlebnis (1)
- Bilderkennung (1)
- Bildersprache (1)
- Bildersuchmaschine (1)
- Clustering (1)
- Corpus construction (1)
- Deep Convolutional Networks (1)
- Dewey-Dezimalklassifikation (1)
- Disambiguierung (1)
- Distributionelle Semantik (1)
- Dokumentanalyse (1)
- Erschließung (1)
- Fassung (1)
- Feature and Text Extraction (1)
- Figurative Language (1)
- Formelhafte Textabschnitte (1)
- Graph-based Text Representations (1)
- Illustration (1)
- Image Recognition (1)
- Image Retrieval (1)
- Imagery (1)
- Indexierung <Inhaltserschließung> (1)
- Information Dissemination (1)
- Inhaltserschließung (1)
- Knowledge Maps (1)
- Krankenhaus (1)
- LCSH (1)
- LIG (1)
- Latent Semantic Analysis (1)
- Layout Detection (1)
- Legal Documents (1)
- Legal Writings (1)
- Legende <Bild> (1)
- Lexical Semantics (1)
- Library of Congress (1)
- Linear Indexed Grammars (1)
- Linguistics (1)
- Linguistische Informationswissenschaft (1)
- Markov Models (1)
- Medieninformatik (1)
- Medizinische Bibliothek (1)
- Morphemanalyse (1)
- Morphologie <Linguistik> (1)
- Morphology (1)
- Multimedia (1)
- Multimedia Information Retrieval (1)
- Multimedia Retrieval (1)
- Multimedien (1)
- Notation <Klassifikation> (1)
- Onomastik (1)
- Ortsnamen (1)
- PDF <Dateiformat> (1)
- PDF Document Analysis (1)
- POS Tagging (1)
- Paraphrase (1)
- Paraphrase Similarity (1)
- Part of Speech Tagging (1)
- Passage Retrieval (1)
- Phraseologie (1)
- Physics (1)
- Physik (1)
- Qualitätssicherung (1)
- Rechtsdokumente (1)
- Regional Development (1)
- Regional Innovation Systems (1)
- Regional Policy (1)
- Retrieval (1)
- Schlagwort (1)
- Schlagwortkatalog (1)
- Schlagwortnormdatei (1)
- Scientific Figures (1)
- Scientific image search (1)
- Segmentation (1)
- Segmentierung (1)
- Semantics (1)
- Similarity Measures (1)
- Speech Recognition (1)
- Spracherkennung (1)
- Standardised formulation (1)
- Standardisierung (1)
- Statistical Analysis (1)
- Statistical Methods (1)
- Statistische Analyse (1)
- Statistische Methoden (1)
- Structural Analysis (1)
- Synonym (1)
- Synonymie (1)
- Territorial Intelligence (1)
- Text Segmentation (1)
- Text Similarity (1)
- Text annotation (1)
- Textbooks (1)
- Title Matching (1)
- User Generated Content (1)
- Verbal Idioms (1)
- Versicherungsvertrag (1)
- Vertragsklausel (1)
- Video Segmentation (1)
- Wikipedia categories (1)
- Word Norms (1)
- Wort (1)
- XML (1)
- Zweiwortsatz (1)
- abstractness (1)
- concreteness (1)
- context vectors (1)
- distributional semantics (1)
- supervised machine learning (1)
- thesauri (1)
- word embedding space (1)
- Überwachtes Lernen (1)
Automatic classification of scientific records using the German Subject Heading Authority File (SWD)
(2012)
This paper presents an automatic text classification method that does not require training documents. The method uses the German Subject Heading Authority File (SWD), provided by the linked data service of the German National Library. The SWD was recently enriched with notations of the Dewey Decimal Classification (DDC), which makes it possible to use the subject headings as textual representations of the DDC notations. In essence, we derive the classification of a text from the classifications that the thesaurus assigns to the words in the text. The method was tested by classifying 3826 OAI records from 7 different repositories, with mean reciprocal rank and recall as evaluation measures. A direct comparison with a machine learning method showed that this method is competitive. We conclude that the enriched version of the SWD provides high-quality information with broad coverage for the classification of German scientific articles.
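The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the heading-to-notation table is a tiny hand-made stand-in for the enriched SWD, the function names `classify` and `mean_reciprocal_rank` are hypothetical, and simple vote counting is only one possible way to combine the word-level classifications.

```python
from collections import Counter

# Hypothetical toy excerpt: subject heading -> DDC notations.
# Illustrative only; this is not real SWD data.
HEADING_TO_DDC = {
    "thesaurus": ["020"],
    "klassifikation": ["020"],
    "semantik": ["400"],
    "morphologie": ["400"],
    "physik": ["530"],
}


def classify(text, heading_to_ddc=HEADING_TO_DDC):
    """Rank DDC notations by how many words of the text map to them."""
    votes = Counter()
    for token in text.lower().split():
        word = token.strip(".,;:()\"'")
        for notation in heading_to_ddc.get(word, []):
            votes[notation] += 1
    # Sort by descending vote count, then by notation for a stable order.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [notation for notation, _ in ranked]


def mean_reciprocal_rank(rankings, gold):
    """Average of 1/rank of the correct notation; 0 if it is not ranked."""
    total = 0.0
    for ranking, correct in zip(rankings, gold):
        if correct in ranking:
            total += 1.0 / (ranking.index(correct) + 1)
    return total / len(gold)


if __name__ == "__main__":
    docs = ["Semantik und Morphologie in der Physik", "Thesaurus"]
    rankings = [classify(d) for d in docs]
    print(rankings)
    print(mean_reciprocal_rank(rankings, ["530", "020"]))
```

No training documents are involved: the ranking comes entirely from the lookup table, which is the property the abstract emphasizes. A realistic version would also need lemmatization and multi-word heading matching, which this sketch leaves out.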