All of us are aware of the changes in the information field in recent years. We all see the paradigm shift coming and have some idea of how it will challenge our profession in the future. But what will the road to excellence in the education of future information specialists look like? There are different models (new and old ones) for reorganising the structure of education:
* Integration
* Specialisation
* Step-by-step model
* Module system
* Network system / combination model
The paper will present the current state of discussion on building a new curriculum at the Department of Information and Communication (IK) at the FH Hannover. Based on the mission statement of the department »Education of information professionals is a part of the dynamic evolution of knowledge society«, the direction of change and the main goals will be presented. The different reorganisation models will be explained with their objectives, opportunities and forms of implementation. Some examples will show the ideas and tools for a first draft of a reconstruction plan to become fit for the future. This talk was given at the German-Dutch University Conference »Information Specialists for the 21st Century« at the Fachhochschule Hannover - University of Applied Sciences, Department of Information and Communication, October 14-15, 1999 in Hannover, Germany.
Data and Information Science: Book of Abstracts at BOBCATSSS 2022 Hybrid Conference, 23rd - 25th of May 2022, Debrecen.
This year marks the 30th anniversary of the BOBCATSSS. The BOBCATSSS is an international, annual symposium designed for librarians and information professionals in a rapidly changing environment. Over the past 30 years, the conference has included exciting topics, great venues, interested guests and engaging presenters.
This year we would like to introduce the topics of the many papers in the Book of Abstracts, which are presented for the first time both in person at the University of Debrecen and in hybrid form. The Book of Abstracts provides an overview of all presentations given at BOBCATSSS. Presentations are listed in alphabetical order by title and include speeches, Pecha Kuchas, posters and workshops.
The theme of BOBCATSSS is Data and Information Science. Data and information are the basis for decisions and processes in business, politics and science, which makes them particularly important in the current era of digital transformation. This is exactly where this year's subthemes come in. They deal with data science, openness and institutional roles.
Lemmatization is a central task in many NLP applications. Despite this importance, the number of (freely) available and easy-to-use tools for German is very limited. To fill this gap, we developed a simple lemmatizer that can be trained on any lemmatized corpus. For a full-form word, the tagger tries to find the sequence of morphemes that is most likely to generate that word. From this sequence of tags we can easily derive the stem, the lemma and the part of speech (PoS) of the word. We show (i) that the quality of this approach is comparable to state-of-the-art methods and (ii) that we can improve the results of PoS tagging when we include the morphological analysis of each word.
Automatic classification of scientific records using the German Subject Heading Authority File (SWD)
(2012)
The following paper deals with an automatic text classification method which does not require training documents. For this method, the German Subject Heading Authority File (SWD), provided by the linked data service of the German National Library, is used. Recently the SWD was enriched with notations of the Dewey Decimal Classification (DDC). As a consequence, it became possible to use the subject headings as textual representations of the DDC notations. Basically, we derive the classification of a text from the classification of the words in the text given by the thesaurus. The method was tested by classifying 3826 OAI records from 7 different repositories. Mean reciprocal rank and recall were chosen as evaluation measures. Direct comparison to a machine learning method has shown that this method is definitely competitive. Thus we can conclude that the enriched version of the SWD provides high-quality information with broad coverage for the classification of German scientific articles.
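The idea of deriving a text's class from the classes of its words, and of scoring the result with mean reciprocal rank, can be sketched as follows. This is only a minimal illustration under stated assumptions: the tiny thesaurus, its notations and the simple vote counting are hypothetical stand-ins and are not taken from the SWD or from the paper's actual method.

```python
from collections import Counter

# Hypothetical toy thesaurus: subject heading -> notations.
# These entries are invented for illustration, not real SWD data.
TOY_THESAURUS = {
    "bibliothek": ["020"],
    "klassifikation": ["025"],
    "katalog": ["025"],
    "algorithmus": ["004"],
}


def rank_classes(text, thesaurus):
    """Rank notations by the number of words in the text that map to them."""
    votes = Counter()
    for token in text.lower().split():
        for notation in thesaurus.get(token.strip(".,;:"), []):
            votes[notation] += 1
    # Sort by vote count (descending), then by notation for a stable order.
    ordered = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [notation for notation, _ in ordered]


def mean_reciprocal_rank(rankings, gold):
    """Average of 1/rank of the correct class; 0 if it is not in the ranking."""
    total = 0.0
    for ranking, correct in zip(rankings, gold):
        if correct in ranking:
            total += 1.0 / (ranking.index(correct) + 1)
    return total / len(gold)


ranking = rank_classes("Klassifikation und Katalog in der Bibliothek", TOY_THESAURUS)
```

No training documents are needed in this sketch: all knowledge about classes comes from the thesaurus mapping.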
The ProFormA task format was introduced to enable the exchange of programming exercises between arbitrary autograders (graders). An autograder sequentially executes the “tests” specified in the ProFormA task format in order to check a program submitted by a student. There is currently no cross-grader standard for structuring and presenting the test results. We propose extending the ProFormA task format with a hierarchy of grading aspects that is grouped by didactic aspects and references the corresponding test executions. The extension has been implemented in Graja, an autograder for Java programs. Depending on the desired level of detail of the grading aspects, test executions have to be broken down into partial executions. We illustrate our proposal with the test tools compiler, dynamic software test and static analysis, as well as with the use of human graders.
Regional Innovation Systems describe the relations between actors, structures and infrastructures in a region in order to stimulate innovation and regional development. For these systems the collection and organization of information is crucial. In the present paper we investigate the possibilities to extract information from websites of companies. First we describe regional innovation systems and the information types that are necessary to create them. Then we discuss the possibilities of text mining and keyword extraction techniques to extract this information from company websites. Finally, we describe a small scale experiment in which keywords related to economic sectors and commodities are extracted from the websites of over 200 companies. This experiment shows what the main challenges are for information extraction from websites for regional innovation systems.
The number of papers published each year has been increasing for decades. Libraries need to make these resources accessible and available, and classification is an important part of this process. This paper analyzes prerequisites and possibilities of automatic classification of medical literature. We explain the selection, preprocessing and analysis of data consisting of catalogue datasets from the library of the Hanover Medical School, Lower Saxony, Germany. In the present study, 19,348 documents, represented by notations of library classification systems such as the Dewey Decimal Classification (DDC), were classified into 514 different classes from the National Library of Medicine (NLM) classification system. The algorithm used was k-nearest-neighbours (kNN). A correct classification rate of 55.7% could be achieved. To the best of our knowledge, this is not only the first research conducted towards the use of the NLM classification in automatic classification but also the first approach that exclusively considers already assigned notations from other classification systems for this purpose.
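A kNN classifier over documents that are represented only by their already assigned notations could look roughly like the sketch below. The Jaccard similarity, the value of k, the majority vote and all example notations and class labels are assumptions made for illustration; the abstract does not state which similarity measure or parameters were used.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity of two collections of notations."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def knn_predict(query, training, k=3):
    """Return the majority label among the k most similar training documents.

    training is a list of (notations, label) pairs.
    """
    neighbours = sorted(
        training, key=lambda item: jaccard(query, item[0]), reverse=True
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Toy training data (invented values, not catalogue records).
TRAINING = [
    ({"610", "616"}, "WB"),
    ({"616", "617"}, "WB"),
    ({"150"}, "WM"),
    ({"150", "610"}, "WM"),
]

predicted = knn_predict({"616"}, TRAINING, k=3)
```

Set-based similarity is a natural fit here because each document is described by a small, unordered set of notations rather than by running text.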
The CogALex-V Shared Task provides two datasets that consist of pairs of words along with a classification of their semantic relation. The dataset for the first task distinguishes only between related and unrelated, while the second dataset distinguishes several types of semantic relations. A number of recent papers propose to construct a feature vector that represents a pair of words by applying a simple pairwise operation to all elements of the two words' feature vectors. Subsequently, the pairs can be classified by training any classification algorithm on these vectors. In the present paper we apply this method to the provided datasets. We see that the results are not better than those of the given simple baseline. We conclude that the results of the investigated method are strongly dependent on the type of data to which it is applied.
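Constructing a pair representation from two word vectors by a simple element-wise operation might be sketched as follows. The three operations shown (difference, product, concatenation) are common choices assumed here for illustration; the abstract does not say which operations were evaluated, and the example vectors are made up.

```python
def pair_features(u, v, op="diff"):
    """Combine two equal-length word vectors into one feature vector for the pair."""
    if len(u) != len(v):
        raise ValueError("word vectors must have the same length")
    if op == "diff":
        # Element-wise difference u - v.
        return [a - b for a, b in zip(u, v)]
    if op == "mult":
        # Element-wise product.
        return [a * b for a, b in zip(u, v)]
    if op == "concat":
        # Concatenation keeps both vectors unchanged, doubling the length.
        return list(u) + list(v)
    raise ValueError(f"unknown operation: {op}")


# Made-up two-dimensional vectors for a word pair.
features = pair_features([1.0, 2.0], [0.5, 4.0], op="diff")
```

The resulting vectors can then be passed, together with the relation labels, to any standard classifier.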
A new FOSS (free and open source software) toolchain and associated workflow is being developed in the context of NFDI4Culture, a German consortium of research- and cultural heritage institutions working towards a shared infrastructure for research data that meets the needs of 21st century data creators, maintainers and end users across the broad spectrum of the digital libraries and archives field, and the digital humanities. This short paper and demo present how the integrated toolchain connects: 1) OpenRefine - for data reconciliation and batch upload; 2) Wikibase - for linked open data (LOD) storage; and 3) Kompakkt - for rendering and annotating 3D models. The presentation is aimed at librarians, digital curators and data managers interested in learning how to manage research datasets containing 3D media, and how to make them available within an open data environment with 3D-rendering and collaborative annotation features.
For the analysis of contract texts, validated model texts, such as model clauses, can be used to identify the clauses used in a contract. This paper investigates how the similarity between titles of model clauses and headings extracted from contracts can be computed, and which similarity measure is most suitable for this. To calculate the similarities between title pairs, we tested several variants of string similarity and token-based similarity. We also compare two additional semantic similarity measures based on word embeddings, one using pre-trained embeddings and one using word embeddings trained on contract texts. The identification of the model clause title can be used as a starting point for mapping clauses found in contracts to verified clauses.
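The difference between a string similarity and a token-based similarity for title pairs can be illustrated with a small sketch. The two measures chosen here (the ratio from Python's `difflib.SequenceMatcher` and Jaccard overlap of lower-cased tokens) and the example titles are assumptions for illustration, not necessarily the variants tested in the paper.

```python
from difflib import SequenceMatcher


def string_similarity(a, b):
    """Character-level similarity in [0, 1], ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def token_jaccard(a, b):
    """Token-level Jaccard overlap in [0, 1], ignoring case and word order."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def best_match(heading, model_titles, measure=string_similarity):
    """Return the model clause title most similar to a contract heading."""
    return max(model_titles, key=lambda title: measure(heading, title))


# Hypothetical model clause titles.
MODEL_TITLES = ["Confidentiality", "Limitation of Liability", "Governing Law"]

match = best_match("Limitation of liability", MODEL_TITLES)
```

Token-based measures are insensitive to word order, so "Liability limitation" still overlaps strongly with "Limitation of Liability", whereas a character-level measure penalises the reordering.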
Building a well-founded understanding of the concepts, tasks and limitations of IT in all areas of society is an essential prerequisite for future developments in business and research. This applies in particular to the healthcare sector and medical research, which are affected by the noticeable advances in digitization. In the transfer project “Zukunftslabor Gesundheit” (ZLG), a teaching framework was developed to support the development of further education online courses in order to teach heterogeneous groups of learners independent of location and prior knowledge. The study at hand describes the development and components of the framework.
In designing and developing the BID degree programmes, deriving and developing realistic planning data was, alongside considerations of content and study organisation, one of the main tasks of the BID pilot project (Modellversuch BID) and an essential prerequisite for putting the programmes into practice successfully. This contribution focuses primarily on these planning results and their implementation.
Discovery and efficient reuse of technology pictures using Wikimedia infrastructures. A proposal
(2016)
Multimedia objects, especially images and figures, are essential for the visualization and interpretation of research findings. The distribution and reuse of these scientific objects is significantly improved under open access conditions, for instance in Wikipedia articles, in research literature, as well as in education and knowledge dissemination, where licensing of images often represents a serious barrier.
Whereas scientific publications are retrievable through library portals or other online search services thanks to standardized indices, there is not yet any targeted retrieval of and access to the accompanying images and figures. Consequently, there is a great demand for standardized indexing methods for these multimedia open access objects in order to improve the accessibility of this material.
With our proposal, we hope to serve a broad audience which looks up a scientific or technical term in a web search portal first. Until now, this audience has little chance to find an openly accessible and reusable image narrowly matching their search term on first try - frustratingly so, even if there is in fact such an image included in some open access article.
Editorial for the 15th European Networked Knowledge Organization Systems Workshop (NKOS 2016)
(2016)
Knowledge Organization Systems (KOS), in the form of classification systems, thesauri, lexical databases, ontologies, and taxonomies, play a crucial role in digital information management and applications generally. Carrying semantics in a well-controlled and documented way, Knowledge Organization Systems serve a variety of important functions: tools for representation and indexing of information and documents, knowledge-based support to information searchers, semantic road maps to domains and disciplines, communication tools that provide a conceptual framework, and a conceptual basis for knowledge-based systems, e.g. automated classification systems. New networked KOS (NKOS) services and applications are emerging, and we have reached a stage where many KOS standards exist and the integration of linked services is no longer just a future scenario. This editorial describes the workshop outline and gives an overview of the papers presented at the 15th European Networked Knowledge Organization Systems Workshop (NKOS 2016) in Hannover, Germany.
Editorial for the 17th European Networked Knowledge Organization Systems Workshop (NKOS 2017)
(2017)
Knowledge Organization Systems (KOS), in the form of classification systems, thesauri, lexical databases, ontologies, and taxonomies, play a crucial role in digital information management and applications generally. Carrying semantics in a well-controlled and documented way, Knowledge Organization Systems serve a variety of important functions: tools for representation and indexing of information and documents, knowledge-based support to information searchers, semantic road maps to domains and disciplines, communication tools that provide a conceptual framework, and a conceptual basis for knowledge-based systems, e.g. automated classification systems. New networked KOS (NKOS) services and applications are emerging, and we have reached a stage where many KOS standards exist and the integration of linked services is no longer just a future scenario. This editorial describes the workshop outline and gives an overview of the papers presented at the 17th European Networked Knowledge Organization Systems Workshop (NKOS 2017), which was held during the TPDL 2017 Conference in Thessaloniki, Greece.
An approach to the object-oriented modelling, simulation and animation of information systems is presented. A process model is described with which, using this approach, requirements or system specifications of computer-based information systems can be created. The approach is based on a metamodel for describing computer-based information systems and comes with a computer-based modelling environment. The use of the method is illustrated by a project to develop a requirements specification for a computer-based nursing documentation and communication system.
This paper presents a way to extend the formalism of linear indexed grammars (LIGs). The extension is based on the use of tuples of pushdowns instead of a single pushdown to store indices during a derivation. If a restriction on the accessibility of the pushdowns is used, it can be shown that the resulting formalisms give rise to a hierarchy of languages that is equivalent to a hierarchy defined by Weir. For this equivalence, which was already known for a slightly different formalism, this paper gives a new proof. Since all languages of Weir's hierarchy are known to be mildly context sensitive, the proposed extensions of LIGs become comparable with extensions of tree adjoining grammars and head grammars.
Generalised legal documents in which the positions of a contract's individual variations in the text are known can be used, first, to provide automated support for the approval process of new contracts and, second, to serve as a contract generator that offers new legal documents in preselected form. Using known legal texts, this contribution shows how formulaic text passages can be identified and frequent individual variations classified so that they can be used as model passages. Areas of application are presented and the existing potential for legal tech applications is pointed out.
“Grappa” is a middleware specialised in connecting various autograders to various e-learning frontends, that is, learning management systems (LMS). A prototype has been in use for several semesters at Hochschule Hannover with the LMS “moodle” and the backend “aSQLg” and is evaluated regularly. This contribution presents the current state of development of Grappa after various new and further developments. After a report on recent experience with this combination of systems, we present key new features of the moodle plugins that are used to control Grappa from within moodle. We then present an extension of the previous architecture in the form of a newly developed Grappa PHP client for connecting LMS more efficiently. We also report on the integration of a further autograder, “Graja”, for programming exercises in Java. The report shows that important steps have already been taken towards a uniform presentation of automated program assessment to students in LMS with different autograders. Practical experience also shows, however, that further development work is still required, both on each system component individually and on their interplay via Grappa, in order to further increase acceptance and use among students and teaching staff.