Refine
Year of publication
- 2018 (4) (remove)
Document Type
- Conference Proceeding (4) (remove)
Language
- English (4)
Has Fulltext
- yes (4)
Is part of the Bibliography
- no (4)
Keywords
- Abbreviations (1)
- Abkürzung (1)
- Acronyms (1)
- Akronym (1)
- Ambiguität (1)
- Bildersuchmaschine (1)
- Disambiguation (1)
- Image Retrieval (1)
- Open Access (1)
- Scientific image search (1)
Institute
- Fakultät III - Medien, Information und Design (4) (remove)
NOA is a search engine for scientific images from open access publications based on full text indexing of all text referring to the images and filtering for disciplines and image type. Images will be annotated with Wikipedia categories for better discoverability and for uploading to WikiCommons. Currently we have indexed approximately 2,7 Million images from over 710 000 scientific papers from all fields of science.
Scientific papers from all disciplines contain many abbreviations and acronyms. In many cases these acronyms are ambiguous. We present a method to choose the contextual correct definition of an acronym that does not require training for each acronym and thus can be applied to a large number of different acronyms with only few instances. We constructed a set of 19,954 examples of 4,365 ambiguous acronyms from image captions in scientific papers along with their contextually correct definition from different domains. We learn word embeddings for all words in the corpus and compare the averaged context vector of the words in the expansion of an acronym with the weighted average vector of the words in the context of the acronym. We show that this method clearly outperforms (classical) cosine similarity. Furthermore, we show that word embeddings learned from a 1 billion word corpus of scientific exts outperform word embeddings learned from much larger general corpora.
The reuse of scientific raw data is a key demand of Open Science. In the project NOA we foster reuse of scientific images by collecting and uploading them to Wikimedia Commons. In this paper we present a text-based annotation method that proposes Wikipedia categories for open access images. The assigned categories can be used for image retrieval or to upload images to Wikimedia Commons. The annotation basically consists of two phases: extracting salient keywords and mapping these keywords to categories. The results are evaluated on a small record of open access images that were manually annotated.
This paper deals with new job profiles in libraries, mainly systems librarians (German: Systembibliothekare), IT librarians (German: IT-Bibliothekare) and data librarians (German: Datenbibliothekare). It investigates the vacancies and requirements of these positions in the German-speaking countries by analyzing one hundred and fifty published job advertisements of OpenBiblioJobs between 2012-2016. In addition, the distribution of positions, institutional bearers, different job titles as well as time limits, scope of work and remuneration of the positions are evaluated. The analysis of the remuneration in the public sector in Germany also provides information on demands for a bachelor's or master's degree.
The average annual increase in job vacancies between 2012 and 2016 is 14.19%, confirming the need and necessity of these professional library profiles.
The higher remuneration of the positions in data management, in comparison to the systems librarian, proves the prerequisite of the master's degree and thus indicates a desideratum due to missing or few master's degree courses. Accordingly, the range of bachelor's degree courses (or IT-oriented major areas of study with optional compulsory modules in existing bachelor's degree courses) for systems and IT librarians must be further expanded. An alternative could also be modular education programs for librarians and information scientists with professional experience, as it is already the case for music librarians.