Toward a service-based workflow for automated information extraction from herbarium specimens

Over the past years, herbarium collections worldwide have started to digitize millions of specimens on an industrial scale. Although imaging costs are steadily falling, capturing the accompanying label information is still predominantly done manually and is becoming the principal cost factor. To streamline the capture of herbarium specimen metadata, we specified a formal, extensible workflow that integrates a wide range of automated specimen image analysis services. We implemented the workflow on the basis of OpenRefine together with a plugin for handling service calls and responses. The evolving system presently covers the generation of optical character recognition (OCR) output from specimen images, the identification of regions of interest in images and the extraction of meaningful information items from the OCR text. These implementations were developed as part of the Deutsche Forschungsgemeinschaft-funded project "A standardised and optimised process for data acquisition from digital images of herbarium specimens" (StanDAP-Herb).

Metadata
Author:Agnes Kirchhoff, Ulrich Bügel, Eduard Santamaria, Fabian Reimeier, Dominik Röpert, Alexander Tebbje, Anton Güntsch, Fernando Chaves, Karl-Heinz Steinke, Walter Berendsohn
URN:urn:nbn:de:bsz:960-opus4-12860
DOI:https://doi.org/10.25968/opus-1286
DOI original:https://doi.org/10.1093/database/bay103
ISSN:1758-0463
Parent Title (English):Database
Document Type:Article
Language:English
Year of Completion:2018
Publishing Institution:Hochschule Hannover
Release Date:2019/01/09
GND Keyword:Image analysis; Optical character recognition; Metadata; Herbarium
Volume:2018
First Page:1
Last Page:11
Link to catalogue:1695841980
Institutes:Fakultät I - Elektro- und Informationstechnik
DDC classes:020 Library and information sciences
Licence (German):Creative Commons - CC BY - Attribution 4.0 International