A Strategy for Anonymizing Free-Text Medical Reports Using LLM-AIx

  • Developing a decision support system for pediatric cardiology case conferences requires anonymizing 4,000 free-text medical case reports. This paper presents an anonymization strategy using LLM-AIx, a tool for structured information extraction based on large language models (LLMs). The three-step process comprises automatic extraction of personally identifiable information (PII) from the reports, evaluation of the results against a manually annotated ground truth, and replacement of the identified PII with surrogate values, including controlled date shifting. Initial tests with six example reports revealed challenges in handling multiple occurrences of the same attribute and in replacing them consistently. Future work will focus on implementing the full pipeline and mapping clinical information to standardized terminologies such as SNOMED CT.
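
The evaluation and replacement steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline and does not call LLM-AIx; it assumes the extraction step has already produced a list of PII items. All function names, the surrogate pool, and the DD.MM.YYYY date format are hypothetical choices made for illustration only.

```python
import re
from datetime import datetime, timedelta

# Hypothetical surrogate pool; a real pipeline would need a much larger one.
SURROGATE_NAMES = ["Alex Example", "Kim Sample", "Robin Placeholder"]

# Assumption: dates appear as DD.MM.YYYY in the reports.
DATE_PATTERN = re.compile(r"\b\d{2}\.\d{2}\.\d{4}\b")


def score(predicted, gold):
    """Precision, recall and F1 of extracted PII items against a manual annotation."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def shift_dates(text, offset_days):
    """Shift every date by the same offset so intervals between events are kept."""
    def _shift(match):
        try:
            original = datetime.strptime(match.group(0), "%d.%m.%Y")
        except ValueError:
            return match.group(0)  # not a valid calendar date, leave untouched
        return (original + timedelta(days=offset_days)).strftime("%d.%m.%Y")
    return DATE_PATTERN.sub(_shift, text)


def replace_names(text, names, surrogates=SURROGATE_NAMES):
    """Replace each distinct name with one surrogate, reused for every occurrence."""
    mapping = {}
    for name in names:
        if name not in mapping:
            # With more names than surrogates, surrogates would repeat.
            mapping[name] = surrogates[len(mapping) % len(surrogates)]
    if not mapping:
        return text, mapping
    # Single pass, longest names first, so a replacement is never replaced again.
    ordered = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(name) for name in ordered))
    return pattern.sub(lambda m: mapping[m.group(0)], text), mapping


report = "Max Mustermann was admitted on 03.02.2021. Follow-up for Max Mustermann on 10.02.2021."
anonymized, name_map = replace_names(report, ["Max Mustermann"])
anonymized = shift_dates(anonymized, offset_days=14)
print(anonymized)
```

Keeping one surrogate per distinct name and one date offset per report is one way to approach the consistency problem mentioned in the abstract: repeated mentions stay recognizable as the same entity, and the time between two events is unchanged.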

Metadata
Author: Darian Liehr, Theodor Uden, Christian Wartena, Volker Ahlers, Steffen Oeltze-Jafra, Michael Marschollek, Philipp Beerbaum, Oliver J. Bott
URN: urn:nbn:de:bsz:960-opus4-37819
DOI: https://doi.org/10.25968/opus-3781
Parent Title (German): KI-Forum 2025 : KI in Forschung und Lehre an Hochschulen
Publisher: HsH Applied Academics
Place of publication: Hannover
Editor: Hanno Homann, Cedric Rohbani, Jens Christian Will
Document Type: Conference Proceeding
Language: English
Year of Completion: 2025
Publishing Institution: Hochschule Hannover
Release Date: 2025/12/10
Tag: Anonymization; Case Conference; Large Language Model; Pediatric Cardiology; Personally Identifiable Information
GND Keyword: Großes Sprachmodell; Anonymisierung; Kardiologie
Page Number: 3
First Page: 62
Last Page: 64
Link to catalogue: 1970959789
Institutes: Fakultät III - Medien, Information und Design
Fakultät IV - Wirtschaft und Informatik
Other institutions
Data|H - Institute for Applied Data Science Hannover
DDC classes: 370 Education
004 Computer science
610 Medicine and health
Licence (German): Creative Commons - CC BY - Namensnennung 4.0 International