004 Computer Science
Training and evaluating deep learning models on road graphs for traffic prediction using SUMO
(2024)
The escalation of traffic volume in urban areas poses multifaceted challenges including increased accident risks, congestion, and prolonged travel times. Traditional approaches of expanding road infrastructure face limitations such as space constraints and the potential exacerbation of traffic issues.
Intelligent Transport Systems (ITS) present an alternative strategy to alleviate traffic problems by leveraging data-driven solutions. Central to ITS is traffic prediction, a process vital for applications like Traffic Management and Navigation Systems.
Traffic prediction has recently attracted a surge of interest, particularly deep learning methods optimized for graph-based data, which are currently considered the most promising avenue.
These methods typically rely on real-life datasets containing traffic sensor data such as METR-LA and PeMS. However, the finite nature of real-life data prompts exploration into augmenting training and testing datasets with simulated traffic data.
This thesis explores the potential of utilizing traffic simulations, employing the microscopic traffic simulator SUMO, to train and test deep learning models for traffic prediction. A framework integrating PyTorch and SUMO is proposed for this purpose, aiming to elucidate the feasibility and effectiveness of using simulated traffic data for enhancing predictive models in traffic management systems.
The Logical Observation Identifiers, Names and Codes (LOINC) is a common terminology used for standardizing laboratory terms. Within the consortium of the HiGHmed project, LOINC is one of the central terminologies used for health data sharing across all university sites. Therefore, linking the LOINC codes to the site-specific tests and measures is one crucial step to reach this goal. In this work we report our ongoing efforts in implementing LOINC in our laboratory information system and research infrastructure, as well as our challenges and the lessons learned. 407 local terms could be mapped to 376 LOINC codes, of which 209 are already available for routine laboratory data. In our experience, mapping local terms to LOINC is a largely manual and time-consuming process because it depends on language and on expert knowledge of local laboratory procedures.
The German Corona Consensus (GECCO) established a uniform dataset in FHIR format for exchanging and sharing interoperable COVID-19 patient-specific data between health information systems (HIS) for universities. For sharing the COVID-19 information with other locations that use openEHR, the data must be converted into FHIR format. In this paper, we introduce our solution, a web tool named “openEHR-to-FHIR” that converts compositions from an openEHR repository and stores them in their respective GECCO FHIR profiles. The tool provides a REST web service for ad hoc conversion of openEHR compositions to FHIR profiles.
Renewable energy production is one of the fastest-growing markets, and further strong growth can be anticipated due to the desire for increased sustainability in many parts of the world. With the rising adoption of renewable power production, such facilities are increasingly attractive targets for cyber attacks. At the same time, the requirements for reliable production are rising. In this paper we propose a concept that improves monitoring of renewable power plants by detecting anomalous behavior. The system not only detects an anomaly, it also provides reasoning for the anomaly based on a specific mathematical model of the expected behavior by giving detailed information about the various influential factors causing the alert. The set of influential factors can be configured into the system before it learns normal behavior. The concept is based on multidimensional analysis and has been implemented and successfully evaluated on actual data from different providers of wind power plants.
Harmonisation of German Health Care Data Using the OMOP Common Data Model – A Practice Report
(2023)
Data harmonization is an important step in large-scale data analysis and for generating evidence on real-world data in healthcare. With the OMOP common data model, a relevant instrument for data harmonization is available that is being promoted by different networks and communities. At the Hannover Medical School (MHH) in Germany, an Enterprise Clinical Research Data Warehouse (ECRDW) has been established, and harmonization of that data source is the focus of this work. We present MHH’s first implementation of the OMOP common data model on top of the ECRDW data source and demonstrate the challenges concerning the mapping of German healthcare terminologies to a standardized format.
Operators of production plants often face the question of which standard to use for protecting the plant against cyberattacks. The ISO 27000 series of standards is well known from the IT domain. In the production domain, the IEC 62443 series is frequently used. This article gives an overview of both series of standards and proposes an approach for using the two standards together.
Pathologists need to identify abnormal changes in tissue. With advancing digitalization, tissue slides are increasingly stored digitally. This enables pathologists to annotate the region of interest with the support of software tools. PathoLearn is a web-based learning platform explicitly developed for the teacher-student scenario, where the goal is for students to learn to identify potential abnormal changes. Artificial intelligence (AI) and machine learning (ML) have become very important in medicine. Many health sectors already utilize AI and ML. This will only increase in the future, including in the field of pathology. Therefore, it is important to teach students the fundamentals and concepts of AI and ML early in their studies. However, creating and training AI generally requires knowledge of programming and technical details. This thesis evaluates how this barrier can be overcome by comparing existing end-to-end AI platforms and teaching tools for AI. It was shown that a visual programming editor offers a fitting abstraction for creating neural networks without programming. This was extended with real-time collaboration to enable students to work in groups. Additionally, an automatic training feature was implemented, removing the need to know technical details about training neural networks.
After kidney transplantation, graft rejection must be prevented. Therefore, a multitude of patient parameters is observed pre- and postoperatively. To support this process, the Screen Reject research project is developing a data warehouse optimized for kidney rejection diagnostics. In the course of this project it was discovered that important information is only available in the form of free text instead of structured data and therefore cannot be processed by standard ETL tools, although such processing is necessary to establish a digital expert system for rejection diagnostics. For this reason, data integration has been improved by a combination of methods from natural language processing and methods from image processing. Based on state-of-the-art data warehousing technologies (Microsoft SSIS), a generic data integration tool has been developed. The tool was evaluated by extracting Banff classifications from 218 pathology reports and extracting HLA mismatches from about 1700 PDF files, both written in German.
On November 30th, 2022, OpenAI released the large language model ChatGPT, an extension of GPT-3. The AI chatbot provides real-time communication in response to users’ requests. The quality of ChatGPT’s natural-sounding answers marks a major shift in how we will use AI-generated information in our day-to-day lives. For a software engineering student, the use cases for ChatGPT are manifold: assessment preparation, translation, and creation of specified source code, to name a few. It can even handle more complex aspects of scientific writing, such as summarizing literature and paraphrasing text. Hence, this position paper addresses the need for discussion of potential approaches for integrating ChatGPT into higher education. To this end, we focus on articles that address the effects of ChatGPT on higher education in the areas of software engineering and scientific writing. As ChatGPT was only recently released, there have been no peer-reviewed articles on the subject. Thus, we performed a structured grey literature review using Google Scholar to identify preprints of primary studies. In total, five out of 55 preprints are used for our analysis. Furthermore, we held informal discussions and talks with other lecturers and researchers and took into account the authors’ test results from using ChatGPT. We present five challenges and three opportunities for the higher education context that emerge from the release of ChatGPT. The main contribution of this paper is a proposal for how to integrate ChatGPT into higher education in four main areas.
We present an approach towards a data acquisition system for digital twins that uses a 5G network for data transmission and localization. The current hardware setup, which utilizes stereo vision and LiDAR for 3D mapping, is explained together with two recorded point cloud data sets. Furthermore, a resulting digital twin composed of voxelized point cloud data is shown. Ideas for future applications and challenges regarding the system are discussed and an outlook on further development is given.
Bluetooth is a widely used wireless transmission protocol found in many mobile devices such as tablets, headphones, and smartwatches. Bluetooth-enabled devices send public advertisements several times per minute, which contain, among other things, the device's unique MAC address. Capturing these advertisements with Bluetooth loggers makes it possible to analyze the movements of the devices and thus to draw conclusions about the movements of their owners.
To protect privacy, randomly generated MAC addresses have been used in advertisements since 2014. A so-called randomized MAC address remains valid for 15 minutes on average and is then replaced by a new random address. The presence of a device at a later point in time cannot be determined. Nevertheless, a device's transition from one Bluetooth logger to another within these 15 minutes can be detected, and a movement of the device can thus be inferred.
Because of contact-tracing apps such as the Corona-Warn-App (CWA), even supposedly inactive smartphones send Bluetooth advertisements. Accounting for about a quarter of the recordings, the CWA supports the analyses of this experimental work.
To demonstrate practical applicability, the Erlebniszoo Hannover was used as a test site. Evaluating the data collected over seven weeks made it possible to analyze peak times, heavily visited locations, and visitor flows.
This thesis develops a guideline for setting up an information security management system in small and medium-sized enterprises (SMEs). An information security management system (ISMS) is essentially an ordered collection of procedures, rules, and measures for safeguarding the security of information. Such a system decisively improves the steering and control of information security through structured procedures.
The first chapters of this bachelor's thesis examine the relevant standards and guidelines (in particular from the DIN EN ISO/IEC 27000 family of standards and from the VdS 10000 and VdS 10005 guidelines). On this basis, the subsequent chapters explain the basic structure of an ISMS for companies. This topic is then applied to SMEs. The thesis goes on to describe the development of the guideline, also addressing the general structure of a guideline.
The actual guideline is available for download on this page as a separate, standalone document. It can be used independently of this thesis.
This thesis investigates the application of machine learning to recognizing ship activities from AIS signals. The Automatic Identification System (AIS) is used by ships to transmit information about their status at regular intervals. Based on these data, supervised classification algorithms were used to learn models that can recognize which activity a ship is engaged in at a given point in time.
Because successfully learning a model depends on careful data preparation, several data preparation methods were used. Various algorithms, including Random Forest and k-NN, were then applied to learn models.
The results show that the activities could be recognized with an accuracy of up to 99% when suitable methods were chosen during data preparation.
In recent years, generative models have gained considerable public attention due to the high quality of their generated images. In short, generative models learn a distribution from a finite number of samples and are then able to generate an unlimited number of new samples. This can be applied to image data. In the past, generative models were not able to generate realistic images, but nowadays the results are almost indistinguishable from real images.
This work provides a comparative study of three generative models: Variational Autoencoder (VAE), Generative Adversarial Network (GAN) and Diffusion Models (DM). The goal is not to provide a definitive ranking indicating which one of them is the best, but to decide qualitatively and, where possible, quantitatively which model is good with respect to a given criterion. Such criteria include realism, generalization and diversity, sampling, training difficulty, parameter efficiency, interpolating and inpainting capabilities, semantic editing as well as implementation difficulty. After a brief introduction to how each model works internally, the models are compared against each other. The provided images help to show the differences among the models with respect to each criterion.
To give a short preview of the results of the comparison of the three models, DMs generate the most realistic images. They seem to generalize best and show high variation among the generated images. However, they are based on an iterative process, which makes them the slowest of the three models in terms of sample generation time. GANs and VAEs, on the other hand, generate their samples in a single forward pass. The images generated by GANs are comparable to those of DMs, whereas the images from VAEs are blurry, which makes them less desirable in comparison to GANs or DMs. However, both the VAE and the GAN stand out from DMs with respect to interpolation and semantic editing, as they have a latent space, which makes latent space walks possible, and the changes are not as chaotic as in the case of DMs. Furthermore, concept vectors can be found that transform a given image along a given feature while leaving other features and structures mostly unchanged, which is difficult to achieve with DMs.
There are many aspects of code quality, some of which are difficult to capture or to measure. Despite the importance of software quality, there is a lack of commonly accepted measures or indicators for code quality that can be linked to quality attributes. We investigate software developers’ perceptions of source code quality and the practices they recommend to achieve these qualities. We analyze data from semi-structured interviews with 34 professional software developers, programming teachers and students from Europe and the U.S. For the interviews, participants were asked to bring code examples to exemplify what they consider good and bad code, respectively. Readability and structure were used most commonly as defining properties for quality code. Together with documentation, they were also suggested as the most common target properties for quality improvement. When discussing actual code, developers focused on structure, comprehensibility and readability as quality properties. When analyzing relationships between properties, the most commonly talked about target property was comprehensibility. Documentation, structure and readability were named most frequently as source properties to achieve good comprehensibility. Some of the most important source code properties contributing to code quality as perceived by developers lack clear definitions and are difficult to capture. More research is therefore necessary to measure the structure, comprehensibility and readability of code in ways that matter for developers and to relate these measures of code structure, comprehensibility and readability to common software quality attributes.
PROFINET Security: A Look on Selected Concepts for Secure Communication in the Automation Domain
(2023)
We provide a brief overview of the cryptographic security extensions for PROFINET, as defined and specified by PROFIBUS & PROFINET International (PI). These come in three hierarchically defined Security Classes, called Security Class 1, 2 and 3. Security Class 1 provides basic security improvements with moderate implementation impact on PROFINET components. Security Classes 2 and 3, in contrast, introduce an integrated cryptographic protection of PROFINET communication. We first highlight and discuss the security features that the PROFINET specification offers for future PROFINET products. Then, as our main focus, we take a closer look at some of the technical challenges that were faced during the conceptualization and design of Security Class 2 and 3 features. In particular, we elaborate on how secure application relations between PROFINET components are established and how a disruption-free availability of a secure communication channel is guaranteed despite the need to refresh cryptographic keys regularly. The authors are members of the PI Working Group CB/PG10 Security.
Context: Higher education is changing at an accelerating pace due to the widespread use of digital teaching and emerging technologies. In particular, AI assistants such as ChatGPT pose significant challenges for higher education institutions because they bring change to several areas, such as learning assessments or learning experiences.
Objective: Our objective is to discuss the impact of AI assistants in the context of higher education, outline possible changes to the context, and present recommendations for adapting to change.
Method: We review related work and develop a conceptual structure that visualizes the role of AI assistants in higher education.
Results: The conceptual structure distinguishes between humans, learning, organization, and disruptor, which guides our discussion regarding the implications of AI assistant usage in higher education. The discussion is based on evidence from related literature.
Conclusion: AI assistants will change the context of higher education in a disruptive manner, and the tipping point for this transformation has already been reached. It is in our hands to shape this transformation.
The trend towards the use of Ethernet in automation networks is ongoing. Due to its high flexibility, speed, and bandwidth, Ethernet nowadays is not only widely used in homes and offices worldwide but is also finding its way into industrial applications. Especially in automation processes, where many field devices send data in relatively short time spans, the requirements for safe and fast data transfer are high. This makes the use of industrial Ethernet essential. A new hardware layer, specifically tailored for industrial applications, has been introduced in the form of Ethernet-APL (‘Advanced Physical Layer’). Ethernet-APL is based on the Ethernet standard, implements two-wire Ethernet-based communication for field devices, and provides data and power over a two-wire cable. Operation in areas with potentially explosive atmospheres is also possible. This enables a modular, fast, and transparent Ethernet network structure throughout the entire plant. However, with the integration of Ethernet-APL into the field, industrial networks will in the future face the challenge of operating at varying data rates at different locations in the network, resulting in a ‘mixed link speed’ network. This can lead to limitations in packet throughput and consequently to potential packet loss of system-relevant data, which must be avoided. Therefore, the purpose of this thesis is to investigate the potential for packet loss in ‘mixed link speed’ networks.