Representing Standard Text Formulations as Directed Graphs

  • To ensure validity in legal texts such as contracts and case law, lawyers rely on standardised formulations that are carefully worded and also act as a kind of code whose meaning and function are known to all legal experts. Using directed (acyclic) graphs to represent standardised text fragments, we are able to capture variations in time specifications, slight rephrasings, names and places, as well as OCR errors. We show how such text fragments can be found by sentence clustering, pattern detection and pattern clustering. To test the proposed methods, we use two corpora of German contracts and court decisions, compiled specifically for this purpose. The entire process for representing standardised text fragments is, however, language-agnostic. We analyse and compare both corpora, give a quantitative and qualitative analysis of the text fragments found, and present a number of examples from both corpora.
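
To illustrate the idea of a standard formulation as a directed acyclic graph, here is a minimal sketch. It is not the authors' implementation: the graph `GRAPH`, the `<DATE>` placeholder, the date pattern and the English example sentence are all assumptions made for illustration. Each node carries either a literal token or a placeholder, alternative branches model slight rephrasings, and a sentence matches the formulation if its tokens follow some path from `start` to `end`.

```python
import re

# Hypothetical formulation graph: node id -> (label, successor node ids).
# A label of None marks the artificial start and end nodes.
GRAPH = {
    "start": (None, ["n1"]),
    "n1": ("the", ["n2"]),
    "n2": ("contract", ["n3a", "n3b"]),
    "n3a": ("ends", ["n4"]),        # two alternative branches model
    "n3b": ("terminates", ["n4"]),  # a slight rephrasing
    "n4": ("on", ["n5"]),
    "n5": ("<DATE>", ["end"]),      # placeholder for a time specification
    "end": (None, []),
}

# Assumed date format (e.g. 31.12.2021); a real system would need more patterns.
DATE = re.compile(r"\d{1,2}\.\d{1,2}\.\d{4}")


def label_matches(label, token):
    """Compare a node label with a token; placeholders match by pattern."""
    if label == "<DATE>":
        return DATE.fullmatch(token) is not None
    return label == token.lower()


def matches(graph, tokens, node="start", i=0):
    """Return True if tokens[i:] can be consumed along a path from node to 'end'."""
    label, successors = graph[node]
    if label is not None:
        if i >= len(tokens) or not label_matches(label, tokens[i]):
            return False
        i += 1
    if node == "end":
        return i == len(tokens)
    # The graph is acyclic, so this recursion always terminates.
    return any(matches(graph, tokens, nxt, i) for nxt in successors)
```

Calling `matches(GRAPH, "The contract terminates on 1.1.2022".split())` follows the second branch and accepts the sentence, while a sentence without a date in the assumed format is rejected. Tolerance for OCR errors is not covered here; it could be approximated by replacing the exact comparison in `label_matches` with a fuzzy one.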

Author:Frieda Josi, Christian Wartena, Ulrich Heid
DOI original:https://doi.org/10.1007/978-3-030-86159-9_34
Parent Title (English):Document Analysis and Recognition – ICDAR 2021 Workshops. ICDAR 2021. Lecture Notes in Computer Science, vol. 12917
Place of publication:Cham
Editor:Elisa H. Barney Smith, Umapada Pal
Document Type:Conference Proceeding
Year of Completion:2021
Publishing Institution:Hochschule Hannover
Release Date:2021/09/13
Tag:Graph-based Text Representations; Legal Writings; Standardised formulation
GND Keyword:Azyklischer gerichteter Graph; Sprachnorm; Sachtext; Rechtswissenschaften
First Page:475
Last Page:487
Link to catalogue:177823822X
Institutes:Fakultät III - Medien, Information und Design
DDC classes:020 Library and information sciences
Licence (German):Urheberrechtlich geschützt (protected by copyright)