Medical Informatics, Statistics and Documentation

Research area: Semantics and ontologies in medicine

PI: Stefan Schulz

Focus: The group works on the semantic modelling of data from science and clinical practice. Two paths are pursued: the construction of symbolic knowledge by experts on the one hand, and automatic knowledge acquisition using machine learning on the other. In the former case, data are standardized by terminologies, ontologies and information models; in the latter, semantics is expressed by probabilistic and neural models. Most of the data are available only as text, which explains the group's emphasis on text mining methods. The standardized data extracts obtained support document search, data analysis and clinical decision-making.

Networking: The team's most important current cooperation partners are KAGes and CBmed, Graz; ELGA GmbH, Vienna; and Roche Diagnostics (Basel and Belmont). In Germany, there are close contacts with the text mining company Averbis GmbH, Freiburg; EMPIRICA GmbH, Bonn; the University of Freiburg; TU Munich; the University of Jena; the Charité Berlin; and DFKI Saarbrücken. Global contacts exist through active participation in the standardization organizations SNOMED International and HL7. Current collaborations with colleagues from the universities of Trondheim, Murcia, Bordeaux, Ljubljana and Buffalo, the Bern University of Applied Sciences and the PUCPR (Brazil) are also noteworthy.

Projects

AIDAVA - AI-powered Data Curation & Publishing Virtual Assistant

  • AIDAVA uses artificial intelligence (AI) to convert patient data of various degrees of structure, particularly from reports and discharge summaries, into a coded form suited for querying. Natural language processing methods and large language models are used. We are primarily involved in the manual annotation of clinical narratives, which are used to train and validate these AI models. Codes from the international ontology-based terminology standard SNOMED CT represent all information contained in these texts. The application of this standard is supported by a comprehensive annotation guideline developed by us. The target format for representing patient-specific information is a so-called knowledge graph, which can be used to query clinical information for care and research in a standardised way. AIDAVA's clinical application domains are breast cancer and ischemic heart disease.
  • Period: 2022-2026
  • Funded by: European Commission
  • Project partners: b!loba, KU Leuven, The European Institute for Innovation through Health Data, European Cancer Patient Coalition, European Heart Network AISBL, ONTO - Sirma AI EAD, NEMC - Sihtasutus Põhja-Eesti Regionaalhaigla, Averbis GmbH, European Research and Project Office GmbH, UM - Maastricht University, Egnosis by Gnome Design Srl, MIDATA Cooperative, Digi.me Ltda

GeMTeX - German Medical Text Corpus

  • We are an external partner of GeMTeX (German Medical Text Corpus), a project that aims to make clinical narratives usable for research projects and thus create the largest collection of clinical summaries in the German language. Since its kick-off in June 2023, texts have been collected at six clinical sites and manually annotated by trained assistants. The resulting data serve as a reference for improving automatic annotation and are used for analyses and for training statistical models. GeMTeX uses the infrastructure of the German Medical Informatics Initiative (MII) to systematically enrich clinical documents and make them available in anonymised form. Our contribution to date has been to share the text annotation principles, practices, and experiences from AIDAVA (see above). The AIDAVA annotation guide is used as a reference for GeMTeX's own annotation strategy.
  • Period: 2023-2026
  • Project partners: Charité – University Hospital Berlin, ID GmbH & Co. KGaA, Technical University of Darmstadt, Dresden University of Technology, University Hospital Erlangen, University Hospital Essen, Averbis GmbH, Hannover Medical School, Heidelberg University Hospital, German National Library of Medicine (ZB MED), Leipzig University, University of Leipzig Medical Center, Ludwig Maximilian University of Munich, Technical University of Munich, University of Münster, Hasso Plattner Institute for Digital Engineering gGmbH, Tübingen University Hospital

SNOMED CT Localisation

  • The international ontology-based terminology standard SNOMED CT has long been a field of activity within our focus on biomedical semantics. We provide international advice in this area, e.g., in the German Translation Group and the Modeling Advisory Group of SNOMED International. With the German Interface Terminology, we provide a large indexing vocabulary in German for SNOMED CT, which has successfully been used for text-mining tasks.
  • Period: 2015-2026
  • Cooperations: Averbis GmbH, Freiburg (Germany) and ELGA GmbH, Vienna

Principal Investigator

Stefan Schulz 
T: +43 316 385 16939