Currently submitted to: Journal of Medical Internet Research

Date Submitted: Jan 29, 2026
Open Peer Review Period: Feb 2, 2026 - Mar 30, 2026
(currently open for review)

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Enhancing Healthcare Interoperability Using Large Language Models: A Generative Proof-of-Concept Framework to Extract Medical Information from Unstructured Clinical Text

  • Bahadır Eryılmaz; 
  • Kamyar Arzideh; 
  • Mikel Bahn; 
  • Hendrik Damm; 
  • Sina Warmer; 
  • Henning Schäfer; 
  • Ahmad Idrissi-Yaghir; 
  • Tabea Pakull; 
  • Lea Jessica Albrecht; 
  • Jens Kleesiek; 
  • Georg Lodde; 
  • Christoph M. Friedrich; 
  • Elisabeth Livingstone; 
  • Dirk Schadendorf; 
  • Katarzyna Borys; 
  • Felix Nensa; 
  • René Hosch

ABSTRACT

Background:

Unstructured clinical text remains a major barrier to interoperable data reuse and large-scale secondary analysis in healthcare. Large language models (LLMs) have the potential to automate the extraction of structured clinical information; however, their application is limited by the scarcity of high-quality annotated training data.

Objective:

-

Methods:

We evaluated an LLM-based pipeline for extracting structured clinical information from cancer-related discharge letters and mapping it to representations compatible with Fast Healthcare Interoperability Resources (FHIR). To enable large-scale supervised training, we developed a random sample generator that creates synthetic discharge letters with Qwen3 235B by randomly sampling and aggregating structured FHIR data from 41,175 cancer patients. The resulting synthetic discharge letters (n=75,000) were paired with their originating structured data, forming a large-scale dataset for fine-tuning MedGemma 27B. Evaluation was conducted on the synthetic test set (n=7,500); on real-world discharge letters (n=30), which were assessed by physicians and a medical student; and against a comparative one-shot approach using open-source models (Qwen3, LLaMA, and GPT-OSS).
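The core idea of the training-data construction is that each synthetic letter is paired with the structured FHIR data it was generated from, so the structured record serves as the extraction target. A minimal sketch of such a pairing step is shown below; the resource fields and the `build_training_pair` helper are illustrative assumptions, not the authors' actual implementation:

```python
import json

def build_training_pair(fhir_resources, synthetic_letter):
    """Pair a generated discharge letter (model input) with its
    originating structured FHIR data (supervision target).

    The target is serialized deterministically so that identical
    resource sets always produce identical training labels.
    """
    return {
        "input": synthetic_letter,
        "target": json.dumps(fhir_resources, sort_keys=True),
    }

# Toy resources standing in for sampled FHIR data (hypothetical values).
resources = [
    {"resourceType": "Condition",
     "code": {"coding": [{"system": "ICD-10", "code": "C43.9"}]}},
    {"resourceType": "MedicationStatement",
     "medication": "Pembrolizumab", "dosage": "200 mg q3w"},
]
letter = ("Discharge letter: malignant melanoma (ICD-10 C43.9), "
          "treated with pembrolizumab 200 mg every 3 weeks.")

pair = build_training_pair(resources, letter)
print("C43.9" in pair["target"])  # → True
```

Collecting many such pairs yields a supervised fine-tuning corpus in which the model learns to invert the generation step: given free text, reproduce the structured record.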

Results:

The fine-tuned model achieved high extraction performance across multiple clinical entities, including full ICD diagnosis codes (F1 = 0.84), tumor-related information (0.99), laboratory values (0.99), medication names and dosages (0.99), and ATC medication codes (0.94). Extraction of procedure-related information was more challenging, with F1 scores of 0.63 for OPS codes and 0.90 for procedure descriptions. In a one-shot comparison, the fine-tuned model outperformed the general-purpose LLMs in nearly all extraction categories. When applied to real-world discharge letters, performance remained robust, with F1 scores of 78.9% for ICD diagnoses, 86.1% for tumor-related information, 93% for medications, and 61.3% for procedures.
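The per-entity F1 scores above are the harmonic mean of precision and recall over extracted versus gold-standard entities. As a minimal sketch (assuming exact-match set comparison per entity category, which the abstract does not specify):

```python
def entity_f1(predicted, gold):
    """Exact-match precision/recall/F1 over sets of extracted entities
    (e.g. ICD codes) for one entity category."""
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)  # entities both extracted and in the gold standard
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical ICD codes: two correct extractions, one spurious one.
pred = {"C43.9", "C78.7", "Z51.11"}
gold = {"C43.9", "C78.7"}
print(round(entity_f1(pred, gold), 2))  # → 0.8
```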

Conclusions:

These results demonstrate that synthetic text generation from structured clinical data enables effective and scalable training of LLMs for extracting interoperable, multi-entity clinical information from unstructured documentation.


 Citation

Please cite as:

Eryılmaz B, Arzideh K, Bahn M, Damm H, Warmer S, Schäfer H, Idrissi-Yaghir A, Pakull T, Albrecht LJ, Kleesiek J, Lodde G, Friedrich CM, Livingstone E, Schadendorf D, Borys K, Nensa F, Hosch R

Enhancing Healthcare Interoperability Using Large Language Models: A Generative Proof-of-Concept Framework to Extract Medical Information from Unstructured Clinical Text

JMIR Preprints. 29/01/2026:92413

DOI: 10.2196/preprints.92413

URL: https://preprints.jmir.org/preprint/92413


© The authors. All rights reserved. This is a privileged document currently under peer review/community review (or an accepted/rejected manuscript). The authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage the authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.