What question did this study set out to answer?

January 25, 2026

Large language models improve transferability of electronic health record-based predictions across countries and coding systems.

Key Points

The study aims to improve the transferability of electronic health record-based predictions across healthcare systems.
The authors developed GRASP, a transformer-based prediction model that uses embeddings from a large language model.
GRASP was trained on UK Biobank data and evaluated in the FinnGen (Finland) and Mount Sinai (USA) datasets.
It was applied to predict the onset of 21 diseases and all-cause mortality in over one million individuals.
In FinnGen and Mount Sinai, GRASP achieved an average ΔC-index that was 88% and 47% higher, respectively, than that of language-unaware models.
GRASP showed significantly higher correlations with polygenic risk scores for 62% of diseases.
It maintained robust performance even when datasets were not harmonized to the same data model.

Abstract

Variation in medical practices and reporting standards across healthcare systems limits the transferability of prediction models based on structured electronic health record data. Prior studies have demonstrated that embedding medical codes into a shared semantic space can help address these discrepancies, but real-world applications remain limited. Here, we show that leveraging embeddings from a large language model alongside a transformer-based prediction model provides an effective and scalable solution to enhance generalizability. We call this approach GRASP and apply it to predict the onset of 21 diseases and all-cause mortality in over one million individuals. Trained on the UK Biobank (UK) and evaluated in FinnGen (Finland) and Mount Sinai (USA), GRASP achieved an average ΔC-index that was 88% and 47% higher than language-unaware models, respectively. GRASP also showed significantly higher correlations with polygenic risk scores for 62% of diseases, and maintained robust performance even when datasets were not harmonized to the same data model.
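
The headline metric in the abstract is a ΔC-index. As a rough illustration only, the sketch below computes Harrell's concordance index for right-censored data in plain Python. This summary does not define ΔC-index, so the sketch assumes it is the gain in C-index of a model over a baseline model; that assumption, the function names, and the toy inputs are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index: the fraction of comparable pairs in which the
    individual with the earlier observed event has the higher risk score.

    times  -- follow-up time per individual
    events -- 1 if the event was observed, 0 if censored
    risks  -- predicted risk score (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification: pairs with tied times are skipped. A pair is also
        # not comparable when the shorter follow-up ended in censoring.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def delta_c_index(times, events, model_risks, baseline_risks):
    """ASSUMED definition: C-index gain of a model over a baseline model."""
    return (concordance_index(times, events, model_risks)
            - concordance_index(times, events, baseline_risks))


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    times = [2, 4, 6, 8]
    events = [1, 1, 0, 1]
    model_risks = [0.9, 0.3, 0.2, 0.4]
    baseline_risks = [0.5, 0.6, 0.7, 0.4]
    print(delta_c_index(times, events, model_risks, baseline_risks))
```

Under this assumed definition, a statement such as "88% higher" would compare the average ΔC-index of two models relative to the same baseline, not their raw C-index values.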

Cite this study

Kirchler et al. studied this question.

www.synapsesocial.com/papers/6975b2c8feba4585c2d6e4ff — DOI: https://doi.org/10.1038/s41746-026-02363-5

Authors

Matthias Kirchler

Matteo Ferro

Veronica Lorenzini

Institutions

Massachusetts General Hospital

Icahn School of Medicine at Mount Sinai

University of Helsinki
