Synthia

A Benchmark of Large Language Models for Semantic Harmonization of Alzheimer's Disease Cohorts
  • SYN-Y1-2025-014: Journal publication [Adams et al., Fraunhofer]
  • The Journal of Prevention of Alzheimer's Disease, January 2026.
  • The study addresses the challenge of harmonizing heterogeneous healthcare datasets, where inconsistent variable naming limits scalable multi-cohort Alzheimer's disease research. Because manual harmonization is resource-intensive, the authors assess whether modern text-embedding models can support this task. They develop a new benchmark that tests five state-of-the-art embedding models across seven Alzheimer’s disease datasets by mapping cohort metadata to a Common Data Model, using only semantic descriptions of clinical, lifestyle, demographic, and imaging variables. Results show that models performing well on general benchmarks do not necessarily excel in real-world clinical harmonization, highlighting the need for domain-specific evaluation. The authors also provide guidelines for metadata formatting and release an open-source library and interactive leaderboard to support ongoing benchmarking. The work emphasizes the importance of tailored standards to enable semi-automated clinical data harmonization.
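To make the mapping task concrete, here is a minimal sketch of description-based harmonization: each cohort variable is matched to the Common Data Model (CDM) concept whose description is most similar, and the mappings are scored against a gold standard. This is an illustration only, not the authors' library or benchmark code. The `embed` function is a toy bag-of-words stand-in for a real text-embedding model, and every variable name, description, and helper below is hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a text-embedding model: bag-of-words token counts.

    Assumption: a real benchmark would call an actual embedding model here.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def map_to_cdm(description, cdm):
    """Return the CDM concept whose description is most similar to the input."""
    query = embed(description)
    return max(cdm, key=lambda concept: cosine(query, embed(cdm[concept])))


def top1_accuracy(cohort, cdm, gold):
    """Fraction of cohort variables mapped to their gold-standard CDM concept."""
    hits = sum(map_to_cdm(desc, cdm) == gold[var] for var, desc in cohort.items())
    return hits / len(cohort)


# Hypothetical CDM concepts and their semantic descriptions.
CDM = {
    "age": "age of the participant in years",
    "mmse_total": "mini mental state examination total score",
    "hippocampal_volume": "hippocampus volume measured on mri",
    "smoking_status": "current smoking status of the participant",
}

# Hypothetical cohort variables: the names differ, only descriptions are used.
COHORT = {
    "PTAGE": "participant age in years at baseline",
    "MMSCORE": "total score of the mini mental state examination",
    "HIPPO_VOL": "mri volume of the hippocampus",
    "SMOKE": "smoking status",
}

# Hypothetical gold-standard mapping used for evaluation.
GOLD = {
    "PTAGE": "age",
    "MMSCORE": "mmse_total",
    "HIPPO_VOL": "hippocampal_volume",
    "SMOKE": "smoking_status",
}

if __name__ == "__main__":
    for var, desc in COHORT.items():
        print(var, "->", map_to_cdm(desc, CDM))
    print("top-1 accuracy:", top1_accuracy(COHORT, CDM, GOLD))
```

Swapping `embed` for a different model while keeping the cohort, CDM, and gold mapping fixed is what allows models to be compared on the same harmonization task.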


      Synthia

      Synthetic Data Generation framework for integrated validation of use cases and AI healthcare applications.

      This project is supported by the Innovative Health Initiative Joint Undertaking (IHI JU) under grant agreement No 101172872. The JU receives support from the European Union's Horizon Europe research and innovation programme, COCIR, EFPIA, EuropaBio, MedTech Europe, Vaccines Europe and DNV. The UK consortium partner, the National Institute for Health and Care Excellence (NICE), is supported by UKRI Grant 10132181.

      © SYNTHIA 2025. Legal Notices.


      Contact us

       

      www.ihi-synthia.eu

      contact@ihi-synthia.eu