Training Large Language Models with Clinical Data: Challenges and Future Directions

Authors

  • Tanmay Shukla MS, Department of Biomedical Data Science, Geisel School of Medicine at Dartmouth, Hanover, NH 03755, USA

Keywords:

Large language models (LLMs), Clinical data, Healthcare AI, Privacy-preserving techniques, Model interpretability, Ethical AI in healthcare, Federated learning, Data standardization, Explainable AI (XAI)

Abstract

Large language models (LLMs) perform well across many fields and applications and could benefit healthcare by improving patient management, prediction, and decision support. However, the sensitivity and complexity of healthcare data make it challenging to use clinical data in these models. Here, we examine these challenges across four domains: data privacy, model interpretability, technical limitations, and ethical implications, with a focus on their relevance to healthcare applications. We review current best practices and suggest approaches for safe data exchange, transparent model interpretation, and domain-specific training processes in real clinical settings. Finally, we outline future research directions to help develop LLMs for clinical use that protect patient privacy, which we argue can only be achieved with strong interdisciplinary collaboration and regulation of clinical data use. We hope that our results will help scholars, policymakers, and clinicians navigate toward ethical and efficient solutions for utilizing LLMs in healthcare.

Published

03-12-2024

How to Cite

[1] T. Shukla, “Training Large Language Models with Clinical Data: Challenges and Future Directions”, J. of Art. Int. Research, vol. 4, no. 2, pp. 78–112, Dec. 2024, Accessed: Mar. 07, 2026. [Online]. Available: https://www.thesciencebrigade.org/JAIR/article/view/521