Reducing Diagnostic Uncertainty Using Large Language Models

Joseph Finkelstein, Wanting Cui, Keaton Morgan, Kensaku Kawamoto

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

2 Scopus citations

Abstract

The study used ClinicalBERT to predict body system categories from clinical notes written during the first three days of admission, using the MIMIC-III dataset. After data preprocessing, including the extraction of admission details, clinical notes, and diagnoses, the dataset comprised 510,956 notes associated with 44,270 unique hospitalizations. Discharge diagnoses were categorized into body systems, and the ClinicalBERT model was fine-tuned to predict associations with these diagnoses, resulting in 19 classification models, one for each body system. Around 80% of the models achieved F1 scores exceeding 0.7. Models for diseases of the circulatory, infectious and parasitic, respiratory, nervous, digestive, and genitourinary systems had F1 scores surpassing 0.8. Conversely, models for congenital malformations, eye and adnexa diseases, and ear and mastoid process diseases showed notably lower F1 scores. To explore model robustness, models using three days of notes per patient were compared with models using one day of notes. While F1 scores generally decreased, a notable finding was that most body system models maintained satisfactory performance, due to the similar statistical distributions of note types and lengths between one and three days. This suggests that ClinicalBERT may adapt to varied data availability scenarios. Future studies could develop a multiple-notes model and test its flexibility and robustness across different prediction durations, potentially reducing the time and effort associated with model implementation in diverse clinical settings.
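The per-body-system evaluation described in the abstract can be illustrated with a minimal sketch. It assumes each of the 19 models is a binary (one-vs-rest) classifier scored with the standard F1 definition; the category names, toy labels, and helper functions below are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set

# Hypothetical subset of body-system categories (the paper reports 19).
BODY_SYSTEMS = ["circulatory", "respiratory", "digestive"]


def binary_labels(diagnosed: List[Set[str]], system: str) -> List[int]:
    """One-vs-rest target: 1 if a hospitalization has a discharge
    diagnosis in `system`, otherwise 0."""
    return [1 if system in systems else 0 for systems in diagnosed]


def f1_score(y_true: List[int], y_pred: List[int]) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data: body systems diagnosed at discharge for four hospitalizations.
diagnosed = [{"circulatory", "respiratory"}, {"digestive"}, {"circulatory"}, set()]

# Hypothetical model outputs, one prediction list per body-system model.
predictions: Dict[str, List[int]] = {
    "circulatory": [1, 0, 0, 0],
    "respiratory": [1, 0, 0, 0],
    "digestive": [0, 1, 1, 0],
}

scores = {
    system: f1_score(binary_labels(diagnosed, system), predictions[system])
    for system in BODY_SYSTEMS
}

for system, score in scores.items():
    print(f"{system}: F1 = {score:.2f}")
```

Scoring each body system separately, as here, is what allows statements such as "around 80% of the models achieved F1 scores exceeding 0.7" to be made per model rather than as one pooled figure.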

Original language: English
Title of host publication: Proceedings - 2024 IEEE 1st International Conference on Artificial Intelligence for Medicine, Health and Care, AIMHC 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 236-242
Number of pages: 7
ISBN (Electronic): 9798350371987
State: Published - 2024
Externally published: Yes
Event: 1st IEEE International Conference on Artificial Intelligence for Medicine, Health and Care, AIMHC 2024 - Hybrid, Laguna Hills, United States
Duration: 5 Feb 2024 - 7 Feb 2024

Publication series

Name: Proceedings - 2024 IEEE 1st International Conference on Artificial Intelligence for Medicine, Health and Care, AIMHC 2024

Conference

Conference: 1st IEEE International Conference on Artificial Intelligence for Medicine, Health and Care, AIMHC 2024
Country/Territory: United States
City: Hybrid, Laguna Hills
Period: 5/02/24 - 7/02/24

Keywords

  • Large Language Models
  • diagnostic uncertainty
  • emergency department
