Machine learning characterization of a rare neurologic disease via electronic health records: a proof-of-principle study on stiff person syndrome

Soo Hwan Park, Seo Ho Song, Frederick Burton, Cybèle Arsan, Barbara Jobst, Mary Feldman

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Background: Rare neurologic diseases (RNDs) are frequently diagnosed late, yet they and their comorbidities remain difficult to study because their rarity leaves studies statistically underpowered. Affecting one to two in a million annually, stiff person syndrome (SPS) is an RND characterized by painful muscle spasms and rigidity. Using underutilized electronic health records (EHR), this study demonstrated a machine-learning-based framework for identifying the clinical features that best characterize a diagnosis of SPS.

Methods: A machine-learning-based feature selection approach was applied to 319 items from the past medical histories of 48 individuals (23 with a diagnosis of SPS and 25 controls) with elevated serum autoantibodies against glutamic-acid-decarboxylase-65 (anti-GAD65) in Dartmouth Health’s EHR to determine the features with the highest discriminatory power. Each iteration of the algorithm fitted a Support Vector Machine (SVM) model, generated an importance score (a SHapley Additive exPlanation, or SHAP, value) for each feature, and removed the least salient feature. Evaluation metrics were calculated through repeated stratified cross-validation.

Results: Depression, hypothyroidism, GERD, and joint pain were the most characteristic features of SPS. Using these features, the SVM model attained a precision of 0.817 (95% CI 0.795–0.840), sensitivity of 0.766 (95% CI 0.743–0.790), F-score of 0.761 (95% CI 0.744–0.778), AUC of 0.808 (95% CI 0.791–0.825), and accuracy of 0.775 (95% CI 0.759–0.790).

Conclusions: This framework discerned features that, with further research, may help fully characterize the pathologic mechanism of SPS: depression, hypothyroidism, and GERD may respectively represent comorbidities through common inflammatory, genetic, and dysautonomic links. This methodology could address diagnostic challenges in neurology by uncovering latent associations and generating hypotheses for RNDs.
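The elimination loop described in the Methods (fit an SVM, score every feature by its SHAP values, drop the least salient one, repeat) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' implementation: the abstract does not state the kernel or the SHAP explainer, so the sketch uses a linear SVM trained by subgradient descent and the closed-form SHAP value of a linear model under feature independence, phi_ij = w_j * (x_ij - mean_j). The function names, the -1/+1 feature coding, and the eight-row synthetic data set are all hypothetical, and the repeated stratified cross-validation step is omitted for brevity.

```python
# Toy sketch: backward feature elimination driven by mean |SHAP| of a linear SVM.
# Assumptions (not from the paper): linear kernel, subgradient training,
# closed-form linear SHAP values, synthetic data, hypothetical function names.

def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Full-batch subgradient descent on the L2-regularized hinge loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # only margin violators contribute to the subgradient
                for j, xj in enumerate(xi):
                    grad_w[j] -= yi * xj / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def mean_abs_shap(X, w):
    """Mean |SHAP| per feature for a linear model: phi_ij = w_j * (x_ij - mean_j)."""
    n = len(X)
    scores = []
    for j, wj in enumerate(w):
        mean_j = sum(row[j] for row in X) / n
        scores.append(sum(abs(wj * (row[j] - mean_j)) for row in X) / n)
    return scores


def eliminate_features(X, y, names, n_keep):
    """Refit, score, and drop the least salient feature until n_keep remain."""
    kept = list(range(len(names)))
    dropped = []
    while len(kept) > n_keep:
        X_sub = [[row[j] for j in kept] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        scores = mean_abs_shap(X_sub, w)
        worst = min(range(len(kept)), key=lambda k: scores[k])
        dropped.append(names[kept.pop(worst)])
    return [names[j] for j in kept], dropped


# Synthetic data: features coded -1 (absent) / +1 (present), labels -1 / +1.
# "informative" tracks the label exactly; the other two are balanced noise.
X = [[s, p, q] for s in (1, -1) for p in (1, -1) for q in (1, -1)]
y = [row[0] for row in X]
names = ["informative", "noise_a", "noise_b"]

selected, dropped = eliminate_features(X, y, names, n_keep=1)
print(selected, dropped)
```

On this toy data the two noise features are eliminated and the label-tracking feature survives. A real analysis along the lines of the abstract would wrap the loop in repeated stratified cross-validation and would likely rely on established SVM and SHAP libraries rather than hand-written routines.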

Original language: English
Article number: 272
Journal: BMC Neurology
Volume: 24
Issue number: 1
State: Published - Dec 2024
Externally published: Yes

Keywords

  • Electronic health records
  • Machine learning
  • Rare neurologic disease
  • Stiff person syndrome
