TY - GEN
T1 - Latent COVID-19 clusters in patients with chronic respiratory conditions
AU - Cui, Wanting
AU - Cabrera, Manuel
AU - Finkelstein, Joseph
N1 - Publisher Copyright: © 2020 The European Federation for Medical Informatics (EFMI) and IOS Press.
PY - 2020/11/23
Y1 - 2020/11/23
N2 - The goal of this paper was to apply unsupervised machine learning techniques to discover latent COVID-19 clusters in patients with chronic lower respiratory diseases (CLRD). Patients who underwent testing for SARS-CoV-2 were identified from electronic medical records. The analytical dataset comprised 2,328 CLRD patients, of whom 1,029 tested positive for COVID-19. We used the factor analysis for mixed data method for preprocessing. It performed principal component analysis on numeric values and multiple correspondence analysis on categorical values, which helped convert categorical data into numeric form. Cluster analysis was an effective means both to distinguish subgroups of CLRD patients with COVID-19 and to identify patient clusters that were adversely affected by the infection. Age, comorbidity index and race were important factors for cluster separation. Furthermore, diseases of the circulatory system, the nervous system and sense organs, the digestive system and the genitourinary system, as well as metabolic diseases and immunity disorders, were also important criteria in the resulting cluster analyses.
AB - The goal of this paper was to apply unsupervised machine learning techniques to discover latent COVID-19 clusters in patients with chronic lower respiratory diseases (CLRD). Patients who underwent testing for SARS-CoV-2 were identified from electronic medical records. The analytical dataset comprised 2,328 CLRD patients, of whom 1,029 tested positive for COVID-19. We used the factor analysis for mixed data method for preprocessing. It performed principal component analysis on numeric values and multiple correspondence analysis on categorical values, which helped convert categorical data into numeric form. Cluster analysis was an effective means both to distinguish subgroups of CLRD patients with COVID-19 and to identify patient clusters that were adversely affected by the infection. Age, comorbidity index and race were important factors for cluster separation. Furthermore, diseases of the circulatory system, the nervous system and sense organs, the digestive system and the genitourinary system, as well as metabolic diseases and immunity disorders, were also important criteria in the resulting cluster analyses.
KW - COVID-19
KW - Chronic lower respiratory diseases
KW - Cluster analysis
UR - http://www.scopus.com/inward/record.url?scp=85096733250&partnerID=8YFLogxK
U2 - 10.3233/SHTI200689
DO - 10.3233/SHTI200689
M3 - Conference contribution
C2 - 33227735
AN - SCOPUS:85096733250
T3 - Studies in Health Technology and Informatics
SP - 32
EP - 36
BT - Integrated Citizen Centered Digital Health and Social Care
A2 - Varri, Alpo
A2 - Delgado, Jaime
A2 - Gallos, Parisis
A2 - Hagglund, Maria
A2 - Hayrinen, Kristiina
A2 - Kinnunen, Ulla-Mari
A2 - Pape-Haugaard, Louise B.
A2 - Peltonen, Laura-Maria
A2 - Saranto, Kaija
A2 - Scott, Philip
PB - IOS Press BV
T2 - EFMI 2020 Special Topic Conference on Integrated Citizen Centered Digital Health and Social Care: Citizens as Data Producers and Service co-Creators
Y2 - 26 November 2020 through 27 November 2020
ER -