NLP-Assisted Pipeline for COVID-19 Core Outcome Set Identification Using ClinicalTrials.gov

Fatemeh Shah-Mohammadi, Irena Parvanova, Joseph Finkelstein

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Core outcome sets (COS) are necessary to ensure the systematic collection, metadata analysis and sharing of information across studies. However, developing area-specific clinical research is costly and time-consuming. ClinicalTrials.gov, as a public repository, provides access to a vast collection of clinical trials and their characteristics, such as primary outcomes. With the growing number of COVID-19 clinical trials, identifying COSs from the outcomes of such trials is crucial. This paper introduces a semi-automatic pipeline that can efficiently identify, aggregate and rank the COS from the primary outcomes of COVID-19 clinical trials. Using natural language processing (NLP) techniques, our proposed pipeline successfully downloads and processes 5090 trials from all over the world and identifies COVID-19-specific outcomes that appeared in more than 1% of the trials. The top-ranked outcomes identified by the pipeline are mortality due to COVID-19, COVID-19 infection rate and COVID-19 symptoms.
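The aggregation and ranking step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not detail the NLP processing, so a naive normalization (lowercasing and collapsing whitespace) stands in for it here, and the names `normalize`, `rank_outcomes` and `min_share` are hypothetical. The sketch only shows how outcomes might be counted once per trial, filtered by the "more than 1% of the trials" rule, and ranked by frequency.

```python
from collections import Counter


def normalize(outcome: str) -> str:
    """Hypothetical stand-in for the NLP step: lowercase, collapse whitespace."""
    return " ".join(outcome.lower().split())


def rank_outcomes(trials, min_share=0.01):
    """Rank outcomes by the share of trials that list them.

    `trials` is a list of trials, each given as a list of primary-outcome
    strings. Only outcomes present in more than `min_share` of all trials
    are kept (assumed reading of the 1% threshold in the abstract).
    """
    total = len(trials)
    if total == 0:
        return []

    counts = Counter()
    for outcomes in trials:
        # Use a set so each outcome counts at most once per trial.
        counts.update({normalize(o) for o in outcomes})

    ranked = [
        (outcome, n / total)
        for outcome, n in counts.items()
        if n / total > min_share
    ]
    # Highest share first; ties broken alphabetically for a stable order.
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    toy_trials = [
        ["Mortality", "mortality "],
        ["Mortality", "Infection rate"],
        ["Symptoms"],
        ["mortality"],
    ]
    for outcome, share in rank_outcomes(toy_trials):
        print(f"{outcome}: {share:.0%}")
```

In practice, exact string matching would miss paraphrased outcomes, which is presumably where the NLP techniques mentioned in the abstract come in.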

Original language: English
Title of host publication: MEDINFO 2021
Subtitle of host publication: One World, One Health - Global Partnership for Digital Innovation - Proceedings of the 18th World Congress on Medical and Health Informatics
Editors: Paula Otero, Philip Scott, Susan Z. Martin, Elaine Huesing
Publisher: IOS Press BV
Pages: 622-626
Number of pages: 5
ISBN (Electronic): 9781643682648
State: Published - 6 Jun 2022
Event: 18th World Congress on Medical and Health Informatics: One World, One Health - Global Partnership for Digital Innovation, MEDINFO 2021 - Virtual, Online
Duration: 2 Oct 2021 to 4 Oct 2021

Publication series

Name: Studies in Health Technology and Informatics
Volume: 290
ISSN (Print): 0926-9630
ISSN (Electronic): 1879-8365

Conference

Conference: 18th World Congress on Medical and Health Informatics: One World, One Health - Global Partnership for Digital Innovation, MEDINFO 2021
City: Virtual, Online
Period: 2/10/21 to 4/10/21

Keywords

  • COVID-19
  • Core outcome set
  • Natural language processing (NLP)
