Systematic review of validated case definitions for diabetes in ICD-9-coded and ICD-10-coded data in adult populations

Bushra Khokhar, Nathalie Jette, Amy Metcalfe, Ceara Tess Cunningham, Hude Quan, Gilaad G. Kaplan, Sonia Butalia, Doreen Rabi

Research output: Contribution to journal › Article › peer-review

82 Scopus citations


Objectives: With steady increases in 'big data' and data analytics over the past two decades, administrative health databases have become more accessible and are now used regularly for diabetes surveillance. The objective of this study is to systematically review validated International Classification of Diseases (ICD)-based case definitions for diabetes in the adult population.

Setting, participants and outcome measures: Electronic databases, MEDLINE and Embase, were searched for validation studies where an administrative case definition (using ICD codes) for diabetes in adults was validated against a reference and statistical measures of the performance reported.

Results: The search yielded 2895 abstracts, and of the 193 potentially relevant studies, 16 met criteria. Diabetes definition for adults varied by data source, including physician claims (sensitivity ranged from 26.9% to 97%, specificity ranged from 94.3% to 99.4%, positive predictive value (PPV) ranged from 71.4% to 96.2%, negative predictive value (NPV) ranged from 95% to 99.6% and κ ranged from 0.8 to 0.9), hospital discharge data (sensitivity ranged from 59.1% to 92.6%, specificity ranged from 95.5% to 99%, PPV ranged from 62.5% to 96%, NPV ranged from 90.8% to 99% and κ ranged from 0.6 to 0.9) and a combination of both (sensitivity ranged from 57% to 95.6%, specificity ranged from 88% to 98.5%, PPV ranged from 54% to 80%, NPV ranged from 98% to 99.6% and κ ranged from 0.7 to 0.8).

Conclusions: Overall, administrative health databases are useful for undertaking diabetes surveillance, but an awareness of the variation in performance being affected by case definition is essential. The performance characteristics of these case definitions depend on the variations in the definition of primary diagnosis in ICD-coded discharge data and/or the methodology adopted by the healthcare facility to extract information from patient records.
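The performance measures reported above (sensitivity, specificity, PPV, NPV and κ) are all derived from a 2×2 table that cross-classifies the administrative case definition against the reference standard. The following is a minimal sketch of that arithmetic, not the authors' code: the `TwoByTwo` class, the `validation_measures` function and the counts are hypothetical and chosen only for illustration, and κ is assumed here to be Cohen's kappa.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Hypothetical 2x2 table: case definition vs. reference standard."""
    tp: int  # flagged by the case definition, diabetes per reference
    fp: int  # flagged by the case definition, no diabetes per reference
    fn: int  # not flagged, diabetes per reference
    tn: int  # not flagged, no diabetes per reference


def validation_measures(t: TwoByTwo) -> dict[str, float]:
    """Return sensitivity, specificity, PPV, NPV and Cohen's kappa.

    Assumes every row and column total is non-zero; no guard against
    division by zero is included in this sketch.
    """
    n = t.tp + t.fp + t.fn + t.tn
    observed = (t.tp + t.tn) / n  # observed agreement
    # Agreement expected by chance, from the marginal totals.
    expected = (
        (t.tp + t.fp) * (t.tp + t.fn) + (t.fn + t.tn) * (t.fp + t.tn)
    ) / n**2
    return {
        "sensitivity": t.tp / (t.tp + t.fn),
        "specificity": t.tn / (t.tn + t.fp),
        "ppv": t.tp / (t.tp + t.fp),
        "npv": t.tn / (t.tn + t.fn),
        "kappa": (observed - expected) / (1 - expected),
    }


# Illustrative counts only; these are not data from the review.
measures = validation_measures(TwoByTwo(tp=80, fp=20, fn=20, tn=880))
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity, PPV and NPV depend on the prevalence of diabetes in the validation sample, which is one reason the ranges are hard to compare across studies.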

Original language: English
Article number: e009952
Journal: BMJ Open
Issue number: 8
State: Published - 1 Aug 2016
Externally published: Yes


  • administrative data
  • case definition
  • diabetes
  • validation studies

