Using big data analytics to identify dentists with frequent future malpractice claims

Wanting Cui, Joseph Finkelstein

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations


Healthcare spending has been growing at an increasing rate in the US, due in part to medical malpractice costs. Dental malpractice is an area that has not been studied in depth. Using the National Practitioner Data Bank (NPDB), we explored the extent of dental malpractice claims and sought to construct a predictive model that can help identify dental practitioners at risk of committing malpractice. Over 1,500 dental malpractice claims were reported annually, and over $1.7 billion was paid out by medical malpractice insurers over the past 15 years. The majority of claims resulted in minor injuries, and the number of major injury claims increased over the years. For prediction, we randomly split the data into training (75%) and test (25%) datasets. We trained and tuned models using 5-fold cross-validation on the training set. Then, we applied the fitted model to the test data to measure performance. We used Logistic Regression, Random Forest (RF) and XGBoost, and tuned the hyperparameters of each model through grid search and cross-validation. XGBoost was the best machine learning model for predicting the risk of dentists having several malpractice reports. The best performing model had an accuracy of 72.8% with an F1 score of 30.6%. The NPDB is a valuable dataset for studying dental malpractice claims. Further analysis of information extracted from this dataset is warranted.
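The evaluation protocol described in the abstract (a random 75/25 split, 5-fold cross-validation on the training portion, then accuracy and F1 on the held-out data) can be sketched in plain Python. This is a minimal illustration, not the authors' code: the helper names `split_indices`, `k_fold` and `accuracy_and_f1`, the sample size of 1,000, and the label vectors are all hypothetical, and no model from the study (Logistic Regression, RF, XGBoost) is fitted here. The made-up labels only show how a model can reach fairly high accuracy while its F1 score stays low when the positive class is rare; the abstract does not state the actual class balance.

```python
import random


def split_indices(n, test_fraction=0.25, seed=0):
    """Randomly split range(n) into (train, test) index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]


def k_fold(indices, k=5):
    """Yield (fit, validate) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        fit = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield fit, folds[i]


def accuracy_and_f1(y_true, y_pred):
    """Return accuracy and F1 for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical dataset size: 75% for training and tuning, 25% held out.
train_idx, test_idx = split_indices(1000, test_fraction=0.25)

# Five folds over the training indices only; the test set is never touched
# during tuning.
folds = list(k_fold(train_idx, k=5))

# Hypothetical imbalanced labels: 10 positives among 100 cases.
y_true = [1] * 10 + [0] * 90
# Hypothetical predictions: 5 true positives, 5 misses, 20 false alarms.
y_pred = [1] * 5 + [0] * 5 + [1] * 20 + [0] * 70
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={acc:.3f}, F1={f1:.3f}")
```

Keeping the fold construction restricted to `train_idx` mirrors the abstract's description that tuning happened on the training set, with the test data reserved for the final performance measures.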

Original language: English
Title of host publication: Digital Personalized Health and Medicine - Proceedings of MIE 2020
Editors: Louise B. Pape-Haugaard, Christian Lovis, Inge Cort Madsen, Patrick Weber, Per Hostrup Nielsen, Philip Scott
Publisher: IOS Press
Number of pages: 5
ISBN (Electronic): 9781643680828
State: Published - 16 Jun 2020
Event: 30th Medical Informatics Europe Conference, MIE 2020 - Geneva, Switzerland
Duration: 28 Apr 2020 to 1 May 2020

Publication series

Name: Studies in Health Technology and Informatics
ISSN (Print): 0926-9630
ISSN (Electronic): 1879-8365


Conference: 30th Medical Informatics Europe Conference, MIE 2020


  • Big Data Analytics
  • Data Science
  • Machine Learning
  • Predictive Model

