Purpose: Despite advanced technologies in breast cancer management, challenges remain in efficiently interpreting vast clinical data for patient-specific insights. We reviewed the literature on how large language models (LLMs) such as ChatGPT might offer solutions in this field. Methods: We searched MEDLINE for relevant studies published before December 22, 2023. Keywords included: “large language models”, “LLM”, “GPT”, “ChatGPT”, “OpenAI”, and “breast”. The risk bias was evaluated using the QUADAS-2 tool. Results: Six studies evaluating either ChatGPT-3.5 or GPT-4, met our inclusion criteria. They explored clinical notes analysis, guideline-based question-answering, and patient management recommendations. Accuracy varied between studies, ranging from 50 to 98%. Higher accuracy was seen in structured tasks like information retrieval. Half of the studies used real patient data, adding practical clinical value. Challenges included inconsistent accuracy, dependency on the way questions are posed (prompt-dependency), and in some cases, missing critical clinical information. Conclusion: LLMs hold potential in breast cancer care, especially in textual information extraction and guideline-driven clinical question-answering. Yet, their inconsistent accuracy underscores the need for careful validation of these models, and the importance of ongoing supervision.

Original languageEnglish
Article number140
JournalJournal of Cancer Research and Clinical Oncology
Issue number3
StatePublished - Mar 2024


  • Artificial intelligence
  • Breast cancer
  • GPT
  • Large language models


Dive into the research topics of 'Utilizing large language models in breast cancer management: systematic review'. Together they form a unique fingerprint.

Cite this