Stage II colon cancer: does ChatGPT recommend more intensive adjuvant therapy? A comparison with MDT decisions.


Birsin Z., Jeral S., Cebeci S., Çerme E., Aliyev V., Günaltılı M., ...

Future oncology (London, England), pp. 1-8, 2025 (SCI-Expanded, Scopus)

  • Publication Type: Article / Full Article
  • Volume:
  • Publication Date: 2025
  • DOI: 10.1080/14796694.2025.2610463
  • Journal Name: Future oncology (London, England)
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, CINAHL, EMBASE, MEDLINE
  • Pages: pp. 1-8
  • Istanbul University-Cerrahpaşa Affiliated: Yes

Abstract

Background: Adjuvant chemotherapy decision-making in stage II colon cancer remains challenging. Although multidisciplinary tumor boards (MDTs) guide treatment, their recommendations vary. Artificial intelligence (AI) tools such as ChatGPT may support decision-making, but direct comparative evidence with MDTs is limited.

Methods: We retrospectively analyzed 179 patients with stage II colon cancer who underwent surgery between 2019 and 2024. MDT recommendations (observation, fluoropyrimidine monotherapy, or oxaliplatin-based chemotherapy) were compared with ChatGPT-5 outputs. Clinical factors, including age, ECOG performance status (PS), tumor stage, minor risk factors, and mismatch repair (MMR) status, were incorporated. Agreement was evaluated using Cohen's kappa (κ) and McNemar's test.
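The two agreement statistics named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' analysis code. The ten label pairs are hypothetical toy data, the function names `cohens_kappa` and `mcnemar_exact` are introduced here for illustration, and the exact binomial form of McNemar's test is an assumption, since the abstract does not say which variant was used.

```python
from collections import Counter
from math import comb


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length lists of categorical labels."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of both raters' marginal counts per category.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


def mcnemar_exact(rater_a, rater_b, positive):
    """Two-sided exact McNemar p-value on the discordant pairs (assumed variant)."""
    pairs = list(zip(rater_a, rater_b))
    a_only = sum(a == positive and b != positive for a, b in pairs)
    b_only = sum(a != positive and b == positive for a, b in pairs)
    n = a_only + b_only
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of asymmetry
    tail = sum(comb(n, k) for k in range(min(a_only, b_only) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical toy labels, NOT study data: obs = observation,
# fp = fluoropyrimidine monotherapy, ox = oxaliplatin-based chemotherapy.
mdt = ["obs", "obs", "obs", "obs", "fp", "fp", "fp", "ox", "ox", "ox"]
ai = ["obs", "obs", "obs", "fp", "fp", "fp", "ox", "ox", "ox", "ox"]

kappa_three = cohens_kappa(mdt, ai)

# Collapse to the binary question: any adjuvant therapy versus observation.
mdt_bin = ["obs" if x == "obs" else "treat" for x in mdt]
ai_bin = ["obs" if x == "obs" else "treat" for x in ai]
kappa_binary = cohens_kappa(mdt_bin, ai_bin)
p_value = mcnemar_exact(mdt_bin, ai_bin, positive="treat")

print(round(kappa_three, 3), round(kappa_binary, 3), p_value)
```

Kappa corrects raw percent agreement for the agreement expected by chance. McNemar's test looks only at the discordant pairs, so it indicates whether one rater systematically recommends treatment more often than the other.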

Results: Across the three treatment categories, agreement between MDT and AI was moderate (70.4%, κ = 0.542, p < 0.001), while in the binary comparison of adjuvant therapy versus observation, concordance improved to substantial (91.1%, κ = 0.719, p < 0.001). Discordance mainly reflected AI's tendency to escalate therapy. Agreement decreased in patients ≥70 years, those with ECOG PS 2, and those with multiple risk factors.
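The descriptors "moderate" and "substantial" match the widely used Landis and Koch bands for kappa. The sketch below assumes that scale, which the abstract does not name explicitly, and `interpret_kappa` is a hypothetical helper that maps the two reported κ values to their bands.

```python
def interpret_kappa(kappa):
    """Map a kappa value to its Landis and Koch descriptor (assumed scale)."""
    if kappa < 0:
        return "poor"
    # Upper bounds of each band, checked in ascending order.
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")


# The two kappa values reported in the Results.
three_category = interpret_kappa(0.542)
binary = interpret_kappa(0.719)
print(three_category, binary)
```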

Conclusions: AI showed moderate agreement with MDTs in detailed three-category recommendations but substantial concordance in binary adjuvant decisions. While AI may serve as a supportive tool, clinical judgment remains essential, particularly for elderly and frail patients.

Keywords: Adjuvant chemotherapy; artificial intelligence; large language models; multidisciplinary tumor board; stage II colon cancer.
