Australian Endodontic Journal, 2026 (SCI-Expanded, Scopus)
This study evaluated and compared the accuracy, consistency, readability, and information quality of three LLM-based chatbots, namely ChatGPT-5, Claude AI (Sonnet 4.0), and Perplexity (Mistral Large 2), in addressing questions on traumatic dental injuries. Forty true/false statements were submitted to each chatbot three times at weekly intervals to assess accuracy and consistency. Additionally, chatbot responses to 25 open-ended, case-based questions were evaluated for readability, understandability and actionability, and information reliability and quality. For the true/false questions, Perplexity showed the highest accuracy, followed by Claude and ChatGPT. For the open-ended responses, ChatGPT excelled in readability (FRE: 62.4 ± 7.6), Perplexity in understandability (91.0 ± 4.3) and actionability (93.0 ± 6.4), and Claude in information reliability (mDISCERN total: 61.2; no variability observed). All three chatbots achieved acceptable global quality scores (> 4.4). These findings emphasise the complementary roles of the three chatbots in dental trauma management. Tool selection should be guided by the intended use, and continued human oversight remains essential in clinical decision-making.
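For reference, the readability metric reported above is the standard Flesch Reading Ease (FRE) index, computed as

FRE = 206.835 − 1.015 × (total words / total sentences) − 84.6 × (total syllables / total words)

Higher scores indicate easier text; a mean of 62.4 falls in the 60–70 band, conventionally interpreted as plain English readable by 13- to 15-year-old students.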