Accuracy of Chatbots in Citing Journal Articles.

JAMA Netw Open

Learning Health Community, Palo Alto, California.

Published: August 2023

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10410472
DOI: http://dx.doi.org/10.1001/jamanetworkopen.2023.27647

Similar Publications

Pediatric Supracondylar Humerus and Diaphyseal Femur Fractures: A Comparative Analysis of Chat Generative Pretrained Transformer and Google Gemini Recommendations Versus American Academy of Orthopaedic Surgeons Clinical Practice Guidelines.

J Pediatr Orthop

January 2025

Department of Pediatric Orthopaedic Surgery, Hospital for Special Surgery, New York, NY.

Patrick P Nian, Amith Umesh, Shae K Simpson, Olivia C Tracey, Erikson Nichols

Objective: Artificial intelligence (AI) chatbots, including chat generative pretrained transformer (ChatGPT) and Google Gemini, have significantly increased access to medical information. However, in pediatric orthopaedics, no study has evaluated the accuracy of AI chatbots compared with evidence-based recommendations, including the American Academy of Orthopaedic Surgeons clinical practice guidelines (AAOS CPGs). The aims of this study were to compare responses by ChatGPT-4.

Evaluation of different artificial intelligence applications in responding to regenerative endodontic procedures.

BMC Oral Health

January 2025

Department of Endodontics, Faculty of Dentistry, Marmara University, Başıbüyük, Başıbüyük Yolu Marmara Üniversitesi Başıbüyük Sağlık Yerleşkesi 9/3, Başıbüyük - Maltepe, PO Box: 34854, İstanbul, Turkey.

Ece Ekmekci, Parla Meva Durmazpinar

Introduction: The integration of artificial intelligence (AI) technologies in healthcare is revolutionizing the workflows of healthcare professionals, enabling faster and more accurate patient treatment. This study aims to evaluate the accuracy of responses provided by different AI chatbots to questions that dentists might ask regarding regenerative endodontic treatment (RET), a procedure that shows promising biological healing potential.

Methods: A total of 23 questions related to RET procedures were developed based on the American Association of Endodontists (AAE) 2022 guidelines.

Evaluation of a context-aware chatbot using retrieval-augmented generation for answering clinical questions on medication-related osteonecrosis of the jaw.

J Craniomaxillofac Surg

January 2025

Department of Diagnostic and Interventional Radiology, University Medical Center Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany.

David Steybe, Philipp Poxleitner, Suad Aljohani, Bente Brokstad Herlofson, Ourania Nicolatou-Galitis

The potential of large language models (LLMs) in medical applications is significant, and retrieval-augmented generation (RAG) can address the weaknesses of these models in terms of data transparency and scientific accuracy by incorporating current scientific knowledge into responses. In this study, RAG and GPT-4 by OpenAI were applied to develop GuideGPT, a context-aware chatbot integrated with a knowledge database from 449 scientific publications designed to provide answers on the prevention, diagnosis, and treatment of medication-related osteonecrosis of the jaw (MRONJ). A comparison was made with a generic LLM ("PureGPT") across 30 MRONJ-related questions.

Comparative Evaluation of Chatbot Responses on Coronary Artery Disease.

Turk Kardiyol Dern Ars

January 2025

Department of Cardiology, Dr Siyami Ersek Thoracic and Cardiovascular Surgery Training Hospital, İstanbul, Türkiye.

Levent Pay, Ahmet Çağdaş Yumurtaş, Tuğba Çetin, Tufan Çınar, Mert İlker Hayıroğlu

Objective: Coronary artery disease (CAD) is the leading cause of morbidity and mortality globally. The growing interest in natural language processing chatbots (NLPCs) has driven their inevitable widespread adoption in healthcare. The purpose of this study was to evaluate the accuracy and reproducibility of responses provided by NLPCs, such as ChatGPT, Gemini, and Bing, to frequently asked questions about CAD.

Evaluation of LLMs accuracy and consistency in the registered dietitian exam through prompt engineering and knowledge retrieval.

Sci Rep

January 2025

Department of Engineering, iHealth Labs, Sunnyvale, CA, 94085, United States.

Iman Azimi, Mohan Qi, Li Wang, Amir M Rahmani, Youlin Li

Large language models (LLMs) are fundamentally transforming human-facing applications in the health and well-being domains: boosting patient engagement, accelerating clinical decision-making, and facilitating medical education. Although state-of-the-art LLMs have shown superior performance in several conversational applications, evaluations within nutrition and diet applications are still insufficient. In this paper, we propose to employ the Registered Dietitian (RD) exam to conduct a standard and comprehensive evaluation of state-of-the-art LLMs, GPT-4o, Claude 3.
