The Role of Artificial Intelligence in Endocrine Management: Assessing ChatGPT's Responses to Prolactinoma Queries.

AI Article Synopsis

  • This research explores how well ChatGPT can answer patient questions about hyperprolactinemia and prolactinoma, analyzing 46 common queries.
  • Responses were evaluated for accuracy and adequacy by two endocrinologists using established scales; median scores showed high accuracy but lower adequacy in specific topics such as pregnancy.
  • Overall, while ChatGPT performed well, it struggled with certain areas, highlighting the need for improvements in medical information delivery.

Article Abstract

This research investigates the utility of Chat Generative Pre-trained Transformer (ChatGPT) in addressing patient inquiries related to hyperprolactinemia and prolactinoma. A set of 46 commonly asked questions from patients with prolactinoma was presented to ChatGPT, and responses were evaluated for accuracy on a 6-point Likert scale (1: completely inaccurate to 6: completely accurate) and adequacy on a 5-point Likert scale (1: completely inadequate to 5: completely adequate). Two independent endocrinologists assessed the responses based on international guidelines. Questions were categorized into groups including general information, diagnostic process, treatment process, follow-up, and pregnancy period. The median accuracy score was 6.0 (IQR, 5.4-6.0), and the median adequacy score was 4.5 (IQR, 3.5-5.0). The lowest accuracy and adequacy score assigned by both evaluators was two. Significant agreement was observed between the evaluators, demonstrated by a weighted κ of 0.68 (p = 0.08) for accuracy and a κ of 0.66 (p = 0.04) for adequacy. Kruskal-Wallis tests revealed statistically significant differences among the groups for accuracy (p = 0.005) and adequacy (p = 0.023). The pregnancy period group had the lowest accuracy score, and both the pregnancy period and follow-up groups had the lowest adequacy scores. In conclusion, ChatGPT demonstrated commendable responses in addressing prolactinoma queries; however, certain limitations were observed, particularly in providing accurate information related to the pregnancy period, emphasizing the need for refining its capabilities in medical contexts.
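
The two statistics named in the abstract (a weighted κ for inter-rater agreement and a Kruskal-Wallis test for differences across question categories) are standard measures. Below is a minimal Python sketch of that style of analysis, not the authors' code: the rating values, group contents, and the choice of linear κ weights are illustrative assumptions.

from scipy.stats import kruskal
from sklearn.metrics import cohen_kappa_score

# Accuracy ratings (6-point Likert) from two raters on the same
# hypothetical subset of questions; values invented for illustration.
rater1 = [6, 5, 6, 4, 6, 2, 5, 6]
rater2 = [6, 6, 6, 4, 5, 3, 5, 6]

# Weighted kappa penalizes disagreements by their distance on the scale;
# "linear" weighting is an assumption, as the paper does not state the scheme.
kappa = cohen_kappa_score(rater1, rater2, weights="linear")
print(f"weighted kappa = {kappa:.2f}")

# Kruskal-Wallis test: do accuracy scores differ across question categories?
general = [6, 6, 5, 6]
diagnostic = [6, 5, 6, 6]
pregnancy = [4, 5, 2, 5]  # the study's lowest-scoring category
stat, p = kruskal(general, diagnostic, pregnancy)
print(f"H = {stat:.2f}, p = {p:.3f}")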

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11051052
DOI: http://dx.doi.org/10.3390/jpm14040330

Publication Analysis

Top Keywords

  • pregnancy period (16)
  • adequacy score (12)
  • prolactinoma queries (8)
  • likert scale (8)
  • scale completely (8)
  • accuracy score (8)
  • score iqr (8)
  • lowest accuracy (8)
  • accuracy (6)
  • adequacy (6)
