Artificial intelligence in reproductive endocrinology: an in-depth longitudinal analysis of ChatGPTv4's month-by-month interpretation and adherence to clinical guidelines for diminished ovarian reserve.

AI Article Synopsis

  • The study aimed to evaluate the consistency and accuracy of ChatGPTv4's responses to clinical guidelines for Diminished Ovarian Reserve (DOR) over a two-month period using a structured questionnaire.
  • A variety of question types were used (open-ended, multiple-choice, true/false) and responses were rated for accuracy and completeness.
  • Results showed ChatGPTv4 performed exceptionally well, achieving near-perfect accuracy, particularly in true/false questions, with significant improvements in response quality over the evaluation period.

Article Abstract

Objective: To quantitatively assess the performance of ChatGPTv4, an Artificial Intelligence Language Model, in adhering to clinical guidelines for Diminished Ovarian Reserve (DOR) over two months, evaluating the model's consistency in providing guideline-based responses.

Design: A longitudinal study design was employed to evaluate ChatGPTv4's response accuracy and completeness using a structured questionnaire at baseline and at a two-month follow-up.

Setting: ChatGPTv4 was tasked with interpreting DOR questionnaires based on standardized clinical guidelines.

Participants: The study did not involve human participants; the questionnaire was exclusively administered to the ChatGPT model to generate responses about DOR.

Methods: A guideline-based questionnaire with 176 open-ended, 166 multiple-choice, and 153 true/false questions was deployed to rigorously assess ChatGPTv4's ability to provide accurate medical advice aligned with current DOR clinical guidelines. AI-generated responses were rated on a 6-point Likert scale for accuracy and a 3-point scale for completeness. The two-phase design assessed the stability and consistency of AI-generated answers over two months.

Results: ChatGPTv4 achieved near-perfect scores across all question types, with true/false questions consistently answered with 100% accuracy. In multiple-choice queries, accuracy improved from 98.2% to 100% at the two-month follow-up. Responses to open-ended questions improved significantly, with accuracy scores increasing from an average of 5.38 ± 0.71 to 5.74 ± 0.51 (max: 6.0) and completeness scores from 2.57 ± 0.52 to 2.85 ± 0.36 (max: 3.0). The improvements were statistically significant (p < 0.001), with positive correlations between initial and follow-up scores for accuracy (r = 0.597) and completeness (r = 0.381).

Limitations: The study was limited by the reliance on a controlled, albeit simulated, setting that may not perfectly mirror real-world clinical interactions.

Conclusion: ChatGPTv4 demonstrated exceptional and improving accuracy and completeness in handling DOR-related guideline queries over the studied period. These findings highlight ChatGPTv4's potential as a reliable, adaptable AI tool in reproductive endocrinology, capable of augmenting clinical decision-making and guideline development.

Source
http://dx.doi.org/10.1007/s12020-024-04031-8

Publication Analysis

Top Keywords

  • clinical guidelines: 12
  • artificial intelligence: 8
  • guidelines diminished: 8
  • diminished ovarian: 8
  • ovarian reserve: 8
  • true/false questions: 8
  • accuracy: 5
  • intelligence reproductive: 4
  • reproductive endocrinology: 4
  • endocrinology in-depth: 4
