Generative artificial intelligence versus clinicians: Who diagnoses multiple sclerosis faster and with greater accuracy?

Mult Scler Relat Disord

The University of Texas Southwestern Medical Center, Department of Neurology, Neuroinnovation Program, Multiple Sclerosis & Neuroimmunology Imaging Program, Dallas, TX, USA; The University of Texas Southwestern Medical Center, Peter O'Donnell Jr. Brain Institute, Dallas, TX, USA.

Published: October 2024

Background: People who receive a diagnosis of multiple sclerosis (MS) over the next ten years will predominantly belong to Generation Z (Gen Z). Recent observations within our clinic suggest that younger people with MS use online generative artificial intelligence (AI) platforms for personalized medical advice before their first visit with a specialist in neuroimmunology. The use of such platforms is anticipated to increase given Gen Z's technology-driven nature, desire for instant communication, and cost-consciousness. Our objective was to determine whether ChatGPT (Generative Pre-trained Transformer) could diagnose MS in individuals earlier than their clinical timeline, and to assess whether the accuracy differed based on age, sex, and race/ethnicity.

Methods: People with MS between 18 and 59 years of age were studied. The clinical timeline of each person diagnosed with MS was retrospectively identified and simulated using ChatGPT-3.5 (GPT-3.5). Chats were conducted using both the actual age, sex, and race/ethnicity of each person and derived variations of these characteristics to test diagnostic accuracy. A Kaplan-Meier survival curve was estimated for time to diagnosis, clustered by subject. Differences in time to diagnosis were tested using a generalized Wilcoxon test. Logistic regression with a subject-specific intercept, which captures intra-subject correlation, was used to test accuracy before and after the inclusion of MRI data.
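As an illustration of the Kaplan-Meier step named in the Methods, the following is a minimal sketch of the product-limit estimator and the median time it implies. It is not the authors' analysis code: the function names and the sample times are hypothetical, and the sketch omits the clustering by subject and the Wilcoxon comparison described above.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the 'not yet diagnosed' curve.

    times  -- time from first symptom to diagnosis or censoring (years)
    events -- 1 if a diagnosis was observed at that time, 0 if censored
    Returns a list of (time, survival) pairs at each distinct event time.
    """
    pairs = sorted(zip(times, events))
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        diagnosed = 0
        leaving = 0
        # Group all observations that share the same time point.
        while i < len(pairs) and pairs[i][0] == t:
            diagnosed += pairs[i][1]
            leaving += 1
            i += 1
        if diagnosed > 0:
            survival *= 1.0 - diagnosed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def median_time(curve):
    """First time at which the estimated curve drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


# Hypothetical times in years, invented purely for illustration.
example_times = [0.1, 0.3, 0.35, 0.5, 1.2]
example_events = [1, 1, 1, 1, 1]
example_curve = kaplan_meier(example_times, example_events)
example_median = median_time(example_curve)
```

Running the estimator once on clinic times and once on ChatGPT times would give the two medians being compared; a formal test of the difference would need a dedicated survival analysis library.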

Results: The study cohort included 100 unique people with MS. Of those, 50 were members of Gen Z (38 female; 22 White; mean age at first symptom 20.6 years (y) (standard deviation (SD)=2.2y)), and 50 were non-Gen Z (34 female; 27 White; mean age at first symptom 37.0y (SD=10.4y)). In addition, 529 digital simulations derived from the original cohort of 100 people were generated (333 female; 166 White; 136 Black/African American; 107 Asian; 120 Hispanic; mean age at first symptom 31.6y (SD=12.4y)), allowing 629 scripted conversations to be analyzed. The estimated median time to diagnosis was significantly longer in clinic, at 0.35y (95% CI=[0.28, 0.48]), than with ChatGPT, at 0.08y (95% CI=[0.04, 0.24]) (p<0.0001). Prior to the inclusion of MRI data, diagnostic accuracy did not differ by age or race/ethnicity; however, males were 47% less likely than females to receive a correct diagnosis (p=0.05). After MRI data were included in the GPT-3.5 chats, the odds of an accurate diagnosis were 4.0-fold greater for Gen Z participants than for non-Gen Z participants (p=0.01), while diagnostic accuracy was 68% lower in males than in females (p=0.009) and 75% lower for White subjects than for non-White subjects (p=0.0004).
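The reported counts and effect sizes can be sanity-checked with a few lines of arithmetic. The sketch below assumes the percentage reductions are odds-based, as is usual for logistic regression output (the abstract does not state this explicitly). The baseline probability of 0.8 is a hypothetical value chosen only to show the conversion, and the helper names are illustrative.

```python
def odds_ratio_from_percent_lower(percent_lower):
    """Convert 'X% lower odds' into an odds ratio (assumed interpretation)."""
    return 1.0 - percent_lower / 100.0


def apply_odds_ratio(baseline_probability, odds_ratio):
    """Probability in the comparison group, given a baseline probability."""
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Counts taken from the Results: the simulated people by race/ethnicity.
simulated_by_group = {
    "White": 166,
    "Black/African American": 136,
    "Asian": 107,
    "Hispanic": 120,
}
simulated_total = sum(simulated_by_group.values())   # 529 simulations
conversation_total = 100 + simulated_total            # 629 conversations

# Post-MRI effect for males relative to females, read as an odds ratio.
male_odds_ratio = odds_ratio_from_percent_lower(68)

# Hypothetical baseline: if females were diagnosed correctly 80% of the time,
# this is the probability the odds ratio would imply for males.
hypothetical_male_accuracy = apply_odds_ratio(0.8, male_odds_ratio)
```

The gap between an odds ratio and a difference in probabilities depends on the baseline accuracy, which is why a "68% lower" figure does not translate directly into 68 fewer correct diagnoses per 100.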

Conclusion: Generative AI platforms enable rapid access to information but are not principally designed for use in healthcare; even so, their use by Gen Z is anticipated to increase. However, the responses obtained may not be generalizable to all users, and bias may exist in select groups.

Source
http://dx.doi.org/10.1016/j.msard.2024.105791
