Performance of AI-powered chatbots in diagnosing acute pulmonary thromboembolism from given clinical vignettes.

Acute Med

M.D. Assoc Prof, Department of Emergency Medicine, Republic of Turkey, Ministry of Health, Sisli Hamidiye Etfal Training and Research Hospital, Istanbul, Turkey.

Published: August 2024

Background: Chatbots hold great potential as support tools in diagnosis and clinical decision-making. In this study, we aimed to evaluate the accuracy of chatbots in diagnosing pulmonary embolism (PE) and to assess their performance in determining PE severity.

Method: Sixty-five case reports meeting our inclusion criteria were selected for this study. Two emergency medicine (EM) physicians crafted clinical vignettes and presented them to Bard, Bing, and ChatGPT-3.5, asking each for its top 10 diagnoses. After all differential diagnosis lists had been obtained, the vignettes were enriched with supplemental data and resubmitted to the chatbots, which were asked to determine the severity of PE.

Results: ChatGPT-3.5, Bing, and Bard listed PE within their top 10 diagnoses with accuracy rates of 92.3%, 92.3%, and 87.6%, respectively. For the top 3 diagnoses, Bard achieved 75.4% accuracy, while ChatGPT-3.5 and Bing both achieved 67.7%. As the top diagnosis, Bard, ChatGPT-3.5, and Bing were accurate in 56.9%, 47.7%, and 30.8% of cases, respectively. For the top diagnosis, Bard differed significantly from both Bing (p<0.001) and ChatGPT-3.5 (p=0.007). Massive PEs were correctly identified with a success rate of over 85%. Overclassification rates for Bard, ChatGPT-3.5, and Bing were 38.5%, 23.3%, and 20%, respectively. Misclassification rates were highest in the submassive group.
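The top-10, top-3, and top-1 figures above are top-k accuracies: the share of vignettes in which PE appears among a chatbot's first k listed diagnoses. The following is a minimal sketch of that calculation, not the authors' analysis code. The function name `top_k_accuracy`, the exact-match rule on diagnosis strings, and the toy lists are all assumptions made for illustration; the abstract does not describe how matches were adjudicated.

```python
from typing import Sequence


def top_k_accuracy(ranked_lists: Sequence[Sequence[str]], target: str, k: int) -> float:
    """Fraction of vignettes whose first k listed diagnoses include the target.

    Matching is a case-insensitive exact string comparison (an assumption;
    a real evaluation would likely need clinician review of synonyms).
    """
    if not ranked_lists:
        raise ValueError("need at least one ranked list")
    wanted = target.strip().lower()
    hits = sum(
        any(dx.strip().lower() == wanted for dx in ranked[:k])
        for ranked in ranked_lists
    )
    return hits / len(ranked_lists)


# Toy data (hypothetical, not from the study): four vignettes, one chatbot.
toy_lists = [
    ["pulmonary embolism", "pneumonia", "acute coronary syndrome"],
    ["pneumonia", "pulmonary embolism", "heart failure"],
    ["acute coronary syndrome", "aortic dissection", "pneumothorax", "pulmonary embolism"],
    ["pneumonia", "heart failure", "pericarditis"],
]

for k in (1, 3, 10):
    print(f"top-{k}: {top_k_accuracy(toy_lists, 'pulmonary embolism', k):.1%}")
# prints:
# top-1: 25.0%
# top-3: 50.0%
# top-10: 75.0%
```

For scale, 60 hits among 65 vignettes would round to 92.3%; the abstract itself reports only percentages, so that count is an inference rather than a reported value.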

Conclusion: Although chatbots are not intended for diagnosis, their high diagnostic accuracy and their success rate in identifying massive PE underscore their promise as clinical decision support tools. However, further research with larger patient datasets is required to validate and refine their performance in real-world clinical settings.


