Objectives: Artificial intelligence companies have recently stepped up their efforts to improve chatbots, which are software programs that can converse with a human in natural language. The role of chatbots in health care is deemed worthy of research. OpenAI's ChatGPT is a chatbot based on supervised and reinforcement machine learning. The aim of this study was to determine the performance of ChatGPT in emergency medicine (EM) triage prediction.

Methods: This was a preliminary, cross-sectional study conducted with case scenarios generated by the researchers based on the emergency severity index (ESI) handbook v4 cases. Two independent EM specialists who were experts in the ESI triage scale determined the triage category for each case. A third independent EM specialist was consulted as arbiter, if necessary. The consensus result for each case scenario was taken as the reference triage category. Subsequently, each case scenario was submitted to ChatGPT and the answer was recorded as the index triage category. Inconsistent classifications between the ChatGPT and reference categories were defined as over-triage (false positive) or under-triage (false negative).
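The over-triage and under-triage definitions above can be sketched as a small comparison function. This is an illustrative sketch, not the authors' code: the function name `classify_triage` and the label strings are assumptions. It relies on the ESI convention that level 1 is the most urgent and level 5 the least urgent, so an index level numerically lower than the reference counts as over-triage.

```python
def classify_triage(reference_esi: int, index_esi: int) -> str:
    """Compare an index ESI level (e.g. from ChatGPT) with the reference level.

    ESI levels run from 1 (most urgent) to 5 (least urgent), so a lower
    index number than the reference means the case was over-triaged.
    The function name and return labels are hypothetical.
    """
    for level in (reference_esi, index_esi):
        if level not in range(1, 6):
            raise ValueError(f"ESI level must be between 1 and 5, got {level}")
    if index_esi < reference_esi:
        return "over-triage"   # treated as a false positive in the study
    if index_esi > reference_esi:
        return "under-triage"  # treated as a false negative in the study
    return "match"
```

For example, a case with reference ESI-3 that is assigned ESI-2 would be labelled over-triage under this sketch.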

Results: Fifty case scenarios were assessed in the study. Reliability analysis showed fair agreement between EM specialists and ChatGPT (Cohen's Kappa: 0.341). Eleven cases (22%) were over-triaged and 9 cases (18%) were under-triaged by ChatGPT. In 9 cases (18%), ChatGPT reported two consecutive triage categories, one of which matched the expert consensus. It had an overall sensitivity of 57.1% (95% confidence interval [CI]: 34-78.2), specificity of 34.5% (95% CI: 17.9-54.3), positive predictive value (PPV) of 38.7% (95% CI: 21.8-57.8), negative predictive value (NPV) of 52.6% (95% CI: 28.9-75.6), and an F1 score of 0.461. In high acuity cases (ESI-1 and ESI-2), ChatGPT showed a sensitivity of 76.2% (95% CI: 52.8-91.8), specificity of 93.1% (95% CI: 77.2-99.2), PPV of 88.9% (95% CI: 65.3-98.6), NPV of 84.4% (95% CI: 67.2-94.7), and an F1 score of 0.821. The receiver operating characteristic curve showed an area under the curve of 0.846 (95% CI: 0.724-0.969, P < 0.001) for high acuity cases.
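As a rough illustration of how the reported metrics relate to a 2x2 confusion matrix, the sketch below computes sensitivity, specificity, PPV, NPV, and F1 from cell counts. The abstract does not report the counts themselves; the values used here (16 true positives, 2 false positives, 5 false negatives, 27 true negatives) are an assumption, back-calculated so that they sum to 50 cases and reproduce the high acuity percentages. The function name `diagnostic_metrics` is likewise hypothetical.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard diagnostic accuracy metrics from 2x2 confusion matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),       # true positive rate
        "specificity": tn / (tn + fp),       # true negative rate
        "ppv": tp / (tp + fp),               # positive predictive value
        "npv": tn / (tn + fn),               # negative predictive value
        "f1": 2 * tp / (2 * tp + fp + fn),   # harmonic mean of PPV and sensitivity
    }


# ASSUMPTION: counts inferred from the reported high acuity percentages;
# they are not stated in the abstract.
high_acuity = diagnostic_metrics(tp=16, fp=2, fn=5, tn=27)

for name, value in high_acuity.items():
    print(f"{name}: {value:.3f}")
```

Under these assumed counts, the computed values round to the figures reported for high acuity cases (76.2%, 93.1%, 88.9%, 84.4%, and an F1 score of 0.821).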

Conclusion: The performance of ChatGPT was best when predicting high acuity cases (ESI-1 and ESI-2). It may be useful when determining the cases requiring critical care. When trained with more medical knowledge, ChatGPT may be more accurate for other triage category predictions.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10389099
DOI: http://dx.doi.org/10.4103/tjem.tjem_79_23

Publication Analysis

Top Keywords

triage category: 12
high acuity: 12
chatgpt: 11
95%: 9
triage: 8
natural language: 8
cross-sectional study: 8
performance chatgpt: 8
case scenarios: 8
triage categories: 8

Similar Publications

Background: COVID-19 caused a huge backlog of patients in glaucoma clinics. This study describes redesign of an entire glaucoma service with electronic patient triage to three levels and utilisation of the Scottish optometry infrastructure of upskilled optometrists.

Methods: 2276 patients in glaucoma clinics were identified and triaged to three levels in keeping with Glauc-strat-fast guidance with local amendments.

View Article and Find Full Text PDF

Incorporating molecular testing for human papillomavirus (HPV) into the screening of cervical specimens can improve risk stratification and, in turn, patient management. Infection with a high-risk (HR) HPV genotype is associated with greater risk for persistent infection, viral integration, and progression of cervical neoplasia. Current guidelines consider HPV 16 or HPV 18 clinically actionable with referral to colposcopy; however, 12 other HR HPV genotypes have been associated with cervical cancer risk, suggesting a benefit of extended genotyping.

View Article and Find Full Text PDF

Background: First Nations patients often experience poorer health outcomes than non-First Nations patients. Emergency triage focuses primarily on severity, which implies comparable outcomes for patients in the same triage group regardless of demographics. However, the precision of triage for First Nations Australians may be undermined by multiple factors, and research in this area is scarce.

Objective: To compare admission rates, service utilisation and mortality for First Nations and non-First Nations patients, based on their triage categories.

View Article and Find Full Text PDF

Background: The COVID-19 pandemic has led governments worldwide to make ethically controversial decisions. As a result, healthcare professionals are facing several ethical dilemmas, especially in terms of healthcare services provided to senior citizens. Thus, the aim of this review is to identify and categorize ethical dilemmas as well as propose solutions regarding health care services for elderly individuals.

View Article and Find Full Text PDF

Background And Importance: Access to healthcare remains a persistent challenge. Socially disadvantaged populations often encounter barriers to care and may frequently seek out emergency departments (EDs), including for nonurgent medical care.

Objective: To study the association between nonurgent presentations to pediatric EDs and patients' socioeconomic environment in an urban setting.

View Article and Find Full Text PDF
