AI Article Synopsis

  • Results showed high sensitivity for detecting fractures and pneumothorax (100% for both), moderate for pulmonary opacity (AI: 76%, resident: 71%), and lower sensitivity for pulmonary nodules (AI: 33%, resident: 75%).
  • The AI system also frequently labeled cases as "doubtful," while the resident showed more confidence; overall, the agreement between the AI and the resident was only fair, suggesting limitations of the AI in detecting some important findings.

Article Abstract

Background: The increasing integration of AI in chest X-ray evaluation holds promise for enhancing diagnostic accuracy and optimizing clinical workflows. However, understanding its performance in real-world clinical settings is essential.

Objectives: In this study, we evaluated the sensitivity (Se) and specificity (Sp) of an AI-based software package (Arterys MICA v29.4.0) alongside a radiology resident in interpreting chest X-rays referred from the emergency department (ED), using a senior radiologist's assessment as the gold standard (GS). We assessed the concordance between the AI system and the resident, noted the frequency of doubtful cases for each category, identified how many of those doubtful cases the GS considered positive, and assessed variables that the AI was not trained to detect.
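As a minimal sketch of how Se and Sp against a gold standard are usually computed, the Python below counts the four outcomes for one finding and derives sensitivity, specificity, and negative predictive value. The function names and the toy labels are hypothetical and are not data from the study. The sketch also assumes every reading has already been reduced to a binary call; the abstract does not say how "doubtful" labels were handled in these calculations.

```python
import math


def confusion_counts(reader, gold):
    """Count TP, FP, TN, FN for binary reader calls against the gold standard."""
    if len(reader) != len(gold):
        raise ValueError("reader and gold must have the same length")
    pairs = list(zip(reader, gold))
    tp = sum(1 for r, g in pairs if r and g)
    fp = sum(1 for r, g in pairs if r and not g)
    tn = sum(1 for r, g in pairs if not r and not g)
    fn = sum(1 for r, g in pairs if not r and g)
    return tp, fp, tn, fn


def ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def diagnostic_metrics(reader, gold):
    """Sensitivity, specificity, and NPV of a reader versus the gold standard."""
    tp, fp, tn, fn = confusion_counts(reader, gold)
    return {
        "sensitivity": ratio(tp, tp + fn),  # share of GS-positive cases that were flagged
        "specificity": ratio(tn, tn + fp),  # share of GS-negative cases that were cleared
        "npv": ratio(tn, tn + fn),          # share of negative calls that were truly negative
    }


if __name__ == "__main__":
    # Hypothetical toy labels for a single finding (True = finding present).
    ai_calls = [True, True, False, False, True, False]
    gs_calls = [True, False, False, False, True, True]
    print(diagnostic_metrics(ai_calls, gs_calls))
```

Running the same function once for the AI calls and once for the resident's calls, per finding, gives the kind of side-by-side comparison the abstract reports.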

Methods: We conducted a retrospective observational study analyzing chest X-rays from a sample of 784 patients referred from the ED at our hospital. The AI system was trained to detect five categorical variables (pulmonary nodule, pulmonary opacity, pleural effusion, pneumothorax, and fracture) and assign each a confidence label ("positive", "doubtful", or "negative").

Results: Sensitivity in detecting fractures and pneumothorax was high (100%) for both the AI and the resident, moderate for pulmonary opacity (AI = 76%, resident = 71%), and acceptable for pleural effusion (AI = 60%, resident = 67%), with negative predictive values (NPV) above 95% and areas under the curve (AUC) exceeding 0.8. The resident showed moderate sensitivity (75%) for pulmonary nodules, while the AI's sensitivity was low (33%). The AI assigned a "doubtful" label to some diagnoses, most of which were deemed negative by the GS; the resident expressed doubt less frequently. The kappa coefficient between the resident and the AI was fair (0.3) across most categories, except for pleural effusion, where concordance was moderate (0.5). Our study also highlighted additional findings not detected by the AI: mediastinal abnormalities (16% prevalence), surgical materials (20%), and other pulmonary findings (20%).
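The kappa coefficient reported here measures agreement between two readers beyond what chance alone would produce. The sketch below is one plain-Python way to compute Cohen's kappa from two lists of labels; the function name and the toy labels are hypothetical, and the abstract does not state whether "doubtful" was kept as its own category or which software was used for the calculation. The words "fair" and "moderate" match the commonly used Landis and Koch bands (0.21 to 0.40 and 0.41 to 0.60), though the abstract does not name the scale.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of cases")
    n = len(rater_a)

    # Observed agreement: fraction of cases where both raters gave the same label.
    p_observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n

    # Expected chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = set(freq_a) | set(freq_b)
    p_expected = sum(freq_a[label] * freq_b[label] for label in labels) / (n * n)

    if p_expected == 1.0:
        # Both raters used one identical label for every case.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Hypothetical toy labels for one finding, not data from the study.
    ai = ["positive", "positive", "negative", "negative"]
    resident = ["positive", "negative", "negative", "negative"]
    print(cohen_kappa(ai, resident))
```

Because kappa corrects for chance, two readers can agree on most cases of a rare finding and still obtain a low kappa when nearly all of that agreement falls on negative cases.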

Conclusions: Although AI demonstrated utility in identifying most primary findings (except for pulmonary nodules), its high NPV suggests it may be valuable for screening. Further training of the AI software and broadening its scope to identify additional findings could enhance its detection capabilities and increase its applicability in clinical practice.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11592727
DOI: http://dx.doi.org/10.3390/diagnostics14222592

Publication Analysis

Top Keywords (keyword: count)

pleural effusion: 12
referred emergency: 8
emergency department: 8
resident: 8
chest x-rays: 8
pulmonary opacity: 8
resident moderate: 8
additional findings: 8
pulmonary: 5
diagnostic performance: 4
