Background: The integration of artificial intelligence (AI) into medicine is growing, and some experts predict that its standalone use is near. However, skepticism remains because independent validations have so far produced few positive results. This study evaluates the effectiveness of AI software in analyzing chest X-rays (CXR) to identify lung nodules, a possible indicator of lung cancer.

Methods: This retrospective study analyzed 7,670,212 record pairs from radiological exams conducted between 2020 and 2022 during the Moscow Computer Vision Experiment, focusing on CXR and computed tomography (CT) scans. All images were acquired during clinical routine. The final dataset comprised 100 CXR images (50 with lung nodules, 50 without), selected consecutively according to inclusion and exclusion criteria. It was used to evaluate the performance of all five AI-based solutions that participated in the Moscow Computer Vision Experiment and analyzed CXR. The evaluation was performed in three stages. In the first stage, the probability of a lung nodule returned by each AI service was compared with the ground truth (1: nodule present; 0: no nodule). In the second stage, three radiologists evaluated the nodule segmentation performed by the AI services (1: nodule correctly segmented; 0: nodule incorrectly segmented or not segmented at all). In the third stage, the same radiologists additionally evaluated the classification of the nodules (1: nodule correctly segmented and classified; 0: all other cases). The results obtained in stages 2 and 3 were compared with the ground truth, which was common to all three stages. For each stage, diagnostic accuracy metrics were calculated for each AI service.
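The staged scoring described in the Methods can be sketched as follows. This is a minimal illustration, not the study's code: the function names, the 0.5 decision threshold, and the toy data are all assumptions made for the example. Stage 1 scores the AI probability against the ground truth; stage 2 scores the radiologists' binary rating of the segmentation against the same ground truth.

```python
from typing import Dict, Sequence


def confusion_metrics(truth: Sequence[int], predicted: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity and accuracy for binary labels (1 = nodule, 0 = no nodule).

    Assumes the data contain at least one positive and one negative case.
    """
    tp = sum(t == 1 and p == 1 for t, p in zip(truth, predicted))
    tn = sum(t == 0 and p == 0 for t, p in zip(truth, predicted))
    fp = sum(t == 0 and p == 1 for t, p in zip(truth, predicted))
    fn = sum(t == 1 and p == 0 for t, p in zip(truth, predicted))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(truth),
    }


def auc(truth: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve as the probability that a random positive
    case scores higher than a random negative case (ties count as 0.5)."""
    pos = [s for t, s in zip(truth, scores) if t == 1]
    neg = [s for t, s in zip(truth, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: six images, three with nodules and three without.
truth = [1, 1, 1, 0, 0, 0]
ai_probability = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1]   # stage 1 input
segmentation_ok = [1, 0, 0, 0, 0, 0]              # stage 2 radiologist rating

# Stage 1: threshold the AI probability (0.5 is an assumed cut-off).
stage1 = confusion_metrics(truth, [1 if p >= 0.5 else 0 for p in ai_probability])
# Stage 2: the radiologist rating takes the place of the AI decision.
stage2 = confusion_metrics(truth, segmentation_ok)

print("stage 1:", stage1, "AUC:", auc(truth, ai_probability))
print("stage 2:", stage2)
```

In this toy data every nodule-free image is rated 0 at stage 2, so specificity comes out at 100% while sensitivity drops. That is one way a stricter rating rule can lower sensitivity without producing false positives.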

Results: Three software solutions (Celsus, Lunit INSIGHT CXR, and qXR) demonstrated diagnostic metrics that matched or surpassed the vendor specifications, with the highest area under the receiver operating characteristic curve (AUC) reaching 0.956 [95% confidence interval (CI): 0.918 to 0.994]. However, when three radiologists evaluated the accuracy of nodule segmentation and classification, all solutions performed below the vendor-declared metrics, with the highest AUC reaching 0.812 (95% CI: 0.744 to 0.879). At the same time, all AI services demonstrated 100% specificity at stages 2 and 3 of the study.
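The abstract does not say how the confidence intervals for the AUC were obtained. As a rough illustration of how such an interval depends on sample size, the sketch below uses the Hanley-McNeil approximation for the standard error of an AUC. The choice of this method and the function name are assumptions, so the output need not reproduce the reported interval exactly.

```python
import math
from typing import Tuple


def hanley_mcneil_ci(auc: float, n_pos: int, n_neg: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Approximate (lower, upper, standard error) for an AUC.

    Uses the Hanley-McNeil formula; z = 1.96 corresponds to a 95% interval.
    """
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    variance = (
        auc * (1 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    se = math.sqrt(variance)
    # Clip to the valid AUC range [0, 1].
    return max(0.0, auc - z * se), min(1.0, auc + z * se), se


# AUC and group sizes taken from the abstract: 50 images with nodules, 50 without.
lower, upper, se = hanley_mcneil_ci(0.956, n_pos=50, n_neg=50)
print(f"approximate 95% CI: {lower:.3f} to {upper:.3f}")
```

With only 50 cases per group, the interval spans several hundredths of an AUC unit, which is worth keeping in mind when comparing services whose AUC values are close.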

Conclusions: To ensure the reliability and applicability of AI-based software, it is crucial to validate performance metrics on high-quality datasets and to engage radiologists in the evaluation process. Developers are advised to improve the accuracy of the underlying models before the software is allowed to be used standalone for lung nodule detection. The dataset created during the study may be accessed at https://mosmed.ai/datasets/mosmeddatargogksnalichiemiotsutstviemlegochnihuzlovtipvii/.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11320553
DOI: http://dx.doi.org/10.21037/qims-24-160

Similar Publications

The purpose of this study was to evaluate whether the optimal operating points of adult-oriented artificial intelligence (AI) software differ for pediatric chest radiographs and to assess its diagnostic performance. Chest radiographs from patients under 19 years old, collected between March and November 2021, were divided into test and exploring sets. A commercial adult-oriented AI software was utilized to detect lung lesions, including pneumothorax, consolidation, nodule, and pleural effusion, using a standard operating point of 15%.
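To make the idea of an operating point concrete, the sketch below applies a fixed threshold to AI scores and then scans candidate thresholds for the one that maximizes Youden's J (sensitivity + specificity - 1). The abstract does not state how "optimal" was defined, so Youden's J, the function names, and the toy scores are assumptions for illustration only.

```python
from typing import Sequence, Tuple


def sens_spec(truth: Sequence[int], scores: Sequence[float], threshold: float) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= threshold are called positive."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    tp = sum(t == 1 and p == 1 for t, p in zip(truth, predicted))
    tn = sum(t == 0 and p == 0 for t, p in zip(truth, predicted))
    n_pos = sum(t == 1 for t in truth)
    n_neg = sum(t == 0 for t in truth)
    return tp / n_pos, tn / n_neg


def best_operating_point(truth: Sequence[int], scores: Sequence[float],
                         candidates: Sequence[float]) -> Tuple[float, float]:
    """Return (threshold, J) for the candidate with the largest Youden's J.
    On ties, the first candidate in the list is kept."""
    best_threshold, best_j = candidates[0], -1.0
    for threshold in candidates:
        sensitivity, specificity = sens_spec(truth, scores, threshold)
        j = sensitivity + specificity - 1
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


# Hypothetical scores: three images with a lesion, three without.
truth = [1, 1, 1, 0, 0, 0]
scores = [0.40, 0.20, 0.12, 0.08, 0.05, 0.02]

standard = sens_spec(truth, scores, 0.15)  # the 15% operating point from the text
optimal = best_operating_point(truth, scores, [0.05, 0.10, 0.15, 0.20])
print("at 15%:", standard)
print("best candidate:", optimal)
```

In this toy data the 15% threshold misses the lesion scored 0.12, while a 10% threshold catches it without adding false positives. This is the kind of shift such a study would look for.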

Early Lung Adenocarcinoma Manifesting as Irregular Subsolid Nodules: Clinical and CT Characteristics.

Acad Radiol

December 2024

Department of Radiology, The First Affiliated Hospital of Chongqing Medical University, No. 1 Youyi Road, Yuzhong District, Chongqing 400016, China (P.-l.Z., T.-y.L., F.-j.L., Q.L.).

Rationale And Objectives: To explore the clinical and computed tomography (CT) characteristics of early-stage lung adenocarcinoma (LADC) that presents with an irregular shape.

Materials And Methods: The CT data of 575 patients with stage IA LADC and 295 with persistent inflammatory lesion (PIL) manifesting as subsolid nodules (SSNs) were analyzed retrospectively. Among these patients, we selected 233 patients with LADC and 140 patients with PIL, who showed irregular SSNs, hereinafter referred to as irregular LADC (I-LADC) and irregular PIL (I-PIL), respectively.

Interobserver Variability in Manual Versus Semi-Automatic CT Assessments of Small Lung Nodule Diameter and Volume.

Tomography

December 2024

Centre for Research and Development, Uppsala University, Region Gävleborg, SE 801 88 Gävle, Sweden.

Background: This study aimed to assess the interobserver variability of semi-automatic diameter and volumetric measurements versus manual diameter measurements for small lung nodules identified on computed tomography scans.

Methods: The radiological patient database was searched for CT thorax examinations with at least one noncalcified solid nodule (∼3-10 mm). Three radiologists with four to six years of experience evaluated each nodule in accordance with the Fleischner Society guidelines using standard diameter measurements, semi-automatic lesion diameter measurements, and volumetric assessments.

Objective: Image-guided diagnosis and treatment of lung lesions is an active area of research. With the growing number of solutions proposed, there is also a growing need to establish a standard for the evaluation of these solutions. Thus, realistic phantom and preclinical environments must be established.

Background: Determining the benign or malignant status of indeterminate pulmonary nodules (IPN) with intermediate malignancy risk is a significant clinical challenge. Oral microbiota-lung cancer interactions have qualified oral microbiota as a promising non-invasive predictive biomarker in IPN.

Materials And Methods: Saliva, throat swabs, and tongue coating samples were prospectively collected from 1040 IPN patients and 70 healthy controls across three hospitals.
