Performance Comparison of Individual and Ensemble CNN Models for the Classification of Brain 18F-FDG-PET Scans.

J Digit Imaging

Division of Nuclear Medicine and Molecular Imaging, Department of Radiology, Stanford University, 300 Pasteur Drive, Office H2228, Stanford, CA, 94305, USA.

Published: April 2020

Unlabelled: The high-background glucose metabolism of normal gray matter on [18F]-fluoro-2-D-deoxyglucose (FDG) positron emission tomography (PET) of the brain results in a low signal-to-background ratio, potentially increasing the possibility of missing important findings in patients with intracranial malignancies. To explore the strategy of using a deep learning classifier to aid in distinguishing normal versus abnormal findings on PET brain images, this study evaluated the performance of a two-dimensional convolutional neural network (2D-CNN) to classify FDG PET brain scans as normal (N) or abnormal (A).

Methods: Two hundred eighty-nine brain FDG-PET scans (N; n = 150, A; n = 139) resulting in a total of 68,260 images were included. Nine individual 2D-CNN models with three different window settings for axial, coronal, and sagittal axes were trained and validated. The performance of these individual and ensemble models was evaluated and compared using a test dataset. Odds ratio, Akaike's information criterion (AIC), and area under curve (AUC) on receiver-operative-characteristic curve, accuracy, and standard deviation (SD) were calculated.

Results: An optimal window setting to classify normal and abnormal scans was different for each axis of the individual models. An ensembled model using different axes with an optimized window setting (window-triad) showed better performance than ensembled models using the same axis and different windows settings (axis-triad). Increase in odds ratio and decrease in SD were observed in both axis-triad and window-triad models compared with individual models, whereas improvements of AUC and AIC were seen in window-triad models. An overall model averaging the probabilities of all individual models showed the best accuracy of 82.0%.

Conclusions: Data ensemble using different window settings and axes was effective to improve 2D-CNN performance parameters for the classification of brain FDG-PET scans. If prospectively validated with a larger cohort of patients, similar models could provide decision support in a clinical setting.

Download full-text PDF

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7165199PMC
http://dx.doi.org/10.1007/s10278-019-00289-xDOI Listing

Publication Analysis

Top Keywords

pet brain
12
individual models
12
models
10
individual ensemble
8
classification brain
8
normal abnormal
8
brain fdg-pet
8
fdg-pet scans
8
window settings
8
odds ratio
8

Similar Publications

Want AI Summaries of new PubMed Abstracts delivered to your In-box?

Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!