Background/objectives: We assessed the influence of local patients and clinical characteristics on the performance of commercial deep learning (DL) segmentation models for head-and-neck (HN), breast, and prostate cancers.
Methods: Clinical computed tomography (CT) scans and clinically approved contours of 210 patients (53 HN, 49 left breast, 55 right breast, and 53 prostate cancer) were used to train and validate segmentation models integrated within a vendor-supplied DL training toolkit and to assess the performance of both vendor-pretrained and custom-trained models. Four custom models (HN, left breast, right breast, and prostate) were trained and validated with 30 (training)/5 (validation) HN, 34/5 left breast, 39/5 right breast, and 30/5 prostate patients to auto-segment a total of 24 organs at risk (OARs). Subsequently, both vendor-pretrained and custom-trained models were tested on the remaining patients from each group. Auto-segmented contours were evaluated by comparing them with clinically approved contours via the Dice similarity coefficient (DSC) and mean surface distance (MSD). The performance of the left and right breast models was assessed jointly according to ipsilateral/contralateral locations.
Results: The average DSCs for all structures in vendor-pretrained and custom-trained models were as follows: 0.81 ± 0.12 and 0.86 ± 0.11 in HN; 0.67 ± 0.16 and 0.80 ± 0.11 in the breast; and 0.87 ± 0.09 and 0.92 ± 0.06 in the prostate. The corresponding average MSDs were 0.81 ± 0.76 mm and 0.76 ± 0.56 mm (HN), 4.85 ± 2.44 mm and 2.42 ± 1.49 mm (breast), and 2.17 ± 1.39 mm and 1.21 ± 1.00 mm (prostate). Notably, custom-trained models showed significant improvements over vendor-pretrained models for 14 of 24 OARs, reflecting the influence of data/contouring variations in segmentation performance.
Conclusions: These findings underscore the substantial impact of institutional preferences and clinical practices on the implementation of vendor-pretrained models. We also found that a relatively small amount of institutional data was sufficient to train customized segmentation models with sufficient accuracy.
Download full-text PDF |
Source |
---|---|
http://dx.doi.org/10.3390/diagnostics14242851 | DOI Listing |
Diagnostics (Basel)
December 2024
The University of Texas MD Anderson Cancer Center UTHealth Houston Graduate School of Biomedical Sciences, Houston, TX 77030, USA.
Background/objectives: We assessed the influence of local patients and clinical characteristics on the performance of commercial deep learning (DL) segmentation models for head-and-neck (HN), breast, and prostate cancers.
Methods: Clinical computed tomography (CT) scans and clinically approved contours of 210 patients (53 HN, 49 left breast, 55 right breast, and 53 prostate cancer) were used to train and validate segmentation models integrated within a vendor-supplied DL training toolkit and to assess the performance of both vendor-pretrained and custom-trained models. Four custom models (HN, left breast, right breast, and prostate) were trained and validated with 30 (training)/5 (validation) HN, 34/5 left breast, 39/5 right breast, and 30/5 prostate patients to auto-segment a total of 24 organs at risk (OARs).
Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!