Purpose: To develop a radiomics prediction model to improve pulmonary nodule (PN) classification in low-dose CT. To compare the model with the American College of Radiology (ACR) Lung CT Screening Reporting and Data System (Lung-RADS) for early detection of lung cancer.

Methods: We examined a set of 72 PNs (31 benign and 41 malignant) from the Lung Image Database Consortium image collection (LIDC-IDRI). One hundred three CT radiomic features were extracted from each PN. Before the model building process, distinctive features were identified using a hierarchical clustering method. We then constructed a prediction model by using a support vector machine (SVM) classifier coupled with a least absolute shrinkage and selection operator (LASSO). A tenfold cross-validation (CV) was repeated ten times (10 × 10-fold CV) to evaluate the accuracy of the SVM-LASSO model. Finally, the best model from the 10 × 10-fold CV was further evaluated using 20 × 5- and 50 × 2-fold CVs.
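The three evaluation schemes above (10 × 10-, 20 × 5-, and 50 × 2-fold CV) each produce 100 train/test splits, but the held-out share grows from about a tenth to half of the 72 nodules. The following is a minimal sketch of such a splitter in plain Python. Stratifying by class is an assumption (the abstract does not say whether folds were stratified), the function name `repeated_stratified_kfold` is hypothetical, and the SVM-LASSO classifier itself is not reproduced here.

```python
import random


def repeated_stratified_kfold(labels, n_splits, n_repeats, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated stratified k-fold CV.

    Hypothetical helper for illustration; stratification is an assumption.
    """
    rng = random.Random(seed)
    # Group sample indices by class label.
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)

    for _ in range(n_repeats):
        folds = [[] for _ in range(n_splits)]
        offset = 0
        for idxs in by_class.values():
            shuffled = idxs[:]
            rng.shuffle(shuffled)
            # Deal indices round-robin so every fold keeps the class ratio;
            # the offset continues the deal where the previous class stopped.
            for j, idx in enumerate(shuffled):
                folds[(j + offset) % n_splits].append(idx)
            offset += len(shuffled)
        for k in range(n_splits):
            test = sorted(folds[k])
            train = sorted(i for f in folds[:k] + folds[k + 1:] for i in f)
            yield train, test


# 31 benign (0) and 41 malignant (1) nodules, as in the Methods.
labels = [0] * 31 + [1] * 41

# Scheme name -> (number of folds, number of repeats).
schemes = {"10 x 10-fold": (10, 10), "20 x 5-fold": (5, 20), "50 x 2-fold": (2, 50)}

summary = {}
for name, (n_splits, n_repeats) in schemes.items():
    splits = list(repeated_stratified_kfold(labels, n_splits, n_repeats))
    test_sizes = sorted({len(test) for _, test in splits})
    summary[name] = (len(splits), test_sizes)
    print(f"{name}: {len(splits)} splits, held-out sizes {test_sizes}")
```

In a full evaluation, a classifier would be fit on each `train` index set and scored on the matching `test` set, with accuracy averaged over the 100 splits.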

Results: The best SVM-LASSO model consisted of only two features: the bounding box anterior-posterior dimension (BB_AP) and the standard deviation of inverse difference moment (SD_IDM). The BB_AP measured the extension of a PN in the anterior-posterior direction and was highly correlated (r = 0.94) with the PN size. The SD_IDM was a texture feature that measured the directional variation of the local homogeneity feature IDM. Univariate analysis showed that both features were statistically significant and discriminative (P = 0.00013 and 0.000038, respectively). PNs with larger BB_AP or smaller SD_IDM were more likely to be malignant. The 10 × 10-fold CV of the best SVM model using the two features achieved an accuracy of 84.6% and an AUC of 0.89. By comparison, Lung-RADS achieved an accuracy of 72.2% and an AUC of 0.77 using four features (size, type, calcification, and spiculation). The improvement in prediction of SVM-LASSO over Lung-RADS was statistically significant (McNemar's test P = 0.026). Lung-RADS misclassified 19 cases because it was mainly based on PN size, whereas the SVM-LASSO model correctly classified 10 of these cases by combining a size (BB_AP) feature and a texture (SD_IDM) feature. The performance of the SVM-LASSO model was stable when leaving more patients out with five- and twofold CVs (accuracy 84.1% and 81.6%, respectively).
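McNemar's test, used above to compare the two classifiers, looks only at discordant cases: nodules that one method classified correctly and the other did not. The sketch below shows the exact (binomial) form in plain Python. Whether the authors used the exact or the chi-square form is not stated, so the exact form is an assumption. The count of 10 comes from the text; the reverse count is not reported in the abstract, so the value 2 is a placeholder, and the resulting p-value is not expected to reproduce the reported P = 0.026.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the two discordant counts.

    b: cases classifier A got wrong and classifier B got right
    c: cases classifier A got right and classifier B got wrong
    Under the null hypothesis each discordant case falls on either side
    with probability 0.5, so the smaller count follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant cases: no evidence of a difference
    k = min(b, c)
    # One-sided tail probability P(X <= k), doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# 10: Lung-RADS errors that the SVM-LASSO model classified correctly (from the text).
# 2: PLACEHOLDER for the reverse count, which the abstract does not report.
p_value = mcnemar_exact(10, 2)
print(f"exact McNemar p-value with placeholder counts: {p_value:.4f}")
```

Concordant cases (both methods right, or both wrong) do not enter the statistic, which is why only the two discordant counts are needed.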

Conclusion: We developed an SVM-LASSO model to predict malignancy of PNs with two CT radiomic features. We demonstrated that the model achieved an accuracy of 84.6%, which was 12.4 percentage points higher than that of Lung-RADS (72.2%).


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5903960
DOI: http://dx.doi.org/10.1002/mp.12820


Similar Publications

Background: The role of mast cells has not been thoroughly explored in the context of prostate cancer (PCA), which has an unpredictable prognosis and mixed immunotherapy outcomes. Our research employs a comprehensive computational methodology to evaluate mast cell marker gene signatures (MCMGS) derived from a global cohort of 1091 PCA patients. This approach is designed to identify a robust biomarker to assist in prognosis and in predicting responses to immunotherapy.


Background: Pulmonary arterial hypertension (PAH) represents a substantial global risk to human health. This study aims to identify diagnostic biomarkers for PAH and to assess their association with the immune microenvironment using bioinformatics techniques.

Methods: Based on two microarray datasets, differentially expressed genes (DEGs) were detected, and hub genes underwent a sequence of machine learning analyses.

Article Synopsis
  • Sertraline is a widely used antidepressant, and a personalized model to predict its concentration can help optimize treatment and minimize side effects.
  • The study involved 415 patients to develop a machine learning model using various algorithms, ultimately selecting XGBoost for its superior performance.
  • Key predictors of sertraline concentration included dosage and liver enzyme levels, with the model achieving 62.5% accuracy in predicting therapeutic concentration, offering valuable guidance for clinicians in treatment planning.

Objective: To investigate the effect of COVID-19 infection on pancreatic cancer.

Methods: Based on the mRNA-Seq data of COVID-19 patients and pancreatic cancer (PC) patients in the GEO database, we used a support vector machine (SVM), LASSO-Cox regression analysis, and random forest (RF) to screen for signature genes common to the two diseases, and then investigated the effects and functional characteristics of these genes in PC. All analyses were performed in R.


Biomarkers associated with hepatocellular carcinoma (HCC) are of great importance for better understanding biological response mechanisms to internal or external intervention. The study aimed to identify key candidate genes for HCC using machine learning (ML) and statistics-based bioinformatics models. Differentially expressed genes (DEGs) were identified in four datasets using limma, and the genes common across these DEG sets were then selected.

