A New Classification of Benign, Premalignant, and Malignant Endometrial Tissues Using Machine Learning Applied to 1413 Candidate Variables.

Int J Gynecol Pathol

Department of Pathology, Brigham and Women's Hospital (M.J.D., D.J.P., G.L.M.); Department of Pathology, Harvard Medical School (D.J.P., G.L.M.); Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute (S.T.); Department of Biostatistics, Harvard T. H. Chan School of Public Health (S.T.), Boston, Massachusetts.

Published: July 2020

Benign normal (NL), premalignant (endometrial intraepithelial neoplasia, EIN), and malignant (cancer, EMCA) endometria must be precisely distinguished for optimal management. EIN was previously defined objectively by a regression model that incorporated manually traced histologic variables to predict clonal growth and cancer outcomes. Results from this early computational study were used to revise the subjective endometrial precancer diagnostic criteria currently in use. Here we use automated feature segmentation and updated machine learning algorithms to develop a new classification algorithm. Endometrial tissue from 148 patients was randomly separated into 72-patient training and 76-patient validation cohorts encompassing all 3 diagnostic classes. We applied image analysis software to keratin-stained endometrial tissues to automatically segment whole-slide digital images into epithelium, cells, and nuclei and to extract the corresponding variables. A total of 1413 variables were culled to 75 based on random forest classification performance in a 3-group (NL, EIN, EMCA) model. This algorithm classifies cases with 3-class error rates of 0.04 (training set) and 0.058 (validation set), and 2-class (NL vs. EIN+EMCA) error rates of 0.016 (training set) and 0 (validation set). The 4 most heavily weighted variables are surrogates of those previously identified in manual-segmentation machine learning studies (stromal and epithelial area percentages, and normalized epithelial surface lengths). Lesser weighted predictors include gland and lumen axis lengths and ratios, and individual cell measures. Automated image analysis and random forest classification algorithms can classify normal, premalignant, and malignant endometrial tissues. The most predictive variables overlap with those discovered independently in early models based on manual segmentation.
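To illustrate how a 3-class and a 2-class (NL vs. EIN+EMCA) error rate can differ for the same set of cases, the following is a minimal sketch in Python. It assumes the 2-class rate is obtained by collapsing EIN and EMCA into a single group after prediction; the abstract does not state whether this was done or whether a separate 2-class model was fit. The function names and the ten example labels are hypothetical and are not study data.

```python
from typing import Sequence

# Class labels as abbreviated in the abstract.
NL, EIN, EMCA = "NL", "EIN", "EMCA"


def error_rate(truth: Sequence[str], predicted: Sequence[str]) -> float:
    """Fraction of cases whose predicted label differs from the true label."""
    if len(truth) != len(predicted) or len(truth) == 0:
        raise ValueError("truth and predicted must be non-empty and of equal length")
    wrong = sum(t != p for t, p in zip(truth, predicted))
    return wrong / len(truth)


def collapse(label: str) -> str:
    """Map the 3 classes onto 2 groups: NL vs. EIN+EMCA (assumed grouping step)."""
    return NL if label == NL else "EIN+EMCA"


def two_class_error_rate(truth: Sequence[str], predicted: Sequence[str]) -> float:
    """Error rate after collapsing both label lists to NL vs. EIN+EMCA."""
    return error_rate([collapse(t) for t in truth],
                      [collapse(p) for p in predicted])


# Hypothetical labels for 10 cases (illustration only, not study data).
truth     = [NL, NL, NL, EIN, EIN, EIN,  EIN, EMCA, EMCA, EMCA]
predicted = [NL, NL, NL, EIN, EIN, EMCA, EIN, EMCA, EIN,  EMCA]

three_class = error_rate(truth, predicted)
two_class = two_class_error_rate(truth, predicted)
```

In this toy example the only mistakes are confusions between EIN and EMCA, so they count toward the 3-class error rate but vanish once the two abnormal classes are grouped. This is one way a 2-class error rate can be lower than the corresponding 3-class rate.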

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6884662
DOI: http://dx.doi.org/10.1097/PGP.0000000000000615
