An interpretable machine learning prognostic system for risk stratification in oropharyngeal cancer.

Int J Med Inform

Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland; Department of Otorhinolaryngology - Head and Neck Surgery, University of Helsinki and Helsinki University Hospital, Helsinki, Finland; Division of Ear, Nose and Throat Diseases, Department of Clinical Sciences, Intervention and Technology, Karolinska Institute and Karolinska University Hospital, Stockholm, Sweden.

Published: December 2022

Background: The optimal management of oropharyngeal squamous cell carcinoma (OPSCC) includes surgical options, non-surgical options based on (chemo)radiotherapy, and combinations of the two. Each approach carries a risk of specific treatment-related side effects. HPV-positive OPSCC has been reported to be more sensitive to (chemo)radiotherapy-based treatment modalities.

Objectives: This study aims to demonstrate how machine learning can aid in classifying OPSCC patients into risk groups (low-chance or high-chance) for overall survival. We examined the input variables using permutation feature importance. Furthermore, we provided explanations and interpretations using the Local Interpretable Model-agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP) frameworks.
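
Permutation feature importance, as named in the objectives, can be illustrated with a minimal Python sketch. This is not the study's code: the toy data, the `toy_model` stand-in, and the helper names `accuracy` and `permutation_importance` are hypothetical, and accuracy is assumed as the scoring metric.

```python
import random

def accuracy(model, X, y):
    """Fraction of rows where the model's prediction matches the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when each column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle one column to break its link with the label.
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            X_perm = [row[:col] + [v] + row[col + 1:]
                      for row, v in zip(X, shuffled)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical toy data: column 0 determines the label, column 1 is noise.
data_rng = random.Random(42)
X = [[data_rng.randint(0, 1), data_rng.random()] for _ in range(200)]
y = [row[0] for row in X]

# Stand-in for a fitted classifier (assumption): it reads only column 0.
toy_model = lambda row: row[0]

importances = permutation_importance(toy_model, X, y)
```

A feature the model relies on loses accuracy when shuffled, so its importance is positive; a feature the model ignores scores zero.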

Methods: The machine learning model for 3164 OPSCC patients was built using data obtained from the Surveillance, Epidemiology, and End Results (SEER) program database. Five tree-based machine learning algorithms (voting ensemble, LightGBM, XGBoost, Random Forest, and Extremely Randomized Trees) were used to divide the patients into risk groups. The model with the best predictive performance was then temporally validated on a different cohort.
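
The abstract does not state how the voting ensemble combines its member models. One common scheme is soft voting, which averages the class probabilities predicted by each member; the Python sketch below assumes that scheme. The function name `soft_vote` and the probabilities are hypothetical and not taken from the study.

```python
def soft_vote(prob_lists, weights=None):
    """Weighted average of per-model class-probability vectors for one patient."""
    weights = weights or [1.0] * len(prob_lists)
    total = sum(weights)
    n_classes = len(prob_lists[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, prob_lists)) / total
        for c in range(n_classes)
    ]

# Hypothetical outputs of three member models for a single patient,
# ordered as [P(class 0), P(class 1)].
member_probs = [
    [0.30, 0.70],
    [0.45, 0.55],
    [0.60, 0.40],
]

avg = soft_vote(member_probs)
# The ensemble predicts the class with the highest averaged probability.
predicted = max(range(len(avg)), key=lambda c: avg[c])
```

Passing `weights` lets stronger members count for more; with equal weights this reduces to a plain mean.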

Results: When temporally validated, the voting ensemble machine learning algorithm showed an accuracy of 88.3%, a Matthews correlation coefficient of 0.72, and a weighted area under the curve of 0.93. Human papillomavirus (HPV) status, age of the patients, T stage, marital status, N stage, and the treatment modality (surgery with postoperative radiotherapy) were found to have the most significant effects on the ability of the machine learning model to predict overall survival. Similarly, for individual patients, the SHAP framework identified HPV status, gender, and treatment modality (surgery with postoperative radiotherapy) as the input features that improved the model's prediction.
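
Accuracy and the Matthews correlation coefficient (MCC) reported above are both computed from the confusion matrix of a binary classifier. The Python sketch below shows the calculation on hypothetical toy labels, not on the study data; the helper names `confusion_counts` and `matthews_corrcoef` are illustrative.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0 and 1."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn

def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (tp*tn - fp*fn) / sqrt((tp+fp)(tp+fn)(tn+fp)(tn+fn))."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is taken as 0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical toy labels (1 = one risk group, 0 = the other).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
acc = (tp + tn) / len(y_true)
mcc_value = matthews_corrcoef(tp, tn, fp, fn)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the two risk groups are of unequal size.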

Conclusion: The proposed stratification of OPSCC patients into risk groups by machine learning techniques can provide accurate predictions and thus aid clinicians in administering early and personalized interventions. Clinicians could use the predicted risk, together with the explanations offered by the SHAP and LIME frameworks, to understand previously undetected relationships between prognostic variables and so make informed clinical decisions and deliver effective interventions.

Source
http://dx.doi.org/10.1016/j.ijmedinf.2022.104896

Publication Analysis

Top Keywords

machine learning (28)
opscc patients (12)
patients risk (12)
risk groups (12)
learning model (8)
voting ensemble (8)
temporally validated (8)
hpv status (8)
treatment modality (8)
modality surgery (8)

Similar Publications

Inherited genetics represents an important contributor to risk of esophageal adenocarcinoma (EAC), and its precursor Barrett's esophagus (BE). Genome-wide association studies have identified ∼30 susceptibility variants for BE/EAC, yet genetic interactions remain unexamined. To address challenges in large-scale G×G scans, we combined knowledge-guided filtering and machine learning approaches, focusing on genes with (A) known/plausible links to BE/EAC pathogenesis (n=493) or (B) prior evidence of biological interactions (n=4,196).

Cognitive load stimulates neural activity, essential for understanding the brain's response to stress-inducing stimuli or mental strain. This study examines the feasibility of evaluating cognitive load by extracting, selecting, and classifying features from electroencephalogram (EEG) signals. We employed robust local mean decomposition (R-LMD) to decompose EEG data from each channel, recorded over a four-second period, into five modes.

Machine learning assisted classification RASAR modeling for the nephrotoxicity potential of a curated set of orally active drugs.

Sci Rep

January 2025

Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700 032, India.

We have adopted the classification Read-Across Structure-Activity Relationship (c-RASAR) approach in the present study for machine-learning (ML)-based model development from a recently reported curated dataset of nephrotoxicity potential of orally active drugs. We initially developed ML models using nine different algorithms separately on topological descriptors (referred to as simply "descriptors" in the subsequent sections of the manuscript) and MACCS fingerprints (referred to as "fingerprints" in the subsequent sections of the manuscript), thus generating 18 different ML QSAR models. Using the chemical spaces defined by the modeling descriptors and fingerprints, the similarity and error-based RASAR descriptors were computed, and the most discriminating RASAR descriptors were used to develop another set of 18 different ML c-RASAR models.

Machine learning techniques for non-destructive estimation of plum fruit weight.

Sci Rep

January 2025

Crop and Horticultural Science Research Department, Mazandaran Agricultural Resources Research and Education Center, Agricultural Research, Education and Extension Organization (AREEO), Tajrish, Iran.

Plum fruit fresh weight (FW) estimation is crucial for various agricultural practices, including yield prediction, quality control, and market pricing. Traditional methods for estimating fruit weight are often destructive, time-consuming, and labor-intensive. In this study, we addressed the problem of predicting plum FW using artificial intelligence (AI) methods based on fruit dimensions.

Cancer-associated fibroblasts (CAFs) significantly influence tumor progression and therapeutic resistance in colorectal cancer (CRC). However, the distributions and functions of CAF subpopulations vary across the four consensus molecular subtypes (CMSs) of CRC. This study performed single-cell RNA and bulk RNA sequencing and revealed that myofibroblast-like CAFs (myCAFs), tumor-like CAFs (tCAFs), inflammatory CAFs (iCAFs), CXCL14CAFs, and MTCAFs are notably enriched in CMS4 compared with other CMSs of CRC.
