Motivation: In prognosis and survival studies, an important goal is to identify multi-biomarker panels with predictive power from molecular characteristics or clinical observations. Such analysis is often challenged by genomic profiles or clinical data that are censored, high-dimensional, and available for only a small number of samples. Sophisticated models and algorithms are therefore urgently needed.

Results: In this study, we propose a novel Area Under Curve (AUC) optimization method for multi-biomarker panel identification, named Nearest Centroid Classifier for AUC optimization (NCC-AUC). Our method is motivated by the connection between the AUC score used to evaluate classification accuracy and Harrell's concordance index in survival analysis. This connection allows us to convert the survival time regression problem into a binary classification problem. An optimization model is then formulated to directly maximize AUC while minimizing the number of selected features, yielding a predictor in the nearest centroid classifier framework. We validate NCC-AUC on genomic data of breast cancer and on clinical data of stage IB Non-Small-Cell Lung Cancer (NSCLC). On the genomic data, NCC-AUC outperforms Support Vector Machine (SVM) and Support Vector Machine-based Recursive Feature Elimination (SVM-RFE) in classification accuracy, and it tends to select a multi-biomarker panel with low average redundancy and enriched biological meaning. NCC-AUC also separates low- and high-risk cohorts more significantly than the widely used Cox proportional-hazards regression model (Cox model) and the L1-penalized Cox model (L1-Cox model). These performance gains are robust across 5 subtypes of breast cancer. Furthermore, on an independent clinical dataset, NCC-AUC outperforms SVM and SVM-RFE in predictive accuracy and is consistently better than the Cox model and the L1-Cox model at grouping patients into high- and low-risk categories.
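The link between Harrell's concordance index and AUC mentioned above can be illustrated with a small sketch. This is not the authors' implementation: the function names `harrell_c_index` and `auc` and the toy data are hypothetical, and tied survival times are simply skipped for brevity. The sketch assumes that a higher score means higher risk. When every patient has an observed event and survival time takes only two values (short versus long), the comparable pairs are exactly the (high-risk, low-risk) pairs, so the concordance index coincides with the AUC of the same scores.

```python
from itertools import combinations


def harrell_c_index(times, events, scores):
    """Simplified Harrell's C: share of comparable pairs ranked concordantly.

    A pair is comparable when the patient with the shorter time had an
    observed event (events[k] is truthy). Pairs with tied times are skipped
    in this sketch. Higher score is assumed to mean higher risk.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: ignore tied survival times
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the order is unknown
        comparable += 1
        if scores[a] > scores[b]:
            concordant += 1.0
        elif scores[a] == scores[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def auc(pos_scores, neg_scores):
    """AUC in its pairwise (Mann-Whitney) form; ties count one half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Toy data (hypothetical): three short-survival and three long-survival patients.
high_risk_scores = [0.9, 0.8, 0.4]
low_risk_scores = [0.7, 0.3, 0.2]

# Same patients viewed as survival data: two distinct times, no censoring.
times = [1, 1, 1, 2, 2, 2]
events = [1, 1, 1, 1, 1, 1]
scores = high_risk_scores + low_risk_scores

c_binary = harrell_c_index(times, events, scores)
auc_binary = auc(high_risk_scores, low_risk_scores)
print(c_binary, auc_binary)

# With censoring, pairs whose shorter time is censored drop out of the count.
c_censored = harrell_c_index([2, 4, 6, 8], [1, 0, 1, 1], [0.9, 0.5, 0.1, 0.6])
print(c_censored)
```

In the binary case the two printed values agree, which is the sense in which a ranking criterion for survival times can be treated as a classification criterion. How NCC-AUC itself builds the classification problem and selects features is described in the paper, not in this sketch.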

Conclusion: In summary, NCC-AUC provides a rigorous optimization framework to systematically reveal multi-biomarker panels from genomic and clinical data. It can serve as a useful tool for identifying prognostic biomarkers in survival analysis.

Availability and Implementation: NCC-AUC is available at http://doc.aporc.org/wiki/NCC-AUC.

Contact: ywang@amss.ac.cn

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
http://dx.doi.org/10.1093/bioinformatics/btv374
