Robust classification modeling on microarray data using misclassification penalized posterior.

Bioinformatics

Division of Biometrics III, Food and Drug Administration 9201 Corporate Blvd, Rm. N-250, Rockville, MD 20850, USA.

Published: June 2005

Motivation: Genome-wide microarray data are often used in challenging classification problems involving clinically relevant subtypes of human diseases. However, attempts to identify a parsimonious, robust prediction model that performs consistently well on future independent data have not been successful, because model selection is biased when an extremely large number of candidate models is searched during model construction. Furthermore, common criteria of prediction model performance, such as classification error rates, do not provide a sensitive measure for evaluating such an astronomically large number of competing models. Also, even though several different classification approaches have been used to tackle such problems, no direct comparison of these methods has been made.

Results: We introduce a novel measure for assessing the performance of a prediction model, the misclassification-penalized posterior (MiPP): the sum of the posterior classification probabilities penalized by the number of incorrectly classified samples. Using MiPP, we implement a forward step-wise cross-validated procedure to find the optimal prediction models with different numbers of features on a training set. The final robust classification model and its dimension are determined using a completely independent test dataset. This MiPP-based classification modeling approach enables us to identify the most parsimonious robust prediction models, with only two or three features, on well-known microarray datasets. These models show superior performance to other models in the literature, which often use 40-100 or more features in their construction.
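The MiPP criterion and the forward step-wise search described above can be illustrated with a minimal Python sketch. It rests on assumptions not stated in the abstract: the "posterior classification probability" of a sample is taken to be the model's posterior for that sample's true class, and the penalty is taken to be a plain subtraction of the misclassification count. The names `mipp` and `forward_select`, and the `score` callback (which in practice would return a cross-validated MiPP for a candidate feature set), are hypothetical and are not the API of the Bioconductor MiPP package.

```python
from typing import Callable, List, Sequence


def mipp(posteriors: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Sum of true-class posteriors minus the number of misclassified samples.

    Assumption: the penalty is a simple subtraction of the error count.
    """
    total = 0.0
    errors = 0
    for probs, y in zip(posteriors, labels):
        total += probs[y]  # posterior probability assigned to the true class
        predicted = max(range(len(probs)), key=probs.__getitem__)
        if predicted != y:
            errors += 1  # each misclassified sample costs one unit
    return total - errors


def forward_select(
    n_features: int,
    score: Callable[[List[int]], float],
    max_size: int,
) -> List[List[int]]:
    """Greedy forward selection: best feature set found at each model size.

    `score` is a hypothetical callback; higher values are better.
    """
    selected: List[int] = []
    path: List[List[int]] = []
    for _ in range(min(max_size, n_features)):
        candidates = [f for f in range(n_features) if f not in selected]
        # Add the single feature that most improves the score.
        best = max(candidates, key=lambda f: score(selected + [f]))
        selected = selected + [best]
        path.append(list(selected))
    return path


if __name__ == "__main__":
    # Three samples, two classes; the third sample is misclassified.
    posteriors = [[0.9, 0.1], [0.4, 0.6], [0.7, 0.3]]
    labels = [0, 1, 1]
    print(mipp(posteriors, labels))
```

Unlike a raw error rate, this score still separates two models that make the same number of errors, because the model with more confident correct posteriors scores higher. Under these assumptions, that is the added sensitivity the abstract attributes to MiPP.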

Availability: Our MiPP software program is available at the Bioconductor website (http://www.bioconductor.org).

Source
http://dx.doi.org/10.1093/bioinformatics/bti1020
