Motivation: Quantitative mass spectrometry-based proteomics requires protein-level estimates and associated confidence measures. Challenges include low-quality or incorrectly identified peptides and informative missingness. Furthermore, models are required for rolling peptide-level information up to the protein level.

Results: We present a statistical model that carefully accounts for informative missingness in peak intensities and allows unbiased, model-based, protein-level estimation and inference. The model is applicable to both label-based and label-free quantitation experiments. We also provide automated, model-based algorithms for filtering proteins and peptides and for imputing missing values. Two LC/MS datasets are used to illustrate the methods. In simulation studies, our methods are shown to achieve substantially more discoveries than standard alternatives.
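To see why informative missingness matters, consider a minimal sketch that is not the authors' model. It assumes, purely for illustration, that log-intensities are normally distributed with a known standard deviation and that any value below a hard detection limit goes missing (left-censoring). Under those assumptions the mean of the observed values is biased upward, while a censored-normal likelihood that also uses the count of missing values recovers the true mean. All names and parameter values here (`simulate`, `fit_mu`, `limit`, the mean of 20.0) are hypothetical.

```python
import math
import random


def normal_logpdf(x, mu, sigma):
    """Log density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))


def normal_logcdf(x, mu, sigma):
    """Log of P(X <= x) for X ~ N(mu, sigma^2), floored to avoid log(0)."""
    z = (x - mu) / sigma
    p = 0.5 * math.erfc(-z / math.sqrt(2.0))
    return math.log(max(p, 1e-300))


def censored_loglik(mu, sigma, observed, n_censored, limit):
    """Observed values contribute densities; each missing value
    contributes the probability of falling below the detection limit."""
    ll = sum(normal_logpdf(x, mu, sigma) for x in observed)
    return ll + n_censored * normal_logcdf(limit, mu, sigma)


def fit_mu(observed, n_censored, limit, sigma, lo, hi, iters=100):
    """Maximize the censored log-likelihood over mu by ternary search.
    With sigma fixed (an assumption), the objective is concave in mu."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if (censored_loglik(m1, sigma, observed, n_censored, limit)
                < censored_loglik(m2, sigma, observed, n_censored, limit)):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


def simulate(n=2000, mu=20.0, sigma=1.0, limit=20.0, seed=0):
    """Draw n log-intensities and drop those below the detection limit."""
    rng = random.Random(seed)
    values = [rng.gauss(mu, sigma) for _ in range(n)]
    observed = [v for v in values if v >= limit]
    return observed, n - len(observed)


observed, n_censored = simulate()
naive_mean = sum(observed) / len(observed)  # ignores the missing values
mle_mu = fit_mu(observed, n_censored, limit=20.0, sigma=1.0, lo=10.0, hi=30.0)
print(f"naive mean of observed values: {naive_mean:.2f}")
print(f"censored-likelihood estimate:  {mle_mu:.2f}")
```

In this toy setting the naive mean overshoots the simulated mean of 20.0 because only the upper part of the distribution is ever seen. A real analysis would also need to estimate the variance and to handle values that are missing for reasons unrelated to intensity.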

Availability: The software has been made available in the open-source proteomics platform DAnTE (http://omics.pnl.gov/software/).

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2723007
DOI: http://dx.doi.org/10.1093/bioinformatics/btp362
