Motivation: Quantitative mass spectrometry-based proteomics requires protein-level estimates and associated confidence measures. Challenges include the presence of low-quality or incorrectly identified peptides and informative missingness. Furthermore, models are required for rolling peptide-level information up to the protein level.
Results: We present a statistical model that carefully accounts for informative missingness in peak intensities and allows unbiased, model-based, protein-level estimation and inference. The model is applicable to both label-based and label-free quantitation experiments. We also provide automated, model-based algorithms for filtering proteins and peptides and for imputing missing values. Two LC/MS datasets are used to illustrate the methods. In simulation studies, our methods are shown to achieve substantially more discoveries than standard alternatives.
Availability: The software has been made available in the open-source proteomics platform DAnTE (http://omics.pnl.gov/software/).
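The effect of informative missingness mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the authors' model and does not use DAnTE. It assumes a single peptide whose log-intensities are normally distributed, a hard detection threshold below which peaks go unobserved, and a known standard deviation. All names and parameter values (`TRUE_MU`, `SIGMA`, `THRESHOLD`, `fit_censored_mean`) are hypothetical and chosen only for illustration. It compares the complete-case mean with a simple censored-likelihood estimate.

```python
import math
import random


def normal_logpdf(x, mu, sigma):
    """Log density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def normal_logcdf(x, mu, sigma):
    """Log of P(X <= x) for X ~ N(mu, sigma^2), guarded against log(0)."""
    p = 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))
    return math.log(max(p, 1e-300))


def censored_loglik(observed, n_censored, threshold, mu, sigma):
    """Observed values contribute their density; each missing value
    contributes the probability of falling below the threshold."""
    ll = sum(normal_logpdf(x, mu, sigma) for x in observed)
    ll += n_censored * normal_logcdf(threshold, mu, sigma)
    return ll


def fit_censored_mean(observed, n_censored, threshold, sigma):
    """Grid search for mu, with sigma treated as known (a simplification)."""
    grid = [threshold - 5.0 + 0.01 * i for i in range(1001)]
    return max(
        grid,
        key=lambda mu: censored_loglik(observed, n_censored, threshold, mu, sigma),
    )


# Hypothetical parameters, assumed for this illustration only.
random.seed(0)
TRUE_MU, SIGMA, THRESHOLD = 20.0, 2.0, 20.5

values = [random.gauss(TRUE_MU, SIGMA) for _ in range(2000)]
observed = [v for v in values if v >= THRESHOLD]  # low peaks go missing
n_censored = len(values) - len(observed)

naive_mean = sum(observed) / len(observed)
censored_mean = fit_censored_mean(observed, n_censored, THRESHOLD, SIGMA)

print(f"true mean:            {TRUE_MU:.2f}")
print(f"complete-case mean:   {naive_mean:.2f}")
print(f"censored-likelihood:  {censored_mean:.2f}")
```

Under these assumptions the complete-case mean overstates the true abundance, because missingness depends on the intensity itself, while the censored-likelihood estimate uses the count of missing peaks as information.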
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2723007 | PMC |
| http://dx.doi.org/10.1093/bioinformatics/btp362 | DOI Listing |
Biometrics
January 2025
MRC Biostatistics Unit, School of Clinical Medicine, University of Cambridge, Cambridge, CB2 0SR, United Kingdom.
Dynamic treatment regimes (DTRs) formalize medical decision-making as a sequence of rules for different stages, mapping patient-level information to recommended treatments. In practice, estimating an optimal DTR using observational data from electronic medical record (EMR) databases can be complicated by nonignorable missing covariates resulting from informative monitoring of patients. Since complete case analysis can provide consistent estimation of outcome model parameters under the assumption of outcome-independent missingness, Q-learning is a natural approach to accommodating nonignorable missing covariates.
Healthc Technol Lett
December 2024
Intelligent Systems Research Centre, School of Computing, Engineering and Intelligent Systems, Ulster University, Magee campus, Derry~Londonderry, Northern Ireland, UK.
Missing Alzheimer's disease (AD) data is prevalent and poses significant challenges for AD diagnosis. Previous studies have explored various data imputation approaches on AD data, but the systematic evaluation of deep learning algorithms for imputing heterogeneous and comprehensive AD data is limited. This study investigates the efficacy of denoising autoencoder-based imputation of missing key features of heterogeneous data that comprised tau-PET, MRI, cognitive and functional assessments, genotype, sociodemographic, and medical history.
Open Heart
December 2024
Cedars-Sinai Medical Center, Los Angeles, California, USA
Background: Cardiac amyloidosis (CA) is an underdiagnosed, progressive and lethal disease. Machine learning applied to common measurements derived from routine echocardiogram studies can inform suspicion of CA.
Objectives: Our objectives were to test a random forest (RF) model in detecting CA.
Biometrics
October 2024
Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA 02115, United States.
Analyses of cluster randomized trials (CRTs) can be complicated by informative missing outcome data. Methods such as inverse probability weighted generalized estimating equations have been proposed to account for informative missingness by weighting the observed individual outcome data in each cluster. These existing methods have focused on settings where missingness occurs at the individual level and each cluster has partially or fully observed individual outcomes.
Eur J Hum Genet
December 2024
Key Laboratory of Endocrinology of National Health Commission, Department of Endocrinology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, 100730, Beijing, China.
Xia-Gibbs syndrome (XGS) is a rare neurodevelopmental disorder with considerable clinical heterogeneity. To further characterize the syndrome's heterogeneity, we applied latent class analysis (LCA) on reported cases to identify phenotypic subtypes. By searching PubMed, Embase, China National Knowledge Infrastructure and Wanfang databases from inception to February 2024, we enrolled 97 cases with nonsense, frameshift or missense variants in the AHDC1 gene.