The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments.

Ann Appl Stat

Department of Cell Biology, Harvard Medical School, 240 Longwood Ave, Boston, MA 02115, USA; Department of Biostatistics, University of North Carolina at Chapel Hill, 135 Dauer Drive, 3101 McGavran-Greenberg Hall, CB 7420, Chapel Hill, NC 27599, USA; Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, 120 Mason Farm Rd, Campus Box 7260, Chapel Hill, NC 27599, USA.

Published: December 2018

An idealized label-free discovery mass spectrometry proteomics experiment would provide absolute abundance measurements for a whole proteome across varying conditions. Unfortunately, this ideal is not realized. Measurements are made on peptides, so an inferential step is required to obtain protein-level estimates. The inference is complicated by experimental factors that necessitate relative abundance estimation and result in widespread nonignorable missing data. Relative abundance on the log scale takes the form of parameter contrasts. In a complete-case analysis, contrast estimates may be biased by missing data, and a substantial amount of useful information will often go unused. To avoid problems with missing data, many analysts have turned to single imputation solutions. These methods, however, often create further difficulties by hiding inestimable contrasts, preventing the recovery of interblock information, and failing to account for imputation uncertainty. To mitigate many of the problems caused by missing values, we propose the use of a Bayesian selection model. Our model is tested on simulated data, on real data with simulated missing values, and on a ground truth dilution experiment where all of the true relative changes are known. The analysis suggests that our model, compared with various imputation strategies and complete-case analyses, can increase accuracy and provide substantial improvements to interval coverage.
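The claim that complete-case contrast estimates may be biased can be illustrated with a small simulation. The sketch below is not the authors' Bayesian selection model. It assumes a single hypothetical peptide measured under two conditions, log2 intensities drawn from normal distributions, and a logistic missingness mechanism in which low intensities are more likely to go unobserved. The function name, the means, the slope, and the midpoint are all illustrative assumptions, not values from the paper.

```python
import numpy as np


def simulate_complete_case_bias(true_log2_fc=1.0, n=20000, sd=1.0, seed=0):
    """Estimate a log2 fold change from observed values only.

    All settings are hypothetical and chosen for illustration.
    """
    rng = np.random.default_rng(seed)

    # Log2 intensities for one peptide under two conditions.
    control = rng.normal(loc=10.0, scale=sd, size=n)
    treated = rng.normal(loc=10.0 + true_log2_fc, scale=sd, size=n)

    def keep_observed(y, midpoint=10.0, slope=2.0):
        # Assumed nonignorable mechanism: the probability of being
        # observed rises with the (possibly unobserved) intensity itself.
        p_obs = 1.0 / (1.0 + np.exp(-slope * (y - midpoint)))
        return y[rng.random(y.size) < p_obs]

    control_obs = keep_observed(control)
    treated_obs = keep_observed(treated)

    # Complete-case contrast: difference of observed means on the log scale.
    estimate = treated_obs.mean() - control_obs.mean()
    return estimate, control_obs.size / n, treated_obs.size / n


estimate, frac_control, frac_treated = simulate_complete_case_bias()
print(f"true log2 fold change: 1.0, complete-case estimate: {estimate:.2f}")
print(f"fraction observed: control {frac_control:.2f}, treated {frac_treated:.2f}")
```

Under these assumptions the lower-abundance condition loses more of its low values, so its observed mean is inflated more than that of the higher-abundance condition, and the estimated contrast is pulled toward zero.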

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6249692
DOI: http://dx.doi.org/10.1214/18-AOAS1144
