Within the last several years, top-down proteomics has emerged as a high-throughput technique for protein and proteoform identification. This technique has the potential to identify and characterize thousands of proteoforms within a single study, but the absence of accurate false discovery rate (FDR) estimation could hinder the adoption and consistency of top-down proteomics in the future. In automated identification and characterization of proteoforms, FDR calculation strongly depends on the context of the search. The context includes MS data quality, the database being interrogated, the search engine, and the parameters of the search. Particular to top-down proteomics, there are four molecular levels of study: proteoform spectral match (PrSM), protein, isoform, and proteoform. Here, a context-dependent framework for calculating an accurate FDR at each level was designed, implemented, and validated against a manually curated training set with 546 confirmed proteoforms. We examined several search contexts and found that an FDR calculated at the PrSM level under-reported the true FDR at the protein level by an average of 24-fold. We present a new open-source tool, the TDCD_FDR_Calculator, which provides a scalable, context-dependent FDR calculation that can be applied post-search to enhance the quality of results in top-down proteomics from any search engine.
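The gap between PrSM-level and protein-level FDR described in the abstract can be illustrated with a minimal sketch. It assumes a simple target-decoy estimate (decoy count divided by target count), which the abstract does not specify, and it is not the TDCD_FDR_Calculator's actual method. The `PrSM` record, the function names, and the toy hit list are all hypothetical and invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrSM:
    """Hypothetical record for one proteoform spectral match."""
    spectrum_id: str
    protein: str
    is_decoy: bool


def fdr(n_decoy: int, n_target: int) -> float:
    """Simple target-decoy estimate: decoys / targets (an assumption)."""
    return n_decoy / n_target if n_target else 0.0


def prsm_level_fdr(hits: list[PrSM]) -> float:
    """Count every spectral match individually."""
    n_decoy = sum(h.is_decoy for h in hits)
    return fdr(n_decoy, len(hits) - n_decoy)


def protein_level_fdr(hits: list[PrSM]) -> float:
    """Collapse matches to distinct proteins before counting."""
    targets = {h.protein for h in hits if not h.is_decoy}
    decoys = {h.protein for h in hits if h.is_decoy}
    return fdr(len(decoys), len(targets))


# Toy data (made up): 100 target PrSMs that pile onto only 5 proteins,
# plus 2 decoy PrSMs that each hit a different decoy protein.
hits = [PrSM(f"s{i}", f"P{i % 5}", False) for i in range(100)]
hits += [PrSM("d0", "DECOY_A", True), PrSM("d1", "DECOY_B", True)]

print(f"PrSM-level FDR:    {prsm_level_fdr(hits):.2f}")
print(f"Protein-level FDR: {protein_level_fdr(hits):.2f}")
```

In this toy case the PrSM-level estimate is 0.02 while the protein-level estimate is 0.40, because correct matches cluster on a few proteins and random decoy matches tend to land on distinct entries. The numbers are illustrative only and are unrelated to the 24-fold figure reported in the study.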
| Source | Link |
|---|---|
| PMC | http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6442365 |
| DOI | http://dx.doi.org/10.1074/mcp.RA118.000993 |
Bioinformatics
January 2025
Department of Computer Science, City University of Hong Kong, Hong Kong, China.
Motivation: Proteoforms are the different forms of a protein generated from the genome with various sequence variations, splice isoforms, and post-translational modifications. Proteoforms regulate protein structures and functions. A single protein can have multiple proteoforms due to different modification sites.
J Proteome Res
January 2025
Department of Chemistry, University of Texas at Austin, Austin, Texas 78712, United States.
Proteo-SAFARI is a Shiny application for fragment assignment by relative isotopes, an R-based software application designed for identification of protein fragment ions directly in the m/z domain. This program provides an open-source, user-friendly application for identification of fragment ions from a candidate protein sequence with support for custom covalent modifications and various visualizations of identified fragments. Additionally, Proteo-SAFARI includes a nonnegative least-squares fitting approach to determine the contributions of various hydrogen-shifted fragment ions ( + 1, + 1, - 1, - 2) observed in UVPD mass spectra which exhibit overlapping isotopic distributions.
Anal Chem
January 2025
Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, Indiana 46556, United States.
Intact protein analysis using mass spectrometry (MS) is an important technique to characterize and provide a comprehensive overview of protein complexity. It is also the basis of "top-down" approaches in proteomics to describe the proteoforms of a single protein's post-translational modifications (PTMs). MS-based analysis of intact proteins benefits from high-resolution separations prior to electrospray ionization.
Front Cell Dev Biol
December 2024
Proteomics, Lipidomics and Metabolomics Core Facility, School of Life Sciences, Faculty of Science, University of Technology Sydney, Ultimo, NSW, Australia.
Introduction: The identification of effective, selective biomarkers and therapeutics is dependent on truly deep, comprehensive analysis of proteomes at the proteoform level.
Methods: Bovine serum albumin (BSA) isolated by two different protocols, cold ethanol fractionation and heat shock fractionation, was resolved and identified using Integrative Top-down Proteomics, the tight coupling of two-dimensional gel electrophoresis (2DE) with liquid chromatography and tandem mass spectrometry (LC-MS/MS).
Results and Discussion: Numerous proteoforms were identified in both "purified" samples, across a broad range of isoelectric points and molecular weights.
S-glutathionylation (SSG) is increasingly recognized as a critical signaling mechanism in the heart, yet SSG modifications in cardiac sarcomeric proteins remain understudied. Here we identified SSG of the ventricular isoform of myosin light chain 1 (MLC-1v) in human, swine, and mouse cardiac tissues using top-down mass spectrometry (MS)-based proteomics. Our results enabled the accurate identification, quantification, and site-specific localization of SSG in MLC-1v across different species.