Proteoform-predictor: Increasing the Phylogenetic Reach of Top-Down Proteomics.

J Proteome Res

Department of Molecular Biosciences, Northwestern University, Evanston, Illinois 60208, United States.

Published: March 2025

Proteoforms are distinct molecular forms of proteins that act as building blocks of organisms, with post-translational modifications (PTMs) being one of the key changes that generate these variations. Mass spectrometry (MS)-based top-down proteomics (TDP) is the leading technology for proteoform identification due to its preservation of intact proteoforms for analysis, making it well-suited for comprehensive PTM characterization. A crucial step in TDP is searching MS data against a database of candidate proteoforms. To extend the reach of TDP to organisms with limited PTM annotations, we developed Proteoform-predictor, an open-source tool that integrates homology-based PTM site prediction into proteoform database creation. The new tool creates databases of proteoform candidates after registration of homologous sequences, transferring PTM sites from well-characterized species to those with less comprehensive proteomic data. Our tool features a user-friendly interface and intuitive workflow, making it accessible to a wide range of researchers. We demonstrate that Proteoform-predictor expands proteoform databases with tens of thousands of proteoforms for three bacterial strains by comparing them to the reference proteome of () K12. Subsequent TDP analysis for () and () demonstrated significant improvement in protein and proteoform identification, even for proteins with variant sequences. As TDP technology advances, Proteoform-predictor will become an important tool for expanding the applicability of proteoform identification and PTM biology to more diverse species across the phylogenetic tree of life.
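
The homology-based transfer described above can be illustrated with a minimal sketch. This is not the Proteoform-predictor implementation: it assumes that "registration of homologous sequences" can be approximated by a pairwise global alignment, uses simple match/mismatch scoring instead of a substitution matrix, and only transfers a site when the aligned target residue is identical to the annotated reference residue. The function names `align` and `transfer_ptm_sites`, the scoring parameters, and the example sequences are all hypothetical.

```python
def align(ref, target, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment (illustrative scoring, an assumption).

    Returns a list of (ref_index, target_index) pairs; None marks a gap.
    """
    n, m = len(ref), len(target)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if ref[i - 1] == target[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align two residues
                score[i - 1][j] + gap,      # gap in target
                score[i][j - 1] + gap,      # gap in reference
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if ref[i - 1] == target[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                pairs.append((i - 1, j - 1))
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            pairs.append((i - 1, None))
            i -= 1
        else:
            pairs.append((None, j - 1))
            j -= 1
    pairs.reverse()
    return pairs


def transfer_ptm_sites(ref_seq, ref_ptms, target_seq):
    """Map annotated PTM sites from a reference onto a homologous target.

    ref_ptms maps 0-based reference positions to PTM names. A site is
    transferred only if it aligns to an identical residue in the target
    (a conservative rule chosen for this sketch).
    """
    mapping = {
        r: t for r, t in align(ref_seq, target_seq)
        if r is not None and t is not None
    }
    transferred = {}
    for pos, ptm in ref_ptms.items():
        t = mapping.get(pos)
        if t is not None and target_seq[t] == ref_seq[pos]:
            transferred[t] = ptm
    return transferred
```

For example, `transfer_ptm_sites("MKTAYSAKQR", {5: "Phospho", 7: "Acetyl"}, "MKAYSAKQR")` shifts both sites by one position to account for the deleted residue in the target. If the target carries an alanine where the reference has the phosphorylated serine, only the acetylation site is carried over. Each transferred site would then seed additional candidate proteoforms in the search database.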

Source
http://dx.doi.org/10.1021/acs.jproteome.4c00943
