Triple-negative breast cancer (TNBC) remains one of the most challenging subtypes of breast cancer to treat due to a lack of effective targeted therapies. Chimeric antigen receptor (CAR)-T cells hold promise, but their efficacy in solid tumors is often limited by on-target/off-tumor toxicities. Through comprehensive bioinformatic analysis of public RNA and proteomic data, we identified zona pellucida glycoprotein 4 (ZP4) as a novel target for TNBC.
We developed a Bayesian-based algorithm to infer gene expression states in individual samples and incorporated it into a workflow to identify tumor-associated antigens (TAAs) across 33 cancer types using RNA sequencing (RNA-seq) data from the Genotype-Tissue Expression (GTEx) and The Cancer Genome Atlas (TCGA). Our analysis identified 212 candidate TAAs, with 78 validated in independent RNA-seq datasets spanning seven cancer types. Eighteen of these TAAs were further corroborated by proteomics data, including 10 linked to liver cancer.
Fewer than 200 proteins are targeted by cancer drugs approved by the Food and Drug Administration (FDA). We integrate Clinical Proteomic Tumor Analysis Consortium (CPTAC) proteogenomics data from 1,043 patients across 10 cancer types with additional public datasets to identify potential therapeutic targets. Pan-cancer analysis of 2,863 druggable proteins reveals a wide abundance range and identifies biological factors that affect mRNA-protein correlation.
Shotgun phosphoproteomics enables high-throughput analysis of phosphopeptides in biological samples. One of the primary challenges associated with this technology is the relatively low rate of phosphopeptide identification during data analysis. This limitation hampers the full realization of the potential offered by shotgun phosphoproteomics.
Shotgun proteomics is essential for protein identification and quantification in biomedical research, but protein isoform characterization is challenging due to the extensive number of peptides shared across proteins, hindering our understanding of protein isoform regulation and their roles in normal and disease biology. We systematically assess the challenges and opportunities of shotgun proteomics-based protein isoform characterization using in silico and experimental data, and then present SEPepQuant, a graph theory-based approach to maximize isoform characterization. Using published data from one induced pluripotent stem cell study and two human hepatocellular carcinoma studies, we demonstrate the ability of SEPepQuant to address the key limitations of existing methods, providing more comprehensive isoform-level characterization, identifying hundreds of isoform-level regulation events, and facilitating streamlined cross-study comparisons.
By combining mass-spectrometry-based proteomics and phosphoproteomics with genomics, epigenomics, and transcriptomics, proteogenomics provides comprehensive molecular characterization of cancer. Using this approach, the Clinical Proteomic Tumor Analysis Consortium (CPTAC) has characterized over 1,000 primary tumors spanning 10 cancer types, many with matched normal tissues. Here, we present LinkedOmicsKB, a proteogenomics data-driven knowledge base that makes consistently processed and systematically precomputed CPTAC pan-cancer proteogenomics data available to the public through ∼40,000 gene-, protein-, mutation-, and phenotype-centric web pages.
We characterized a prospective endometrial carcinoma (EC) cohort containing 138 tumors and 20 enriched normal tissues using 10 different omics platforms. Targeted quantitation of two peptides can predict antigen processing and presentation machinery activity, and may inform patient selection for immunotherapy. Association analysis between MYC activity and metformin treatment in both patients and cell lines suggests a potential role for metformin treatment in non-diabetic patients with elevated MYC activity.
Shotgun phosphoproteomics enables high-throughput analysis of phosphopeptides in biological samples, but low phosphopeptide identification rate in data analysis limits the potential of this technology. Here we present DeepRescore2, a computational workflow that leverages deep learning-based retention time and fragment ion intensity predictions to improve phosphopeptide identification and phosphosite localization. Using a state-of-the-art computational workflow as a benchmark, DeepRescore2 increases the number of correctly identified peptide-spectrum matches by 17% in a synthetic dataset and identifies 19%-46% more phosphopeptides in biological datasets.
We performed proteogenomic characterization of intrahepatic cholangiocarcinoma (iCCA) using paired tumor and adjacent liver tissues from 262 patients. Integrated proteogenomic analyses prioritized genetic aberrations and revealed hallmarks of iCCA pathogenesis. Aflatoxin signature was associated with tumor initiation, proliferation, and immune suppression.
Comprehensive characterization of tumor antigens is essential for the design of cancer immunotherapies, and mass spectrometry (MS)-based immunopeptidomics enables high-throughput identification of major histocompatibility complex (MHC)-bound peptide antigens. Here we construct an immunopeptidome atlas of human cancer through an extensive collection of 43 published immunopeptidomic datasets and standardized analysis of 81.6 million MS/MS spectra using an open search engine.
BMC Bioinformatics
May 2020
Background: In shotgun proteomics, database searching of tandem mass spectra results in a great number of peptide-spectrum matches (PSMs), many of which are false positives. Quality control of PSMs is a multiple hypothesis testing problem, and the false discovery rate (FDR) or the posterior error probability (PEP) is the commonly used statistical confidence measure. PEP, also called local FDR, can evaluate the confidence of individual PSMs and thus is more desirable than FDR, which evaluates the global confidence of a collection of PSMs.
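The distinction between PEP and FDR drawn above can be illustrated with a short sketch. It relies on the standard relationship that the FDR of an accepted set of PSMs can be estimated as the mean of their PEPs. The function name `fdr_from_peps` and the PEP values are hypothetical and chosen only for illustration; they are not taken from the article.

```python
def fdr_from_peps(peps, threshold):
    """Accept PSMs whose PEP is at or below `threshold` and estimate the
    FDR of the accepted set as the mean of their PEPs.

    Hypothetical helper for illustration; not the article's method.
    """
    accepted = [p for p in peps if p <= threshold]
    if not accepted:
        return accepted, 0.0
    return accepted, sum(accepted) / len(accepted)


# Made-up PEPs for five PSMs (each is the probability that the match is wrong).
peps = [0.001, 0.01, 0.05, 0.2, 0.6]

accepted, fdr = fdr_from_peps(peps, threshold=0.05)

# PEP is a per-PSM measure; FDR summarizes the whole accepted set.
print("accepted PEPs:", accepted)
print("estimated FDR of accepted set:", round(fdr, 4))
print("largest individual PEP in set:", max(accepted))
```

In this toy set the weakest accepted PSM has a PEP above the set-level FDR, which is why a per-PSM measure gives information that a global FDR alone does not.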
Identifying single amino acid variants (SAAVs) in cancer is critical for precision oncology. Several advanced algorithms are now available to identify SAAVs, but attempts to combine different algorithms and optimize them on large data sets to achieve a more comprehensive coverage of SAAVs have not been implemented. Herein, we report an expanded detection of SAAVs in the PANC-1 cell line using three different strategies, which results in the identification of 540 SAAVs in the mass spectrometry data.
We have performed deep proteomic profiling down to as few as 9 Panc-1 cells using sample fractionation, TMT multiplexing, and a carrier/reference strategy. Offline fractionation of the TMT-labeled sample pooled with TMT-labeled carrier Panc-1 whole cell proteome was achieved using alkaline reversed phase spin columns. The fractionation in conjunction with the carrier/reference (C/R) proteome allowed us to detect 47 414 unique peptides derived from 6261 proteins, which provided sufficient coverage to search for single amino acid variants (SAAVs) related to cancer.
J Proteomics
September 2018
Study of single amino acid variations (SAVs) of proteins, resulting from single nucleotide polymorphisms, is of great importance for understanding the relationships between genotype and phenotype. In mass spectrometry-based shotgun proteomics, identification of peptides with SAVs often suffers from high error rates on the variant sites detected. These site errors are due to multiple reasons and can be confirmed by manual inspection or genomic sequencing.