Publications by authors named "Zhiao Shi"

Article Synopsis
  • Large-scale omics profiling has highlighted numerous somatic mutations and cancer-related proteins, making it difficult to understand their functions in cancer biology.
  • The FunMap network is developed using machine learning on data from 1,194 individuals across 11 cancer types, accurately linking over 10,500 protein-coding genes and identifying important functional protein modules.
  • This study positions FunMap as a valuable tool for interpreting complex cancer data, helping to predict the roles of lesser-known cancer-associated proteins and enhancing strategies for cancer treatment and research.
View Article and Find Full Text PDF

Fewer than 200 proteins are targeted by cancer drugs approved by the Food and Drug Administration (FDA). We integrate Clinical Proteomic Tumor Analysis Consortium (CPTAC) proteogenomics data from 1,043 patients across 10 cancer types with additional public datasets to identify potential therapeutic targets. Pan-cancer analysis of 2,863 druggable proteins reveals a wide abundance range and identifies biological factors that affect mRNA-protein correlation.

View Article and Find Full Text PDF

Enrichment analysis, crucial for interpreting genomic, transcriptomic, and proteomic data, is expanding into metabolomics. Furthermore, there is a rising demand for integrated enrichment analysis that combines data from different studies and omics platforms, as seen in meta-analysis and multi-omics research. To address these growing needs, we have updated WebGestalt to include enrichment analysis capabilities for both metabolites and multiple input lists of analytes.

View Article and Find Full Text PDF

Global phosphoproteomics experiments quantify tens of thousands of phosphorylation sites. However, data interpretation is hampered by our limited knowledge on functions, biological contexts, or precipitating enzymes of the phosphosites. This study establishes a repository of phosphosites with associated evidence in biomedical abstracts, using deep learning-based natural language processing techniques.

View Article and Find Full Text PDF

By combining mass-spectrometry-based proteomics and phosphoproteomics with genomics, epi-genomics, and transcriptomics, proteogenomics provides comprehensive molecular characterization of cancer. Using this approach, the Clinical Proteomic Tumor Analysis Consortium (CPTAC) has characterized over 1,000 primary tumors spanning 10 cancer types, many with matched normal tissues. Here, we present LinkedOmicsKB, a proteogenomics data-driven knowledge base that makes consistently processed and systematically precomputed CPTAC pan-cancer proteogenomics data available to the public through ∼40,000 gene-, protein-, mutation-, and phenotype-centric web pages.

View Article and Find Full Text PDF
Article Synopsis
  • - The National Cancer Institute's CPTAC focuses on analyzing tumors using a proteogenomic approach, which combines genomic data with proteomic information to better understand cancer.
  • - The consortium has developed a comprehensive dataset that includes genomic, transcriptomic, proteomic, and clinical data from over 1000 tumors across 10 different groups, aimed at enhancing cancer research.
  • - The CPTAC team addresses challenges in integrating and analyzing multi-omics data, especially the complexities arising from combining nucleotide sequencing with mass spectrometry proteomics information.
View Article and Find Full Text PDF

We characterized a prospective endometrial carcinoma (EC) cohort containing 138 tumors and 20 enriched normal tissues using 10 different omics platforms. Targeted quantitation of two peptides can predict antigen processing and presentation machinery activity, and may inform patient selection for immunotherapy. Association analysis between MYC activity and metformin treatment in both patients and cell lines suggests a potential role for metformin treatment in non-diabetic patients with elevated MYC activity.

View Article and Find Full Text PDF

Unlabelled: Microscaled proteogenomics was deployed to probe the molecular basis for differential response to neoadjuvant carboplatin and docetaxel combination chemotherapy for triple-negative breast cancer (TNBC). Proteomic analyses of pretreatment patient biopsies uniquely revealed metabolic pathways, including oxidative phosphorylation, adipogenesis, and fatty acid metabolism, that were associated with resistance. Both proteomics and transcriptomics revealed that sensitivity was marked by elevation of DNA repair, E2F targets, G2-M checkpoint, interferon-gamma signaling, and immune-checkpoint components.

View Article and Find Full Text PDF

Pancreatic ductal adenocarcinoma (PDAC) is a highly aggressive cancer with poor patient survival. Toward understanding the underlying molecular alterations that drive PDAC oncogenesis, we conducted comprehensive proteogenomic analysis of 140 pancreatic cancers, 67 normal adjacent tissues, and 9 normal pancreatic ductal tissues. Proteomic, phosphoproteomic, and glycoproteomic analyses were used to characterize proteins and their modifications.

View Article and Find Full Text PDF

Sample mislabeling or misannotation has been a long-standing problem in scientific research, particularly prevalent in large-scale, multi-omic studies due to the complexity of multi-omic workflows. There exists an urgent need for implementing quality controls to automatically screen for and correct sample mislabels or misannotations in multi-omic studies. Here, we describe a crowdsourced precisionFDA NCI-CPTAC Multi-omics Enabled Sample Mislabeling Correction Challenge, which provides a framework for systematic benchmarking and evaluation of mislabel identification and correction methods for integrative proteogenomic studies.

View Article and Find Full Text PDF

Untargeted mass spectrometry (MS)-based proteomics provides a powerful platform for protein biomarker discovery, but clinical translation depends on the selection of a small number of proteins for downstream verification and validation. Due to the small sample size of typical discovery studies, protein markers identified from discovery data may not be generalizable to independent datasets. In addition, a good protein marker identified using a discovery platform may be difficult to implement in verification and validation platforms.

View Article and Find Full Text PDF

We present a proteogenomic study of 108 human papilloma virus (HPV)-negative head and neck squamous cell carcinomas (HNSCCs). Proteomic analysis systematically catalogs HNSCC-associated proteins and phosphosites, prioritizes copy number drivers, and highlights an oncogenic role for RNA processing genes. Proteomic investigation of mutual exclusivity between FAT1 truncating mutations and 11q13.

View Article and Find Full Text PDF

The integration of mass spectrometry-based proteomics with next-generation DNA and RNA sequencing profiles tumors more comprehensively. Here this "proteogenomics" approach was applied to 122 treatment-naive primary breast cancers accrued to preserve post-translational modifications, including protein phosphorylation and acetylation. Proteogenomics challenged standard breast cancer diagnoses, provided detailed analysis of the ERBB2 amplicon, defined tumor subsets that could benefit from immune checkpoint therapy, and allowed more accurate assessment of Rb status for prediction of CDK4/6 inhibitor responsiveness.

View Article and Find Full Text PDF

Proteomics, the study of all the proteins in biological systems, is becoming a data-rich science. Protein sequences and structures are comprehensively catalogued in online databases. With recent advancements in tandem mass spectrometry (MS) technology, protein expression and post-translational modifications (PTMs) can be studied in a variety of biological systems at the global scale.

View Article and Find Full Text PDF

We undertook a comprehensive proteogenomic characterization of 95 prospectively collected endometrial carcinomas, comprising 83 endometrioid and 12 serous tumors. This analysis revealed possible new consequences of perturbations to the p53 and Wnt/β-catenin pathways, identified a potential role for circRNAs in the epithelial-mesenchymal transition, and provided new information about proteomic markers of clinical and genomic tumor subgroups, including relationships to known druggable pathways. An extensive genome-wide acetylation survey yielded insights into regulatory mechanisms linking Wnt signaling and histone acetylation.

View Article and Find Full Text PDF

Gene set analysis plays a critical role in the functional interpretation of omics data. Although this is typically done for one omics experiment at a time, there is an increasing need to combine gene set analysis results from multiple experiments performed on the same or different omics platforms, such as in multi-omics studies. Integrating results from multiple experiments is challenging, and annotation redundancy between gene sets further obscures clear conclusions.

View Article and Find Full Text PDF

WebGestalt is a popular tool for the interpretation of gene lists derived from large scale -omics studies. In the 2019 update, WebGestalt supports 12 organisms, 342 gene identifiers and 155 175 functional categories, as well as user-uploaded functional databases. To address the growing and unique need for phosphoproteomics data interpretation, we have implemented phosphosite set analysis to identify important kinases from phosphoproteomics data.

View Article and Find Full Text PDF

We performed the first proteogenomic study on a prospectively collected colon cancer cohort. Comparative proteomic and phosphoproteomic analysis of paired tumor and normal adjacent tissues produced a catalog of colon cancer-associated proteins and phosphosites, including known and putative new biomarkers, drug targets, and cancer/testis antigens. Proteogenomic integration not only prioritized genomically inferred targets, such as copy-number drivers and mutation-derived neoantigens, but also yielded novel findings.

View Article and Find Full Text PDF

Background And Aims: Proteomics holds promise for individualizing cancer treatment. We analyzed to what extent the proteomic landscape of human colorectal cancer (CRC) is maintained in established CRC cell lines and the utility of proteomics for predicting therapeutic responses.

Methods: Proteomic and transcriptomic analyses were performed on 44 CRC cell lines, compared against primary CRCs (n=95) and normal tissues (n=60), and integrated with genomic and drug sensitivity data.

View Article and Find Full Text PDF

Functional enrichment analysis has played a key role in the biological interpretation of high-throughput omics data. As a long-standing and widely used web application for functional enrichment analysis, WebGestalt has been constantly updated to satisfy the needs of biologists from different research areas. WebGestalt 2017 supports 12 organisms, 324 gene identifiers from various databases and technology platforms, and 150 937 functional categories from public databases and computational analyses.

View Article and Find Full Text PDF

Motivation: Recent completion of the global proteomic characterization of The Cancer Genome Atlas (TCGA) colorectal cancer (CRC) cohort resulted in the first tumor dataset with complete molecular measurements at DNA, RNA and protein levels. Using CRC as a paradigm, we describe the application of the NetGestalt framework to provide easy access and interpretation of multi-omics data.

Results: The NetGestalt CRC portal includes genomic, epigenomic, transcriptomic, proteomic and clinical data for the TCGA CRC cohort, data from other CRC tumor cohorts and cell lines, and existing knowledge on pathways and networks, giving a total of more than 17 million data points.

View Article and Find Full Text PDF

Metastatic recurrence is the leading cause of cancer-related deaths in patients with colorectal carcinoma. To capture the molecular underpinnings for metastasis and tumor progression, we performed integrative network analysis on 11 independent human colorectal cancer gene expression datasets and applied expression data from an immunocompetent mouse model of metastasis as an additional filter for this biologic process. In silico analysis of one metastasis-related coexpression module predicted nuclear factor of activated T-cell (NFAT) transcription factors as potential regulators for the module.

View Article and Find Full Text PDF

Extensive genomic characterization of human cancers presents the problem of inference from genomic abnormalities to cancer phenotypes. To address this problem, we analysed proteomes of colon and rectal tumours characterized previously by The Cancer Genome Atlas (TCGA) and perform integrated proteogenomic analyses. Somatic variants displayed reduced protein abundance compared to germline variants.

View Article and Find Full Text PDF

Both transcriptional subtype and signaling network analyses have proved useful in cancer genomics research. However, these two approaches are usually applied in isolation in existing studies. We reason that deciphering genomic alterations based on cancer transcriptional subtypes may help reveal subtype-specific driver networks and provide insights for the development of personalized therapeutic strategies.

View Article and Find Full Text PDF

A PHP Error was encountered

Severity: Warning

Message: fopen(/var/lib/php/sessions/ci_sessiontg7rn7d600j3cqthue24slg0tdgie073): Failed to open stream: No space left on device

Filename: drivers/Session_files_driver.php

Line Number: 177

Backtrace:

File: /var/www/html/index.php
Line: 316
Function: require_once

A PHP Error was encountered

Severity: Warning

Message: session_start(): Failed to read session data: user (path: /var/lib/php/sessions)

Filename: Session/Session.php

Line Number: 137

Backtrace:

File: /var/www/html/index.php
Line: 316
Function: require_once