Evolution has developed a set of principles that determine feasible domain combinations, analogous to grammar within natural languages. Treating domains as words and proteins as sentences made up of those words, we apply a linguistic approach to represent the human proteome as an n-gram network. Combining this with network theory and application, we explore the functional language and rules of the human proteome.
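The domains-as-words idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the choice of bigrams (n = 2), and the toy domain architectures below are all assumptions made for illustration.

```python
from collections import Counter


def domain_ngrams(domains, n=2):
    """Return the ordered n-grams of one protein's domain architecture."""
    return [tuple(domains[i:i + n]) for i in range(len(domains) - n + 1)]


def build_bigram_network(proteome):
    """Count directed domain-to-domain adjacencies across all proteins.

    Each key is an edge (left_domain, right_domain); each value is the
    number of times that adjacency occurs in the input.
    """
    edges = Counter()
    for domains in proteome.values():
        for left, right in domain_ngrams(domains, 2):
            edges[(left, right)] += 1
    return edges


# Hypothetical toy architectures (N- to C-terminal), for illustration only.
toy_proteome = {
    "protein_A": ["SH3", "SH2", "Kinase"],
    "protein_B": ["SH2", "SH3"],
    "protein_C": ["SH2", "Kinase"],
}

network = build_bigram_network(toy_proteome)
for (left, right), count in sorted(network.items()):
    print(f"{left} -> {right}: {count}")
```

The weighted edge list produced here could be loaded into any graph library for further network analysis; keeping the edges directed preserves domain order, which is what distinguishes an n-gram representation from a plain co-occurrence count.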
Protein domains are conserved structural and functional units and are the functional building blocks of proteins. Evolutionary expansion means that domain families are often represented by many members in a species, which are found in various configurations with other domains, which have evolved new specificity for interacting partners. Here, we develop a structure-based interface analysis to comprehensively map domain interfaces from available experimental and predicted structures, including interfaces with other macromolecules and intraprotein interfaces (such as might exist between domains in a protein).
Fibroblasts are essential regulators of extracellular matrix deposition following cardiac injury. These cells exhibit highly plastic responses in phenotype during fibrosis in response to environmental stimuli. Here, we test whether and how candidate anti-fibrotic drugs differentially regulate measures of cardiac fibroblast phenotype, which may help identify treatments for cardiac fibrosis.
Post-translational modifications (PTMs) and splicing are known to be important regulatory processes for controlling protein function and activity. However, there have been limitations in analyzing the interplay of alternative splicing and PTMs, which stem from the deep differences in genomic and proteomic databases. In this work, we bridged the protein- and genome-centric world views to map PTMs to genomic locations for subsequent projection of PTMs onto alternative isoforms.
Kinase inhibitors as targeted therapies have played an important role in improving cancer outcomes. However, there are still considerable challenges, such as resistance, non-response, patient stratification, polypharmacology, and identifying combination therapy where understanding a tumor kinase activity profile could be transformative. Here, we develop a graph- and statistics-based algorithm, called KSTAR, to convert phosphoproteomic measurements of cells and tissues into a kinase activity score that is generalizable and useful for clinical pipelines, requiring no quantification of the phosphorylation sites.
Tyrosine and serine/threonine kinases are essential regulators of cell processes and are important targets for human therapies. Unfortunately, very little is known about specific kinase-substrate relationships, making it difficult to infer meaning from dysregulated phosphoproteomic datasets or for researchers to identify possible kinases that regulate specific or novel phosphorylation sites. The last two decades have seen an explosion in algorithms to extrapolate from what little is known into the larger unknown, predicting kinase relationships with site-specific substrates using a variety of approaches that include the sequence-specificity of kinase catalytic domains and various other factors, such as evolutionary relationships, co-expression, and protein-protein interaction networks.
Most patients with glioblastoma (GBM) die within 2 years. A major therapeutic goal is to target GBM stem cells (GSCs), a subpopulation of cells that contribute to treatment resistance and recurrence. Since their discovery in 2003, GSCs have been isolated using single-surface markers, such as CD15, CD44, CD133, and α6 integrin.
Protein domain interactions with short linear peptides, such as those of the Src homology 2 (SH2) domain with phosphotyrosine-containing peptide motifs (pTyr), are ubiquitous and important to many biochemical processes of the cell. The desire to map and quantify these interactions has resulted in the development of high-throughput (HTP) quantitative measurement techniques, such as microarray or fluorescence polarization assays. For example, in the last 15 years, experiments have progressed from measuring single interactions to covering 500,000 of the 5.
We present ProteoClade, a Python toolkit that performs taxa-specific peptide assignment, protein inference, and quantitation for multi-species proteomics experiments. ProteoClade scales to hundreds of millions of protein sequences, requires minimal computational resources, and is open source, multi-platform, and accessible to non-programmers. We demonstrate its utility for processing quantitative proteomic data derived from patient-derived xenografts, and its speed and scalability enable a novel de novo proteomic workflow for complex microbiota samples.
Evolutionary reconstruction algorithms produce models of the evolutionary history of proteins or species. Such algorithms are highly sensitive to their inputs: the sequences used and their alignments. Here, we asked whether the variance introduced by selecting different input sequences could be used to better identify accurate evolutionary models.
Rationale: Mutations in the SCN5A gene, encoding the α subunit of the Nav1.5 channel, cause a life-threatening form of cardiac arrhythmia, long QT syndrome type 3 (LQT3). Mexiletine, which is structurally related to the Na channel-blocking anesthetic lidocaine, is used to treat LQT3 patients.
RRM, or RNA-recognition motif, domains are the largest class of single-stranded RNA binding domains in the human proteome and play important roles in RNA processing, splicing, export, stability, packaging, and degradation. Using a current database of post-translational modifications (PTMs), ProteomeScout, we found that RRM domains are also one of the most heavily modified domains in the human proteome. Here, we present two interesting findings about RRM domain modifications, found by mapping known PTMs onto RRM domain alignments and structures.
Curr Protoc Bioinformatics
September 2017
Post-translational modifications (PTMs) of protein amino acids are ubiquitous and important to protein function, localization, degradation, and more. In recent years, there has been an explosion in the discovery of PTMs as a result of improvements in PTM measurement techniques, including quantitative measurements of PTMs across multiple conditions. ProteomeScout is a repository for such discovery and quantitative experiments and provides tools for visualizing PTMs within proteins, including where they are relative to other PTMs, domains, mutations, and structure.
Overexpression of HER2, a receptor tyrosine kinase of the ERBB family, in breast cancer is related to increased cancer progression and aggressiveness. A breast epithelial cell model with the single perturbation of HER2 overexpression is capable of replicating the increased aggressiveness of HER2 overexpressing cancers. In previous work, Wolf-Yadlin and colleagues (Wolf-Yadlin et al.
Since the advent of large-scale genomic sequencing, and the consequent availability of large numbers of homologous protein sequences, there has been burgeoning development of methods for extracting functional information from multiple sequence alignments (MSAs). One type of analysis seeks to identify specificity determining positions (SDPs) based on the assumption that such positions are highly conserved within groups of sequences sharing functional specificity, but conserved to different amino acids in different specificity groups. This unsupervised approach to utilizing evolutionary information may elucidate mechanisms of specificity in protein-protein interactions, catalytic activity of enzymes, sensitivity to allosteric regulation, and other types of protein functionality.
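The SDP assumption (conserved within a specificity group, but conserved to a different residue in each group) can be shown with a small Python sketch. This is a deliberately strict illustration, not any published method: it assumes the specificity groups are already known, requires perfect within-group conservation, and uses a hypothetical function name and made-up toy alignment. Unsupervised methods must infer the grouping and typically use softer statistical scores.

```python
def candidate_sdps(groups):
    """Return 0-based alignment columns that satisfy the strict SDP assumption.

    `groups` maps a group label to a list of aligned, equal-length sequences.
    A column qualifies when every group is perfectly conserved at that column
    (no gaps) and no two groups share the same conserved residue.
    """
    if len(groups) < 2:
        return []  # specificity is only defined relative to another group
    length = len(next(iter(groups.values()))[0])
    hits = []
    for col in range(length):
        consensus = []
        for seqs in groups.values():
            residues = {seq[col] for seq in seqs}
            if len(residues) != 1 or "-" in residues:
                break  # this group is not conserved at this column
            consensus.append(residues.pop())
        else:
            # Reached only if every group was conserved at this column.
            if len(set(consensus)) == len(consensus):
                hits.append(col)
    return hits


# Hypothetical toy alignment, for illustration only.
toy_groups = {
    "group_1": ["ACDK", "ACDK"],
    "group_2": ["ACEK", "GCEK"],
}
sdp_columns = candidate_sdps(toy_groups)
print(sdp_columns)
```

In the toy alignment, column 2 is D in every group_1 sequence and E in every group_2 sequence, so it is the only column reported; columns 1 and 3 are conserved across both groups and therefore carry no specificity signal.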
Clustering is an unsupervised learning method, which groups data points based on similarity, and is used to reveal the underlying structure of data. This computational approach is essential to understanding and visualizing the complex data that are acquired in high-throughput multidimensional biological experiments. Clustering enables researchers to make biological inferences for further experiments.
The EGF receptor can bind seven different agonist ligands. Although each agonist appears to stimulate the same suite of downstream signaling proteins, different agonists are capable of inducing distinct responses in the same cell. To determine the basis for these differences, we used luciferase fragment complementation imaging to monitor the recruitment of Cbl, CrkL, Gab1, Grb2, PI3K, p52 Shc, p66 Shc, and Shp2 to the EGF receptor when stimulated by the seven EGF receptor ligands.
Background: Protein post-translational modifications (PTMs) are an important aspect of protein regulation. The number of PTMs discovered within the human proteome, and other proteomes, has been rapidly expanding in recent years. As a consequence of the rate at which new PTMs are identified, analysis done in one year may result in different conclusions when repeated in subsequent years.
ProteomeScout (https://proteomescout.wustl.edu) is a resource for the study of proteins and their post-translational modifications (PTMs) consisting of a database of PTMs, a repository for experimental data, an analysis suite for PTM experiments, and a tool for visualizing the relationships between complex protein annotations.
Reactivation of a silent transcriptional program is a critical step in successful axon regeneration following injury. Yet how such a program is unlocked after injury remains largely unexplored. We found that axon injury in peripheral sensory neurons elicits a back-propagating calcium wave that invades the soma and causes nuclear export of HDAC5 in a PKCμ-dependent manner.
Collagen is an important extracellular matrix component that directs many fundamental cellular processes including differentiation, proliferation and motility. The signalling networks driving these processes are propagated by collagen receptors such as the β1 integrins and the DDRs (discoidin domain receptors). To gain an insight into the molecular mechanisms of collagen receptor signalling, we have performed a quantitative analysis of the phosphorylation networks downstream of collagen activation of integrins and DDR2.