The Protein Kinase Ontology (ProKinO) is an integrated knowledge graph that conceptualizes the complex relationships among protein kinase sequence, structure, function, and disease in a human- and machine-readable format. In this study, we have significantly expanded ProKinO by incorporating additional data on expression patterns and drug interactions. Furthermore, we have developed a new browser from the ground up to render the knowledge graph visible and interactive on the web.
The 534 protein kinases encoded in the human genome constitute a large druggable class of proteins that include both well-studied and understudied "dark" members. Accurate prediction of dark kinase functions is a major bioinformatics challenge. Here, we employ a graph mining approach that uses the evolutionary and functional context encoded in knowledge graphs (KGs) to predict protein and pathway associations for understudied kinases.
Pseudokinases regulate diverse cellular processes associated with normal cellular functions and disease. They are defined bioinformatically based on the absence of one or more catalytic residues that are required for canonical protein kinase functions. The ability to define pseudokinases based on primary sequence comparison has enabled the systematic mapping and cataloging of pseudokinase orthologs across the tree of life.
Glycosyltransferases (GTs) play a central role in sustaining all forms of life through the biosynthesis of complex carbohydrates. Despite significant strides made in recent years to establish computational resources, databases and tools to understand the nature and role of carbohydrates and related glycoenzymes, a data analytics framework that connects the sequence-structure-function relationships to the evolution of GTs is currently lacking. This hinders the characterization of understudied GTs and the synthetic design of GTs for medical and biotechnology applications.
Many bioinformatics resources with unique perspectives on the protein landscape are currently available. However, generating new knowledge from these resources requires interoperable workflows that support cross-resource queries. In this study, we employ federated queries linking information from the Protein Kinase Ontology, iPTMnet, Protein Ontology, neXtProt, and Mouse Genome Informatics to identify key knowledge gaps in the functional coverage of the human kinome and prioritize understudied kinases, cancer variants and post-translational modifications (PTMs) for functional studies.
Multiple sequence alignments (MSAs) are a fundamental analysis tool used throughout biology to investigate relationships between protein sequence, structure, function, evolutionary history, and patterns of disease-associated variants. However, their widespread application in systems biology research is currently hindered by the lack of user-friendly tools to simultaneously visualize, manipulate and query the information conceptualized in large sequence alignments, and the challenges in integrating MSAs with multiple orthogonal data such as cancer variants and post-translational modifications, which are often stored in heterogeneous data sources and formats. Here, we present the Multiple Sequence Alignment Ontology (MSAOnt), which represents a profile or consensus alignment in an ontological format.