Advancements in genomic and proteomic technologies have powered the creation of large gene and protein networks ("interactomes") for understanding biological systems. However, the proliferation of interactomes complicates the selection of networks for specific applications. Here, we present a comprehensive evaluation of 45 current human interactomes, encompassing protein-protein interactions as well as gene regulatory, signaling, colocalization, and genetic interaction networks.
Gene set enrichment is a mainstay of functional genomics, but it relies on gene function databases that are incomplete. Here we evaluate five large language models (LLMs) for their ability to discover the common functions represented by a gene set, supported by molecular rationale and a self-confidence assessment. For curated gene sets from Gene Ontology, GPT-4 suggests functions similar to the curated name in 73% of cases, with higher self-confidence predicting higher similarity.
In recent decades, the development of new drugs has become increasingly expensive and inefficient, and the molecular mechanisms of most pharmaceuticals remain poorly understood. In response, computational systems and network medicine tools have emerged to identify potential drug repurposing candidates. However, these tools often require complex installation and lack intuitive visual network mining capabilities.
Advancements in genomic and proteomic technologies have powered the use of gene and protein networks ("interactomes") for understanding genotype-phenotype translation. However, the proliferation of interactomes complicates the selection of networks for specific applications. Here, we present a comprehensive evaluation of 46 current human interactomes, encompassing protein-protein interactions as well as gene regulatory, signaling, colocalization, and genetic interaction networks.
Gene set analysis is a mainstay of functional genomics, but it relies on manually curated databases of gene functions that are incomplete and unaware of biological context. Here we evaluate the ability of OpenAI's GPT-4, a Large Language Model (LLM), to develop hypotheses about common gene functions from its embedded biomedical knowledge. We created a GPT-4 pipeline to label gene sets with names that summarize their consensus functions, substantiated by analysis text and citations.
Gene set analysis is a mainstay of functional genomics, but it relies on curated databases of gene functions that are incomplete. Here we evaluate five Large Language Models (LLMs) for their ability to discover the common biological functions represented by a gene set, substantiated by supporting rationale, citations and a confidence assessment. Benchmarking against canonical gene sets from the Gene Ontology, GPT-4 confidently recovered the curated name or a more general concept (73% of cases), while benchmarking against random gene sets correctly yielded zero confidence.
The DNA damage response (DDR) ensures error-free DNA replication and transcription and is disrupted in numerous diseases. An ongoing challenge is to determine the proteins orchestrating DDR and their organization into complexes, including constitutive interactions and those responding to genomic insult. Here, we use multi-conditional network analysis to systematically map DDR assemblies at multiple scales.
Cytoscape is an open-source bioinformatics environment for the analysis, integration, visualization, and query of biological networks. In this perspective piece, we describe our project to bring the Cytoscape desktop application to the web while explaining our strategy in ways relevant to others in the bioinformatics community. We examine opportunities and challenges in developing bioinformatics software that spans both the desktop and web, and we describe our ongoing efforts to build a Cytoscape web application, highlighting the principles that guide our development.
Motivation: The investigation of sets of genes using biological pathways is a common task for researchers and is supported by a wide variety of software tools. This type of analysis generates hypotheses about the biological processes that are active or modulated in a specific experimental context.
Results: The Network Data Exchange Integrated Query (NDEx IQuery) is a new tool for network and pathway-based gene set interpretation that complements or extends existing resources.
Investigator-generated transcriptomic datasets interrogating circulating immune cell (CIC) gene expression in clinical type 1 diabetes (T1D) have underappreciated re-use value. Here, we repurposed these datasets to create an open science environment for the generation of hypotheses around CIC signaling pathways whose gain or loss of function contributes to T1D pathogenesis. We first computed sets of genes that were preferentially induced or repressed in T1D CICs and validated these against community benchmarks.
A major goal of cancer research is to understand how mutations distributed across diverse genes affect common cellular systems, including multiprotein complexes and assemblies. Two challenges, how to comprehensively map such systems and how to identify which are under mutational selection, have hindered this understanding. Accordingly, we created a comprehensive map of cancer protein systems integrating both new and published multi-omic interaction data at multiple scales of analysis.
NDEx, the Network Data Exchange (https://www.ndexbio.org), is a web-based resource where users can find, store, share and publish network models of any type and size.
Establishing consensus around the transcriptional interface between coronavirus (CoV) infection and human cellular signaling pathways can catalyze the development of novel anti-CoV therapeutics. Here, we used publicly archived transcriptomic datasets to compute consensus regulatory signatures, or consensomes, that rank human genes based on their rates of differential expression in MERS-CoV (MERS), SARS-CoV-1 (SARS1) and SARS-CoV-2 (SARS2)-infected cells. Validating the CoV consensomes, we show that high confidence transcriptional targets (HCTs) of CoV infection intersect with HCTs of signaling pathway nodes with known roles in CoV infection.
Networks are a powerful and flexible paradigm that facilitates communication and computation about interactions of any type, whether social, economic, or biological. NDEx, the Network Data Exchange, is an online commons to enable new modes of collaboration and publication using biological networks. NDEx creates an access point and interface to a broad range of networks, whether they express molecular interactions, curated relationships from literature, or the outputs of systematic analysis of big data.
Networks are a powerful and flexible methodology for expressing biological knowledge for computation and communication. Network-encoded information can include systematic screens for molecular interactions, biological relationships curated from literature, and outputs from analysis of Big Data. NDEx, the Network Data Exchange (www.
World J Biol Chem
September 2010
The Pdx-1 transcription factor plays crucial roles both during pancreas development and in adult beta cells. Previous studies have indicated that ectopic Pdx-1 expression in liver or intestinal primary and immortalized cells is sufficient to promote activation of insulin gene expression. This work is focused on the molecular and physiological consequences of Pdx-1 overexpression in liver cells.
Detergent-resistant membranes (DRMs) represent specialized membrane domains resistant to detergent extraction, which may serve to segregate proteins in a specific environment in order to improve their function. Segregation of glycosylphosphatidylinositol-anchored proteins (GPI-APs) in DRMs has been shown to be involved in their sorting to the apical membrane in polarized epithelial cells. Nonetheless, we have shown that both apical and basolateral GPI-APs associate with DRMs.
The literature shows that stress, such as deprivation of trophic factors and hypoxia, induces apoptosis in cultured cells and in tissues. In light of these results, we explored the possibility of protecting cells from programmed death by improving the metabolism of the mitochondrion. To this end, acetyl-L-carnitine was administered at various concentrations under conditions of serum deprivation.
An essential but insufficient step for apical sorting of glycosylphosphatidylinositol (GPI)-anchored proteins (GPI-APs) in epithelial cells is their association with detergent-resistant microdomains (DRMs) or rafts. In this paper, we show that in MDCK cells both apical and basolateral GPI-APs associate with DRMs during their biosynthesis. However, only apical and not basolateral GPI-APs are able to oligomerize into high molecular weight complexes.