The Gene Expression Omnibus (GEO) has millions of samples from thousands of studies. While users of GEO can search the metadata describing studies, there is a need for methods to search GEO at the data level. RummaGEO is a gene expression signature search engine for human and mouse RNA sequencing perturbation studies extracted from GEO.
Motivation: There is a rapid growth in the production of omics datasets collected by the diabetes research community. However, such published data are underutilized for knowledge discovery. To make bioinformatics tools and published omics datasets from the diabetes field more accessible to biomedical researchers, we developed the Diabetes Data and Hypothesis Hub (D2H2).
Background: Gene-gene co-expression correlations measured by mRNA-sequencing (RNA-seq) can be used to predict gene annotations based on the co-variance structure within these data. In our prior work, we showed that uniformly aligned RNA-seq co-expression data from thousands of diverse studies is highly predictive of both gene annotations and protein-protein interactions. However, the performance of the predictions varies depending on whether the gene annotations and interactions are cell type and tissue specific or agnostic.
Long non-coding ribonucleic acids (lncRNAs) account for the largest group of non-coding RNAs. However, knowledge about their function and regulation is limited. lncHUB2 is a web server database that provides known and inferred knowledge about the function of 18 705 human and 11 274 mouse lncRNAs.
Owing to condensed development processes, expanding evidence and differences in healthcare system characteristics, many COVID-19 guidelines differ in their quality and treatment recommendations, which has consequences for clinical practice. This review aimed to identify COVID-19 treatment guidelines, assess their quality and summarise their recommendations. Guidelines were identified for five therapies most commonly used among inpatients with COVID-19 (remdesivir, dexamethasone, tocilizumab, baricitinib and casirivimab/imdevimab) from 11 countries.
The Library of Integrated Network-based Cellular Signatures (LINCS) was an NIH Common Fund program that aimed to expand our knowledge about human cellular responses to chemical, genetic, and microenvironment perturbations. Responses to perturbations were measured by transcriptomics, proteomics, cellular imaging, and other high content assays. The second phase of the LINCS program, which lasted 7 years, involved the engagement of six data and signature generation centers (DSGCs) and one data coordination and integration center (DCIC).
Millions of transcriptome samples were generated by the Library of Integrated Network-based Cellular Signatures (LINCS) program. When these data are processed into searchable signatures along with signatures extracted from Genotype-Tissue Expression (GTEx) and Gene Expression Omnibus (GEO), connections between drugs, genes, pathways and diseases can be illuminated. SigCom LINCS is a webserver that serves over a million gene expression signatures processed, analyzed, and visualized from LINCS, GTEx, and GEO.
Background: PubMed contains millions of abstracts that co-mention terms that describe drugs with other biomedical terms such as genes or diseases. Unique opportunities exist for leveraging these co-mentions by integrating them with other drug-drug similarity resources such as the Library of Integrated Network-based Cellular Signatures (LINCS) L1000 signatures to develop novel hypotheses.
Results: DrugShot is a web-based server application and an Appyter that enables users to enter any biomedical search term into a simple input form to receive ranked lists of drugs and other small molecules based on their relevance to the search term.
Motivation: The identification of pathways and biological processes from differential gene expression is central for interpretation of data collected by transcriptomics assays. Gene set enrichment analysis (GSEA) is the most commonly used algorithm to calculate the significance of the relevancy of an annotated gene set with a differential expression signature. To compute significance, GSEA implements permutation tests which are slow and inaccurate for comparing many differential expression signatures to thousands of annotated gene sets.
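To make the cost of permutation testing concrete, here is a minimal sketch of a permutation-based enrichment test. It is not the GSEA implementation: it assumes an unweighted running-sum score and gene-set permutation, and the function names `enrichment_score` and `permutation_p_value` are hypothetical.

```python
import random


def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score (a simplification of GSEA's weighted score)."""
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for is_hit in hits:
        # Step up on a gene-set member, step down otherwise.
        running += 1.0 / n_hits if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def permutation_p_value(ranked_genes, gene_set, n_perm=1000, seed=0):
    """Empirical two-sided p-value from random gene sets of the same size (assumed null model)."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_genes, gene_set)
    size = sum(gene in gene_set for gene in ranked_genes)
    extreme = 0
    for _ in range(n_perm):
        null_set = set(rng.sample(ranked_genes, size))
        if abs(enrichment_score(ranked_genes, null_set)) >= abs(observed):
            extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (extreme + 1) / (n_perm + 1)


# Toy data: 100 ranked genes, with the gene set concentrated at the top.
ranked = [f"g{i}" for i in range(100)]
top_set = {"g0", "g1", "g2", "g3", "g4"}
es, p = permutation_p_value(ranked, top_set, n_perm=500)
```

Each signature and gene-set pair needs `n_perm` score evaluations, and the smallest p-value that can be reported is 1 / (`n_perm` + 1). Both properties illustrate why permutation tests become slow and coarse when many signatures are compared against thousands of gene sets.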
The Illuminating the Druggable Genome (IDG) consortium is a National Institutes of Health (NIH) Common Fund program designed to enhance our knowledge of under-studied proteins, more specifically, proteins unannotated within the three most commonly drug-targeted protein families: G-protein coupled receptors, ion channels, and protein kinases. Since 2014, the IDG Knowledge Management Center (IDG-KMC) has generated several open-access datasets and resources that jointly serve as a highly translational machine-learning-ready knowledgebase focused on human protein-coding genes and their products. The goal of the IDG-KMC is to develop comprehensive integrated knowledge for the druggable genome to illuminate the uncharacterized or poorly annotated portion of the druggable genome.
Phosphoproteomics and proteomics experiments capture a global snapshot of the cellular signaling network, but these methods do not directly measure kinase state. Kinase Enrichment Analysis 3 (KEA3) is a webserver application that infers overrepresentation of upstream kinases whose putative substrates are in a user-inputted list of proteins. KEA3 can be applied to analyze data from phosphoproteomics and proteomics studies to predict the upstream kinases responsible for observed differential phosphorylations.
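The idea of overrepresentation can be sketched with a generic hypergeometric test. This is an illustration under stated assumptions, not KEA3's actual scoring: the abstract does not specify the statistic, and the kinase names, substrate lists, background size, and the functions `hypergeom_sf` and `rank_kinases` below are all hypothetical.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) when drawing `draws` items from `population` containing `successes` hits."""
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return tail / total


def rank_kinases(input_proteins, kinase_substrates, background_size):
    """Rank kinases by how overrepresented their substrates are in the input list."""
    query = set(input_proteins)
    results = []
    for kinase, substrates in kinase_substrates.items():
        substrate_set = set(substrates)
        overlap = len(query & substrate_set)
        p = hypergeom_sf(overlap, background_size, len(substrate_set), len(query))
        results.append((kinase, overlap, p))
    # Smallest p-value first: the most overrepresented kinase leads the list.
    return sorted(results, key=lambda row: row[2])


# Hypothetical kinase-substrate library and input list (placeholder identifiers).
library = {
    "KINASE_A": ["P1", "P2", "P3"],
    "KINASE_B": ["P7", "P8", "P9"],
}
ranking = rank_kinases(["P1", "P2", "P3", "P4"], library, background_size=20)
```

The sketch assumes every substrate and every input protein belongs to a shared background of `background_size` proteins; the choice of background strongly affects the resulting p-values.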
Understanding the underlying molecular and structural similarities between seemingly heterogeneous sets of drugs can aid in identifying drug repurposing opportunities and assist in the discovery of novel properties of preclinical small molecules. A wealth of information about drug and small molecule structure, targets, indications and side effects; induced gene expression signatures; and other attributes are publicly available through web-based tools, databases and repositories. By processing, abstracting and aggregating information from these resources into drug set libraries, knowledge about novel properties of drugs and small molecules can be systematically imputed with machine learning.
Profiling samples from patients, tissues, and cells with genomics, transcriptomics, epigenomics, proteomics, and metabolomics ultimately produces lists of genes and proteins that need to be further analyzed and integrated in the context of known biology. Enrichr (Chen et al., 2013; Kuleshov et al.
Jupyter Notebooks have transformed the communication of data analysis pipelines by facilitating a modular structure that brings together code, markdown text, and interactive visualizations. Here, we extended Jupyter Notebooks to broaden their accessibility with Appyters. Appyters turn Jupyter Notebooks into fully functional standalone web-based bioinformatics applications.
Background: Lysosomal Storage Diseases (LSD) are rare, multisystemic diseases caused by lysosomal enzyme deficiencies that interrupt the decomposition process and lead to the accumulation of waste products. Because of the low prevalence, the resulting limited disease awareness, and the unspecific symptoms with which LSD patients present, the final diagnosis is often made after a long delay. The aim of this German-wide survey was to characterize the period between onset of symptoms and final diagnosis regarding e.
Tumor-specific elucidation of physical and functional oncoprotein interactions could improve tumorigenic mechanism characterization and therapeutic response prediction. Current interaction models and pathways, however, lack context specificity and are not oncoprotein specific. We introduce SigMaps as context-specific networks, comprising modulators, effectors and cognate binding-partners of a specific oncoprotein.
Background: Hereditary angioedema (HAE) is a rare genetic disease characterized by clinical features such as paroxysmal, recurrent angioedema of the skin, the gastrointestinal tract, and the upper airways. Swelling of the skin occurs primarily in the face, extremities and genitals. Gastrointestinal attacks are accompanied by painful abdominal cramps, vomiting and diarrhea.
In a short period, many research publications that report sets of experimentally validated drugs as potential COVID-19 therapies have emerged. To organize this accumulating knowledge, we developed the COVID-19 Drug and Gene Set Library (https://amp.pharm.
The coronavirus (CoV) severe acute respiratory syndrome (SARS)-CoV-2 (COVID-19) pandemic has prompted a rapid response from the research community to offer suggestions for repurposing of approved drugs as well as to improve our understanding of the molecular mechanisms of the COVID-19 viral life cycle. In a short period, tens of thousands of research preprints and other publications have emerged, including those that report lists of experimentally validated drugs and compounds as potential COVID-19 therapies. In addition, gene sets from interacting COVID-19 virus-host proteins and differentially expressed genes when comparing infected to uninfected cells are being published at a fast rate.
Genetic variants are the primary driver of congenital heart disease (CHD) pathogenesis. However, our ability to identify causative variants is limited. To identify causal CHD genes that are associated with specific molecular functions, the study used prior knowledge to filter de novo variants from 2,881 probands with sporadic severe CHD.
Motivation: Micro-blogging with Twitter to communicate new results, discuss ideas and share techniques is becoming central. While most Twitter users are real people, the Twitter API provides the opportunity to develop Twitter bots and to analyze global trends in tweets.
Results: EnrichrBot is a bot that tracks and tweets information about human genes implementing six principal functions: (i) tweeting information about under-studied genes including non-coding lncRNAs, (ii) replying to requests for information about genes, (iii) responding to GWASbot, another bot that tweets Manhattan plots from genome-wide association study analysis of the UK Biobank, (iv) tweeting randomly selected gene sets from the Enrichr database for analysis with Enrichr, (v) responding to mentions of human genes in tweets with additional information about these genes and (vi) tweeting a weekly report about the most trending genes on Twitter.
Hematopoietic stem cells (HSCs) exist in a dormant state and progressively lose regenerative potency as they undergo successive divisions. Why this functional decline occurs and how this information is encoded is unclear. To better understand how this information is stored, we performed RNA sequencing on HSC populations differing only in their divisional history.
RNA-Sequencing (RNA-Seq) is currently the leading technology for genome-wide transcript quantification. Mapping the raw reads to transcript and gene level counts can be achieved by different aligners. Here we report an in-depth comparison of transcript quantification methods.
Elimination of chronic hepatitis C (HCV) will require scaling up treatment, including possible HCV treatment by primary care providers. The District of Columbia (DC) has a substantial population living with untreated hepatitis C, and treatment expansion would benefit the resident population. The aim of this study was to assess the knowledge, attitudes, and behaviors of primary care providers and specialists related to hepatitis C screening and treatment.