Tuberculosis (TB) is still a major global health challenge, killing over 1.5 million people each year, and hence, there is a need to identify and develop novel treatments for Mycobacterium tuberculosis (M. tuberculosis).
Introduction: The advancement of the Army's National Emergency Tele-Critical Care Network (NETCCN) and planned evolution to an Intelligent Medical System rest on a digital transformation characterized by the application of analytic rigor anchored and machine learning. The goal is an enduring capability for telecritical care in support of the Nation's warfighters and, more broadly, for emergency response, crisis management, and mass casualty situations as the number and intensity of disasters increase nationwide. That said, technology alone is unlikely to solve the most pressing issues in operational medicine and combat casualty care.
In this paper, we introduce the Analysis Platform for Risk, Resilience, and Expenditure in Disasters (APRED), a disaster-analytic platform developed for crisis practitioners and economic developers across the United States (US). APRED provides practitioners with a centralized platform for exploring disaster resilience and vulnerability profiles of all counties across the US. The platform comprises five sections: (1) Disaster Resilience Index, (2) Business Vulnerability Index, (3) Disaster Declaration History, (4) County Profile, and (5) Storm History.
Background: LINCS, "Library of Integrated Network-based Cellular Signatures", and IDG, "Illuminating the Druggable Genome", are both NIH projects and consortia that have generated rich datasets for the study of the molecular basis of human health and disease. LINCS L1000 expression signatures provide unbiased systems/omics experimental evidence. IDG provides compiled and curated knowledge for illumination and prioritization of novel drug target hypotheses.
Motivation: Genome-wide association studies can reveal important genotype-phenotype associations; however, data quality and interpretability issues must be addressed. For drug discovery scientists seeking to prioritize targets based on the available evidence, these issues go beyond the single study.
Results: Here, we describe rational ranking, filtering and interpretation of inferred gene-trait associations and data aggregation across studies by leveraging existing curation and harmonization efforts.
Background: Netpredictor is an R package for prediction of missing links in any given unipartite or bipartite network. The package provides utilities to compute missing links in bipartite as well as unipartite networks using Random Walk with Restart, Network inference algorithms, and a combination of both. The package also allows computation of bipartite network properties, visualization of communities for two different sets of nodes, and calculation of significant interactions between two sets of nodes using permutation-based testing.
Tuberculosis (TB) is the world's leading infectious killer, with 1.8 million deaths in 2015 as reported by WHO. It is therefore imperative that alternate routes of identification of novel anti-TB compounds are explored, given the time and costs involved in the new drug discovery process.
Background: There is a huge variety of data sources relevant to chemical, biological and pharmacological research, but these data sources are highly siloed and cannot be queried together in a straightforward way. Semantic technologies offer the ability to create links and mappings across datasets and manage them as a single, linked network so that searching can be carried out across datasets, independently of the source. We have developed an application called PIBAS FedSPARQL that uses semantic technologies to allow researchers to carry out such searching across a vast array of data sources.
Background: Highly chemically similar drugs usually possess similar biological activities, but sometimes, small changes in chemistry can result in a large difference in biological effects. Chemically similar drug pairs that show extreme deviations in activity represent distinctive drug interactions having important implications. These associations between chemical and biological similarity are studied as discontinuities in activity landscapes.
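As a rough illustration of the idea of a discontinuity in an activity landscape, the sketch below flags compound pairs that are structurally similar yet differ sharply in activity. It is not the method of the paper: the fingerprints (sets of "on" bit positions), the activity values, the compound names, and both thresholds are hypothetical, and Tanimoto similarity is used only because it is a common choice for comparing fingerprints.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def find_cliffs(compounds, min_similarity=0.7, min_activity_gap=2.0):
    """Return pairs that are similar in structure but far apart in activity.

    `compounds` maps a name to (fingerprint bit set, activity value).
    Both thresholds are illustrative assumptions, not published cut-offs.
    """
    cliffs = []
    for (n1, (fp1, act1)), (n2, (fp2, act2)) in combinations(compounds.items(), 2):
        sim = tanimoto(fp1, fp2)
        gap = abs(act1 - act2)
        if sim >= min_similarity and gap >= min_activity_gap:
            cliffs.append((n1, n2, round(sim, 3), gap))
    return cliffs


# Hypothetical toy data: bit sets stand in for real chemical fingerprints.
compounds = {
    "cpd_A": ({1, 2, 3, 4, 5}, 8.5),
    "cpd_B": ({1, 2, 3, 4, 5, 6}, 5.0),
    "cpd_C": ({1, 2, 3, 4, 5, 7, 8}, 8.3),
    "cpd_D": ({10, 11, 12}, 4.0),
}

cliffs = find_cliffs(compounds)
for name1, name2, sim, gap in cliffs:
    print(f"{name1} / {name2}: similarity={sim}, activity gap={gap}")
```

With this toy data only the cpd_A / cpd_B pair passes both thresholds; the other similar pair (cpd_A / cpd_C) has nearly identical activity and is therefore not a discontinuity.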
Background: Predicting novel drug-target associations is important not only for developing new drugs, but also for furthering biological knowledge by understanding how drugs work and their modes of action. As more data about drugs, targets, and their interactions becomes available, computational approaches have become an indispensable part of drug-target association discovery. In this paper we apply the random walk with restart (RWR) method to a heterogeneous network of drugs and targets compiled from the DrugBank database and investigate the performance of the method under parameter variation and choice of chemical fingerprint methods.
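For readers unfamiliar with random walk with restart, a minimal sketch of the general technique follows. It is not the implementation used in the paper: the four-node toy network, the restart probability of 0.3, and the function name are assumptions chosen for illustration. The walker repeatedly moves to a neighbouring node or, with the restart probability, jumps back to the seed; the steady-state probabilities then rank nodes by proximity to the seed.

```python
def random_walk_with_restart(adjacency, seed, restart=0.3, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * W p + r * p0 until the update is smaller than `tol`.

    `adjacency` is a square list-of-lists of non-negative edge weights.
    W is the column-normalised adjacency matrix; p0 puts all mass on `seed`.
    """
    n = len(adjacency)
    # Column j of W holds the transition probabilities out of node j.
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    W = [
        [adjacency[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    p0 = [1.0 if i == seed else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1 - restart) * sum(W[i][j] * p[j] for j in range(n)) + restart * p0[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical network: nodes 0-1 are drugs, nodes 2-3 are targets.
# Edges: drug0-drug1 (similarity), drug0-target2, drug1-target2, drug1-target3.
adjacency = [
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
]

scores = random_walk_with_restart(adjacency, seed=0)
print([round(s, 3) for s in scores])
```

In this toy case drug 0 has no recorded edge to target 3, yet target 3 still receives a non-zero score through drug 1; ranking unlinked targets by such scores is the basic way RWR output is read as a candidate-association list.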
Phenotypic assays have a proven track record for generating leads that become first-in-class therapies. Whole cell assays that inform on a phenotype or mechanism also possess great potential in drug repositioning studies by illuminating new activities for the existing pharmacopeia. The National Center for Advancing Translational Sciences (NCATS) pharmaceutical collection (NPC) is the largest reported collection of approved small molecule therapeutics that is available for screening in a high-throughput setting.
Mol Inform
December 2013
Effective discovery of new drugs for complex diseases demands an integrative analysis of big data aggregated from diverse sources in chemical and biological domains, to help better understand the mechanism of drug actions and to quickly translate discovery to clinical applications. Conventional approaches face critical challenges in integrating those huge heterogeneous datasets and in rapidly transforming data into knowledge. Semantic technologies, which aim to build a common framework that allows data sharing and utilization across applications and domains on the web, have developed quickly and have had a broad impact in life science.
Background: Making data available as Linked Data using Resource Description Framework (RDF) promotes integration with other web resources. RDF documents can natively link to related data, and others can link back using Uniform Resource Identifiers (URIs). RDF makes the data machine-readable and uses extensible vocabularies for additional information, making it easier to scale up inference and data analysis.
Electronic laboratory notebooks (ELNs) are increasingly replacing paper notebooks in life science laboratories, including those in industry, academic settings, and hospitals. ELNs offer significant advantages over paper notebooks, but adopting them in a predominantly paper-based environment is usually disruptive. The benefits of ELNs increase when they are integrated with other laboratory informatics tools such as laboratory information management systems, chromatography data systems, analytical instrumentation, and scientific data management systems, but there is no well-established path for effective integration of these tools.
Background: Mycobacterium tuberculosis encodes 11 putative serine-threonine protein kinases (STPKs), which regulate transcription, cell development and interaction with the host cells. Of the 11 STPKs, three kinases, namely PknA, PknB and PknG, have been related to mycobacterial growth. Previous studies have observed that PknB is essential for mycobacterial growth, is expressed during the log phase of growth, and phosphorylates substrates involved in peptidoglycan biosynthesis.
Associative classification mining (ACM) can be used to provide predictive models with high accuracy as well as interpretability. However, traditional ACM ignores the difference of significances among the features used for mining. Although weighted associative classification mining (WACM) addresses this issue by assigning different weights to features, most implementations can only be utilized when pre-assigned weights are available.
J Cheminform
November 2012
Relating chemical features to bioactivities is critical in molecular design and is used extensively in the lead discovery and optimization process. A variety of techniques from statistics, data mining and machine learning have been applied to this process. In this study, we utilize a collection of methods, called associative classification mining (ACM), which are popular in the data mining community, but so far have not been applied widely in cheminformatics.
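To make the notion of associative classification concrete, the sketch below scores a single class association rule of the form "feature set -> class label" by its support and confidence, the two measures such methods typically use to select and rank rules. It is a simplified illustration, not the study's workflow: the substructure features, the activity labels, and the function name are hypothetical.

```python
def rule_support_confidence(records, antecedent, label):
    """Score the rule `antecedent -> label` over (feature set, class label) records.

    support    = fraction of all records matching both antecedent and label
    confidence = fraction of antecedent-matching records that carry the label
    """
    antecedent = set(antecedent)
    # Class labels of every record that contains all antecedent features.
    covered = [lab for feats, lab in records if antecedent <= feats]
    hits = sum(1 for lab in covered if lab == label)
    support = hits / len(records)
    confidence = hits / len(covered) if covered else 0.0
    return support, confidence


# Hypothetical compounds described by substructure features and an activity class.
records = [
    ({"nitro", "amide"}, "active"),
    ({"nitro", "amide", "halogen"}, "active"),
    ({"nitro"}, "inactive"),
    ({"amide"}, "inactive"),
    ({"nitro", "amide"}, "inactive"),
]

support, confidence = rule_support_confidence(records, {"nitro", "amide"}, "active")
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

Here three of the five compounds contain both features and two of those are active, so the rule has support 2/5 and confidence 2/3. A classifier built this way stays interpretable because each prediction can be traced to explicit rules of this kind.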
There are technologies on the horizon that could dramatically change how informatics organizations design, develop, deliver, and support applications and data infrastructures to deliver maximum value to drug discovery organizations. Effective integration of data and laboratory informatics tools promises the ability of organizations to make better informed decisions about resource allocation during the drug discovery and development process and for more informed decisions to be made with respect to the market opportunity for compounds. We propose in this article a new integration model called ELN-centric laboratory informatics tools integration.
The rapidly increasing amount of public data in chemistry and biology provides new opportunities for large-scale data mining for drug discovery. Systematic integration of these heterogeneous sets and provision of algorithms to data mine the integrated sets would permit investigation of complex mechanisms of action of drugs. In this work we integrated and annotated data from public datasets relating to drugs, chemical compounds, protein targets, diseases, side effects and pathways, building a semantic linked network consisting of over 290,000 nodes and 720,000 edges.
Background: Experimental screening of chemical compounds for biological activity is a time-consuming and expensive practice. In silico predictive models permit inexpensive, rapid "virtual screening" to prioritize selection of compounds for experimental testing. Both experimental and in silico screening can be used to test compounds for desirable or undesirable properties.
Background: Systems chemical biology and chemogenomics are considered critical, integrative disciplines in modern biomedical research, but require data mining of large, integrated, heterogeneous datasets from chemistry and biology. We previously developed an RDF-based resource called Chem2Bio2RDF that enabled querying of such data using the SPARQL query language. Whilst this work has proved useful in its own right as one of the first major resources in these disciplines, its utility could be greatly improved by the application of an ontology for annotation of the nodes and edges in the RDF graph, enabling a much richer range of semantic queries to be issued.
Systems chemical biology, the integration of chemistry, biology and computation to generate understanding about the way small molecules affect biological systems as a whole, as well as related fields such as chemogenomics, are central to emerging new paradigms of drug discovery such as drug repurposing and personalized medicine. Recent Semantic Web technologies such as RDF and SPARQL are technical enablers of systems chemical biology, facilitating the deployment of advanced algorithms for searching and mining large integrated datasets. In this paper, we aim to demonstrate how these technologies together can change the way that drug discovery is accomplished.
Much life science and biology research requires an understanding of complex relationships between biological entities (genes, compounds, pathways, diseases, and so on). There is a wealth of data on such relationships in publicly available datasets and publications, but these sources overlap and are distributed, so that finding pertinent relational data is increasingly difficult. Whilst most public datasets have associated tools for searching, there is a lack of searching methods that can cross data sources and, in particular, search based not only on the biological entities themselves but also on the relationships between them.
Motivation: Networks to predict protein pharmacology can be created using ligand similarity or using known bioassay response profiles of ligands. Recent publications indicate that similarity methods can be highly accurate, but it has been unclear how similarity methods compare to methods that use bioassay response data directly.
Results: We created protein networks based on ligand similarity (Similarity Ensemble Approach or SEA) and ligand bioassay response-data (BARD) using 155 Pfizer internal BioPrint assays.