Integrating Incompatible Assay Data Sets with Deep Preference Learning.

ACS Med Chem Lett

Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba 277-8561, Japan.

Published: January 2022

A large amount of bioactivity assay data has already accumulated in public databases, but integrating these data sets for quantitative structure-activity relationship (QSAR) studies is not straightforward because experimental methods and settings differ between assays. We present an efficient deep-learning-based approach called Deep Preference Data Integration (DPDI). To integrate outcome variables of different assay types, a surrogate variable is introduced, and a neural network is trained such that the total order induced by the surrogate variable is maximally consistent with the given data sets. In a task of predicting the efficacy of factor Xa inhibitors, DPDI successfully integrated 2959 molecules distributed across 129 assay data sets. In most of our experiments, data integration substantially improved prediction accuracy in both interpolation and extrapolation tasks, indicating that DPDI is an effective tool for QSAR studies.
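
The core idea in the abstract, learning one surrogate score whose ordering agrees with the ordering inside each assay, can be illustrated with a small sketch. This is not the authors' implementation: it assumes a linear scorer in place of the neural network, a logistic pairwise (RankNet-style) loss as one plausible way to reward order consistency, and invented toy data. All function names and numbers are hypothetical.

```python
import math
import random

def within_assay_pairs(assays):
    """Collect (better, worse) feature pairs, comparing only inside one assay."""
    pairs = []
    for measurements in assays:
        for xi, yi in measurements:
            for xj, yj in measurements:
                if yi > yj:  # readouts are never compared across assays
                    pairs.append((xi, xj))
    return pairs

def score(w, x):
    """Surrogate variable: a linear stand-in for the neural network."""
    return sum(wk * xk for wk, xk in zip(w, x))

def _neg_grad_weight(margin):
    """Numerically stable 1 / (1 + exp(margin))."""
    if margin >= 0:
        e = math.exp(-margin)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(margin))

def train_surrogate(assays, n_features, epochs=200, lr=0.1, seed=0):
    """Minimize log(1 + exp(-(s(better) - s(worse)))) by stochastic gradient descent."""
    rng = random.Random(seed)
    w = [0.0] * n_features
    pairs = within_assay_pairs(assays)
    for _ in range(epochs):
        rng.shuffle(pairs)
        for better, worse in pairs:
            margin = score(w, better) - score(w, worse)
            step = lr * _neg_grad_weight(margin)
            for k in range(n_features):
                w[k] += step * (better[k] - worse[k])
    return w

def pairwise_consistency(w, assays):
    """Fraction of within-assay pairs whose order the surrogate reproduces."""
    pairs = within_assay_pairs(assays)
    agree = sum(score(w, b) > score(w, c) for b, c in pairs)
    return agree / len(pairs)

# Toy data (assumed): two assays reporting activity on incompatible scales.
toy_assays = [
    [([0.1, 0.5], 5.2), ([0.4, 0.2], 6.0), ([0.9, 0.7], 7.9)],     # pIC50-like
    [([0.2, 0.9], 12.0), ([0.6, 0.1], 48.0), ([0.8, 0.4], 71.0)],  # % inhibition
]

w = train_surrogate(toy_assays, n_features=2)
print("within-assay order consistency:", pairwise_consistency(w, toy_assays))
# The surrogate score can now rank molecules that came from different assays.
print("cross-assay comparison:", score(w, [0.9, 0.7]) > score(w, [0.2, 0.9]))
```

Because only within-assay comparisons enter the loss, the raw readouts never need to share units or a scale; the shared surrogate score is what places molecules from different assays on one axis.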

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8762726
DOI: http://dx.doi.org/10.1021/acsmedchemlett.1c00439
