Influence relevance voting: an accurate and interpretable virtual high throughput screening method.

J Chem Inf Model

School of Information and Computer Sciences, Institute for Genomics and Bioinformatics, University of California, Irvine, Irvine, California 92697-3435, USA.

Published: April 2009

Given activity training data from high-throughput screening (HTS) experiments, virtual high-throughput screening (vHTS) methods aim to predict in silico the activity of untested chemicals. We present a novel method, the Influence Relevance Voter (IRV), specifically tailored for the vHTS task. The IRV is a low-parameter neural network that refines a k-nearest neighbor classifier by nonlinearly combining the influences of a chemical's neighbors in the training set. Each influence is in turn decomposed, also nonlinearly, into a relevance component and a vote component. The IRV is benchmarked using the data and rules of two large, open competitions, and its performance is compared with that of the other participating methods and of an in-house support vector machine (SVM) method. On these benchmark data sets, the IRV achieves state-of-the-art results: comparable to the SVM in one case, and significantly better than the SVM in the other, where it retrieves three times as many actives in the top 1% of its prediction-sorted list. The IRV has several other important advantages over SVMs and other methods: (1) the output predictions have probabilistic semantics; (2) the underlying inferences are interpretable; (3) the training time is very short, on the order of minutes even for very large data sets; (4) the risk of overfitting is minimal, owing to the small number of free parameters; and (5) additional information can easily be incorporated into the IRV architecture. Combined with its performance, these qualities make the IRV particularly well suited for vHTS.
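
The architecture described above can be illustrated with a minimal forward-pass sketch. The abstract does not give the functional forms, so the following is an assumption rather than the authors' implementation: it uses logistic (sigmoid) nonlinearities, a relevance that depends on a neighbor's similarity and rank, and a vote that depends only on the neighbor's class. All names (`IRVParams`, `irv_predict`, the weight fields) are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


def sigmoid(x: float) -> float:
    """Logistic function, used here as the assumed nonlinearity."""
    return 1.0 / (1.0 + math.exp(-x))


@dataclass
class IRVParams:
    # Hypothetical parameter names; the small count reflects the
    # "low-parameter" property stated in the abstract.
    w_z: float         # output bias
    w_y: float         # relevance bias
    w_s: float         # weight on neighbor similarity
    w_r: float         # weight on neighbor rank
    v_active: float    # vote cast by an active neighbor
    v_inactive: float  # vote cast by an inactive neighbor


def irv_predict(neighbors: List[Tuple[float, bool]], p: IRVParams) -> float:
    """Return a value in (0, 1) read as the probability of activity.

    `neighbors` holds (similarity, is_active) pairs for the k nearest
    training compounds, ordered from most to least similar.
    """
    total_influence = 0.0
    for rank, (similarity, is_active) in enumerate(neighbors, start=1):
        # Relevance: how much this neighbor should count (assumed form).
        relevance = sigmoid(p.w_y + p.w_s * similarity + p.w_r * rank)
        # Vote: direction and strength, determined by the neighbor's class.
        vote = p.v_active if is_active else p.v_inactive
        # Influence is the product of relevance and vote.
        total_influence += relevance * vote
    # Influences are combined nonlinearly into a single output.
    return sigmoid(p.w_z + total_influence)
```

Because each neighbor contributes one signed influence, a prediction can be inspected by listing those per-neighbor terms, which is one way to read the interpretability claim. In practice the weights would be fitted to the HTS training data; the sketch covers only prediction.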

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2750043
DOI: http://dx.doi.org/10.1021/ci8004379
