We demonstrate a new approach to the development of scoring functions through the formulation and parameterization of a new function, which can be used both for rapidly ranking the binding of ligands to proteins and for estimating relative aqueous molecular solubilities. The intent of this work is to introduce a new paradigm for the creation of scoring functions, wherein we impose the following criteria upon the function: (1) simple; (2) intuitive; (3) requires no postparameterization tweaking; (4) can be applied (without reparameterization) to multiple target systems; and (5) can be rapidly evaluated for any potential ligand. Following these criteria, a new function, FURSMASA (function for rapid scoring using an MD-averaged grid and the accessible surface area), has been developed. Three novel features of the function are: (1) use of an MD-averaged potential energy grid for ligand-protein interactions, rather than a simple static grid; (2) inclusion of a term that depends on the changes in solvent-accessible surface area, computed on an atomic (not molecular) basis; and (3) use of the recently derived predictive index (PI) as the target when optimizing the function, which focuses the function on its intended purpose of relative ranking. A genetic algorithm is used to optimize the function against test data sets that include ligands for the following proteins: IMPDH, p38, gyrase B, HIV-1, and TACE, as well as the Syracuse Research solubility database. We find that the function is predictive, and can simultaneously fit all the test data sets with cross-validated predictive indices ranging from 0.68 to 0.82. As a test of the ability of this function to predict binding for systems not in the training set, the resulting fitted FURSMASA function is then applied to 23 ligands of the COX-2 enzyme.
Comparing the results for COX-2 against those obtained using a variety of well-known rapid scoring functions demonstrates that FURSMASA outperforms all of them in both the PI and the correlation coefficient. We also find that the FURSMASA function is able to reliably predict the water solubility for 1032 compounds from the Syracuse Research solubility database with a cross-validated PI of 0.84 and a correlation coefficient R² of 0.69. This prediction, which is based solely on a term derived from the atom-based solvent-accessible surface areas, compares favorably with the best prediction methods in the literature, most of which are more complex and/or require experimental data. Finally, as a rigorous test of the applicability to database screening, we apply FURSMASA to large active/decoy ligand databases for IMPDH (400 actives vs. 10,000 decoys), p38 (502 actives vs. 10,000 decoys), and HIV (787 actives vs. 10,000 decoys) used in earlier work to critically evaluate many popular scoring functions, and find that FURSMASA performs surprisingly well for IMPDH and HIV.
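To make the predictive index concrete, the following is a minimal sketch of how a PI can be computed, assuming the pairwise, difference-weighted form in which the index is commonly stated: each pair of compounds contributes +1 if the predicted ordering matches the experimental ordering, -1 if it is reversed, and 0 if the predictions tie, weighted by the absolute experimental difference. The function name `predictive_index` and its argument names are hypothetical, and whether FURSMASA was optimized against exactly this form is an assumption, not something stated in the abstract.

```python
from itertools import combinations


def predictive_index(experimental, predicted):
    """Difference-weighted rank agreement between two lists of values.

    Assumed form: PI = sum(w_ij * c_ij) / sum(w_ij) over all pairs i < j,
    with w_ij = |E_j - E_i| and c_ij in {+1, -1, 0}.
    Returns a value in [-1, 1]; +1 means every pair is ranked correctly.
    """
    if len(experimental) != len(predicted):
        raise ValueError("experimental and predicted must have equal length")

    weighted_agreement = 0.0
    total_weight = 0.0
    for i, j in combinations(range(len(experimental)), 2):
        d_exp = experimental[j] - experimental[i]
        d_pred = predicted[j] - predicted[i]
        weight = abs(d_exp)  # large experimental gaps matter more
        if d_exp == 0 or d_pred == 0:
            agreement = 0    # a tie carries no ranking information
        elif (d_exp > 0) == (d_pred > 0):
            agreement = 1    # pair ordered correctly
        else:
            agreement = -1   # pair ordered incorrectly
        weighted_agreement += weight * agreement
        total_weight += weight

    if total_weight == 0:
        raise ValueError("all experimental values are identical; PI is undefined")
    return weighted_agreement / total_weight
```

Because pairs with large experimental differences carry more weight, misordering two compounds of nearly equal activity costs little, which is what makes a measure of this kind suited to relative ranking rather than to absolute prediction.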
DOI: http://dx.doi.org/10.1002/prot.21991