A database of 7.9 million compounds commercially available from 29 suppliers in 2008-2009 was assembled and analyzed. Of these, 5.2 million structures were found to be unique and were assessed for physical and biological properties and for molecular diversity. To identify drug-like compounds, the rules of Lipinski and Veber were applied to the molecular weight, the calculated water/n-octanol partition coefficient (Clog P), the calculated aqueous solubility (log S), the numbers of hydrogen-bond donors and acceptors, and the calculated Caco-2 membrane permeability; toxicity/reactivity filters were then used to remove structures containing biologically undesirable functional groups. This filtering left 2.0 million (39%) structures suitable for high-throughput screening of biological activity. Modified filters for lead-like structures indicated that 16% of the unique compounds could be potential leads. An assessment of biological activities, an analysis of diversity, and the sizes of the exclusive sets of compounds are presented.
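The drug-likeness filtering described above can be illustrated with a minimal Python sketch. It assumes the commonly cited Lipinski thresholds (molecular weight ≤ 500, log P ≤ 5, at most 5 hydrogen-bond donors, at most 10 acceptors) and Veber thresholds (at most 10 rotatable bonds, polar surface area ≤ 140 Å²); the abstract does not state the exact cutoffs used, so these values, the `Compound` class, and all field and function names are assumptions for illustration only. The log S, Caco-2, and toxicity/reactivity filters are omitted because no criteria for them are given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    """Precomputed descriptors for one structure (hypothetical field names)."""
    mol_weight: float          # molecular weight in Da
    clogp: float               # calculated n-octanol/water partition coefficient
    h_donors: int              # number of hydrogen-bond donors
    h_acceptors: int           # number of hydrogen-bond acceptors
    rotatable_bonds: int       # number of rotatable bonds
    polar_surface_area: float  # polar surface area in square angstroms


def lipinski_violations(c: Compound) -> int:
    """Count violations of the commonly cited Lipinski thresholds."""
    return sum([
        c.mol_weight > 500,
        c.clogp > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ])


def passes_veber(c: Compound) -> bool:
    """Check the commonly cited Veber thresholds."""
    return c.rotatable_bonds <= 10 and c.polar_surface_area <= 140


def is_drug_like(c: Compound, max_lipinski_violations: int = 0) -> bool:
    """Combine both rule sets; the allowed violation count is an assumption."""
    return (lipinski_violations(c) <= max_lipinski_violations
            and passes_veber(c))


def drug_like_fraction(compounds: list[Compound]) -> float:
    """Fraction of a compound list that passes both rule sets."""
    if not compounds:
        return 0.0
    return sum(is_drug_like(c) for c in compounds) / len(compounds)
```

In a real pipeline the descriptors would come from a cheminformatics toolkit and duplicates would be removed first, for example by comparing canonical structure representations; the sketch only shows how the threshold checks combine.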
DOI: http://dx.doi.org/10.1021/ci900464s