Analogue series (AS) are generated during compound optimization in medicinal chemistry and are the major source of structure-activity relationship (SAR) information. Pairs of active AS consisting of compounds with corresponding substituents and comparable potency progression represent SAR transfer events for the same target or across different targets. We report a new computational approach to systematically search for SAR transfer series that combines an AS alignment algorithm with context-dependent similarity assessment based on vector embeddings adapted from natural language processing. The methodology comprehensively accounts for substituent similarity, identifies non-classical bioisosteres, captures substituent-property relationships, and generates accurate AS alignments. Context-dependent similarity assessment is conceptually novel in computational medicinal chemistry and should also be of interest for other applications.

Scientific contribution

A method is reported to systematically search for and align analogue series with SAR transfer potential. Central to the approach is the assessment of context-dependent similarity for substituents, a new concept in cheminformatics, which is based upon vector embeddings and word pair relationships adapted from natural language processing.
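
The abstract does not give the algorithm itself, so the following is only an illustrative sketch of the general idea: two analogue series, each represented as a list of substituents ordered by increasing potency, are globally aligned with a standard Needleman-Wunsch dynamic program in which the match score is the cosine similarity of substituent embedding vectors. Everything specific here is an assumption made for illustration: the function names `cosine` and `align_series`, the gap penalty, and the three-dimensional `TOY_EMBEDDINGS`, which are invented numbers rather than learned embeddings. Plain cosine similarity stands in for the context-dependent similarity measure of the paper, which is not reproduced here.

```python
import math

# Hypothetical toy embeddings (made-up values, NOT learned from data).
TOY_EMBEDDINGS = {
    "H":  (1.0, 0.0, 0.0),
    "F":  (0.0, 1.0, 0.2),
    "Cl": (0.0, 0.2, 1.0),
}

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def align_series(series_a, series_b, emb, gap=-0.5):
    """Globally align two substituent lists (Needleman-Wunsch).

    Returns (total_score, pairs); an unmatched substituent is paired with None.
    The gap penalty of -0.5 is an arbitrary illustrative choice.
    """
    n, m = len(series_a), len(series_b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sim = cosine(emb[series_a[i - 1]], emb[series_b[j - 1]])
            score[i][j] = max(
                score[i - 1][j - 1] + sim,  # pair the two substituents
                score[i - 1][j] + gap,      # leave series_a[i-1] unmatched
                score[i][j - 1] + gap,      # leave series_b[j-1] unmatched
            )
    # Trace back from the bottom-right cell to recover the alignment.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        sim = cosine(emb[series_a[i - 1]], emb[series_b[j - 1]])
        if score[i][j] == score[i - 1][j - 1] + sim:
            pairs.append((series_a[i - 1], series_b[j - 1]))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            pairs.append((series_a[i - 1], None))
            i -= 1
        else:
            pairs.append((None, series_b[j - 1]))
            j -= 1
    while i > 0:
        pairs.append((series_a[i - 1], None))
        i -= 1
    while j > 0:
        pairs.append((None, series_b[j - 1]))
        j -= 1
    pairs.reverse()
    return score[n][m], pairs

total, pairs = align_series(["H", "F", "Cl"], ["H", "Cl"], TOY_EMBEDDINGS)
print(pairs)
```

With these toy vectors, the shorter series is aligned so that H pairs with H and Cl with Cl, while F is left unmatched. In an SAR transfer setting, such an unmatched position would be a candidate substituent to carry over to the other series, assuming the potency progression of the aligned pairs is comparable.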
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11734349
DOI: http://dx.doi.org/10.1186/s13321-025-00951-3