EFGs: A Complete and Accurate Implementation of Ertl's Functional Group Detection Algorithm in RDKit.

J Chem Inf Model

Biostatistics and Bioinformatics Unit, IMDEA Food, E28049 Madrid, Spain.

Published: January 2025

Functional groups are widely used in organic chemistry because they provide a rationale for analyzing physicochemical and reactivity properties. In medicinal chemistry, they are the basis for analyzing ligand-biomacromolecule interactions. Ertl's algorithm is an approach for extracting functional groups from arbitrary organic molecules that does not depend on predefined libraries of functional groups. However, the widely used RDKit cheminformatics toolkit lacks a complete and accurate implementation of Ertl's algorithm. This paper describes a new RDKit/Python implementation of the algorithm that is both accurate and complete. For an RDKit molecule, it provides (i) a PNG binary string with an image of the molecule with color-highlighted functional groups; (ii) a list of sets of atom indices (idx), each set corresponding to a functional group; (iii) a list of canonicalized pseudo-SMILES strings for the full functional groups; and (iv) a list of labeled RDKit mol objects, one for each full functional group. The code is freely available at https://github.com/bbu-imdea/efgs and is part of the RDKit Contrib directory (https://github.com/rdkit/rdkit/tree/master/Contrib/efgs).
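
To make output (ii) more concrete, the following is a minimal plain-Python sketch of how a list of atom-index sets can arise from the marking-and-merging idea commonly described for Ertl's algorithm. It is not the EFGs code and does not use RDKit: the toy molecule format (a list of element symbols plus bond tuples) and the function name `find_group_atom_sets` are hypothetical, and only a simplified subset of the marking rules is applied (heteroatoms, and carbons in non-aromatic double or triple bonds). Acetal carbons, three-membered heterocycles, and the carbon environments of the full functional groups are deliberately left out.

```python
from collections import defaultdict


def find_group_atom_sets(symbols, bonds):
    """Return a list of sets of atom indices, one set per functional group.

    Hypothetical toy input (not an RDKit molecule):
      symbols: element symbol for each atom index, e.g. ["C", "O"]
      bonds:   tuples (idx_a, idx_b, bond_order, is_aromatic)
    Simplified rules only; see the surrounding text for what is omitted.
    """
    neighbors = defaultdict(list)
    for a, b, order, aromatic in bonds:
        neighbors[a].append((b, order, aromatic))
        neighbors[b].append((a, order, aromatic))

    # Step 1: mark heteroatoms, plus carbons in non-aromatic multiple bonds.
    marked = set()
    for idx, symbol in enumerate(symbols):
        if symbol not in ("C", "H"):
            marked.add(idx)
        elif symbol == "C" and any(
            order >= 2 and not aromatic for _, order, aromatic in neighbors[idx]
        ):
            marked.add(idx)

    # Step 2: merge connected marked atoms into groups (connected components).
    groups, seen = [], set()
    for start in sorted(marked):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(
                nbr for nbr, _, _ in neighbors[current]
                if nbr in marked and nbr not in component
            )
        seen |= component
        groups.append(component)
    return groups


# Ethyl acetate, CC(=O)OCC, with heavy atoms indexed 0..5.
ethyl_acetate_symbols = ["C", "C", "O", "O", "C", "C"]
ethyl_acetate_bonds = [
    (0, 1, 1, False), (1, 2, 2, False), (1, 3, 1, False),
    (3, 4, 1, False), (4, 5, 1, False),
]
ester_groups = find_group_atom_sets(ethyl_acetate_symbols, ethyl_acetate_bonds)

# 5-chloropent-1-ene, C=CCCCl, with heavy atoms indexed 0..4.
chloroalkene_symbols = ["C", "C", "C", "C", "Cl"]
chloroalkene_bonds = [
    (0, 1, 2, False), (1, 2, 1, False), (2, 3, 1, False), (3, 4, 1, False),
]
alkene_halide_groups = find_group_atom_sets(chloroalkene_symbols, chloroalkene_bonds)
```

Under these assumptions, ethyl acetate yields a single set containing the carbonyl carbon and both oxygens (atoms 1, 2, 3), while the chloroalkene yields two separate sets, one for the C=C atoms (0, 1) and one for the chlorine (4), because the unmarked carbons between them keep the groups apart.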

Source: http://dx.doi.org/10.1021/acs.jcim.4c02268
