Background: Graph edit distance is a methodology for solving error-tolerant graph matching. It estimates the distance between two graphs by determining the minimum number of modifications required to transform one graph into the other. Each of these modifications, known as edit operations, has an associated edit cost that must be determined for the problem at hand.
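
To make the role of edit costs concrete, here is a minimal Python sketch, not the study's implementation. It computes a graph edit distance by brute force, taking the distance as the minimum total cost over all node mappings, which is a common formulation once each operation carries a cost. It assumes small, simple undirected graphs with labeled nodes and unlabeled edges. The function name, the cost keys, and the node labels are hypothetical and chosen only for illustration.

```python
from itertools import permutations


def graph_edit_distance(nodes1, edges1, nodes2, edges2, costs):
    """Brute-force graph edit distance for tiny graphs.

    nodes*: dict mapping node id -> label.
    edges*: set of frozenset({u, v}) pairs (undirected, no self-loops).
    costs:  dict with keys node_sub, node_del, node_ins, edge_del, edge_ins
            (hypothetical names used only in this sketch).
    """
    ids1, ids2 = list(nodes1), list(nodes2)
    # Pad the targets with None so that any node of graph 1 may be deleted.
    targets = ids2 + [None] * len(ids1)
    best = float("inf")
    for perm in permutations(targets, len(ids1)):
        mapping = dict(zip(ids1, perm))
        cost = 0.0
        # Node operations: deletion, or substitution when the labels differ.
        for u, t in mapping.items():
            if t is None:
                cost += costs["node_del"]
            elif nodes1[u] != nodes2[t]:
                cost += costs["node_sub"]
        # Nodes of graph 2 that received no match must be inserted.
        used = {t for t in perm if t is not None}
        cost += costs["node_ins"] * (len(ids2) - len(used))
        # Edge operations: an edge of graph 1 survives only if its image
        # under the mapping is an edge of graph 2; otherwise it is deleted.
        matched = set()
        for edge in edges1:
            u, v = tuple(edge)
            image = frozenset((mapping[u], mapping[v]))
            if None not in image and image in edges2:
                matched.add(image)
            else:
                cost += costs["edge_del"]
        # Edges of graph 2 that were not matched must be inserted.
        cost += costs["edge_ins"] * (len(edges2) - len(matched))
        best = min(best, cost)
    return best


# Two three-node path graphs that differ in a single node label.
# The labels are illustrative pharmacophore-style names, not the study's.
g1_nodes = {0: "aromatic", 1: "donor", 2: "acceptor"}
g2_nodes = {0: "aromatic", 1: "donor", 2: "hydrophobic"}
path_edges = {frozenset((0, 1)), frozenset((1, 2))}

uniform = dict(node_sub=1, node_del=1, node_ins=1, edge_del=1, edge_ins=1)
costly_sub = dict(uniform, node_sub=5)

print(graph_edit_distance(g1_nodes, path_edges, g2_nodes, path_edges, uniform))
print(graph_edit_distance(g1_nodes, path_edges, g2_nodes, path_edges, costly_sub))
```

With uniform costs the cheapest transformation is a single substitution. Once substitution is made more expensive than deleting and re-inserting the node together with its edge, the optimal edit path and the resulting distance both change. This is why the choice of costs matters. The search grows factorially with graph size, so the sketch is only suitable for toy inputs.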

Objective: This study uses optimization techniques to learn the edit costs applied when comparing graphs by means of the graph edit distance.

Methods: The graphs are reduced structural representations of molecules in which pharmacophore-type node descriptions encode the relevant molecular properties. This reduction technique is known as extended reduced graphs. The screening and statistical tools available on the ligand-based virtual screening benchmarking platform and the RDKit were used.

Results: In the experiments, the graph edit distance with learned costs performed as well as or better than the graph edit distance with predefined costs. This is exemplified with six publicly available datasets: DUD-E, MUV, GLL&GDD, CAPST, NRLiSt BDB, and ULS-UDS.

Conclusion: This study shows that the graph edit distance along with learned edit costs is useful to identify bioactivity similarities in a structurally diverse group of molecules. Furthermore, the target-specific edit costs might provide useful structure-activity information for future drug-design efforts.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7536799
DOI: http://dx.doi.org/10.2174/1568026620666200603122000

Similar Publications

TargetSA: adaptive simulated annealing for target-specific drug design.

Bioinformatics

December 2024

College of Computer Science, Sichuan University, Chengdu, Sichuan 610065, China.

Motivation: The burgeoning field of target-specific drug design has attracted considerable attention, focusing on identifying compounds with high binding affinity toward specific target pockets. Nevertheless, existing target-specific deep generative models encounter notable challenges. Some models heavily rely on elaborate datasets and complicated training methodologies, while others neglect the multi-constraint optimization problem inherent in drug design, resulting in generated molecules with irrational structures or chemical properties.

Evaluating Sequence Alignment Tools for Antimicrobial Resistance Gene Detection in Assembly Graphs.

Microorganisms

October 2024

Department of Mathematics and Computing Science, Saint Mary's University, Halifax, NS B3H 3C3, Canada.

Antimicrobial resistance (AMR) is an escalating global health threat, often driven by the horizontal gene transfer (HGT) of resistance genes. Detecting AMR genes and understanding their genomic context within bacterial populations is crucial for mitigating the spread of resistance. In this study, we evaluate the performance of three sequence alignment tools (Bandage, SPAligner, and GraphAligner) in identifying AMR gene sequences from assembly and de Bruijn graphs, which are commonly used in microbial genome assembly.

Affordable genotyping methods are essential in genomics. Commonly used genotyping methods primarily support single nucleotide variants and short indels but neglect structural variants. Additionally, accuracy of read alignments to a reference genome is unreliable in highly polymorphic and repetitive regions, further impacting genotyping performance.

Efficient indexing and querying of annotations in a pangenome graph.

bioRxiv

October 2024

IRSD - Digestive Health Research Institute, University of Toulouse, INSERM, INRAE, ENVT, UPS, Toulouse, France.

The current reference genome is the backbone of diverse and rich annotations. Simple text formats, like VCF or BED, have been widely adopted and helped the critical exchange of genomic information. There is a dire need for tools and formats enabling pangenomic annotation to facilitate such enrichment of pangenomic references.

Causal evidence for social group sizes from Wikipedia editing data.

R Soc Open Sci

October 2024

Department of Experimental Psychology, University of Oxford, Radcliffe Quarter, Oxford OX2 6GG, UK.

Human communities have self-organizing properties in which specific Dunbar Numbers may be invoked to explain group attachments. By analysing Wikipedia editing histories across a wide range of subject pages, we show that there is an emergent coherence in the size of transient groups formed to edit the content of subject texts, with two peaks averaging at around for the size corresponding to maximal contention, and at around as a regular team. These values are consistent with the observed sizes of conversational groups, as well as the hierarchical structuring of Dunbar graphs.
