The Chemical Space of Terpenes: Insights from Data Science and AI.

Pharmaceuticals (Basel)

REQUIMTE/LAQV, Laboratório de Farmacognosia, Departamento de Química, Faculdade de Farmácia, Universidade do Porto, R. Jorge Viterbo Ferreira, 4050-313 Porto, Portugal.

Published: January 2023

Terpenes are a widespread class of natural products with significant chemical and biological diversity, and many of these molecules have already made their way into medicines. In this work, we employ a data science-based approach to systematically identify, compile, and characterize the diversity of currently known terpenes, for a total of 59,833 molecules. We also employed several machine learning methods to classify terpene subclasses from their physicochemical descriptors. Light gradient boosting machine, k-nearest neighbours, random forests, Gaussian naïve Bayes, and multilayer perceptron were tested, with the best-performing algorithms yielding accuracy, F1 score, precision, and other metrics all above 0.9, demonstrating the capability of these approaches to classify terpene subclasses. These results can be important for the fields of phytochemistry and pharmacognosy, as they allow the subclass of novel terpene molecules to be predicted even when biosynthetic studies are not available.
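
To make the classification setup concrete, the following is a minimal sketch of one of the listed algorithms, k-nearest neighbours, predicting a terpene subclass from descriptor vectors and scoring the result with accuracy and macro-averaged F1. It is not the authors' pipeline: the two descriptors (a molecular weight and a logP-like value), the nine training rows, the three test rows, and all function names are hypothetical and chosen only for illustration, and the abstract does not state which descriptors, preprocessing, or hyperparameters were actually used.

```python
import math
from collections import Counter

# Hypothetical toy descriptors: (molecular weight, logP-like value).
# Values and labels are illustrative only and are NOT taken from the paper.
TRAIN = [
    ((136.2, 4.5), "monoterpene"),
    ((154.3, 3.3), "monoterpene"),
    ((152.2, 2.7), "monoterpene"),
    ((204.4, 6.0), "sesquiterpene"),
    ((222.4, 4.8), "sesquiterpene"),
    ((220.4, 4.1), "sesquiterpene"),
    ((272.5, 7.5), "diterpene"),
    ((290.5, 6.2), "diterpene"),
    ((306.5, 5.4), "diterpene"),
]
TEST = [
    ((150.2, 3.0), "monoterpene"),
    ((218.3, 4.5), "sesquiterpene"),
    ((288.5, 6.0), "diterpene"),
]


def fit_scaler(rows):
    """Return per-feature means and (population) standard deviations."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return means, stds


def scale(row, means, stds):
    """Standardize one descriptor vector so features share a common scale."""
    return tuple((v - m) / s for v, m, s in zip(row, means, stds))


def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Fit the scaler on the training descriptors only, then apply it to both sets.
means, stds = fit_scaler([x for x, _ in TRAIN])
train_x = [scale(x, means, stds) for x, _ in TRAIN]
train_y = [y for _, y in TRAIN]
y_true = [y for _, y in TEST]
y_pred = [knn_predict(train_x, train_y, scale(x, means, stds)) for x, _ in TEST]

print("predicted:", y_pred)
print("accuracy:", accuracy(y_true, y_pred))
print("macro F1:", macro_f1(y_true, y_pred))
```

The descriptors are standardized before computing distances because k-nearest neighbours is sensitive to feature scale; without this step the molecular weight would dominate the logP-like value. On a realistic dataset, the same evaluation would be run on a held-out split or with cross-validation rather than on three hand-picked molecules.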

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9961535
DOI: http://dx.doi.org/10.3390/ph16020202
