Chemical language models (CLMs) can be employed to design molecules with desired properties. CLMs generate new chemical structures in the form of textual representations, such as the simplified molecular input line entry system (SMILES) strings. However, the quality of these de novo generated molecules is difficult to assess a priori. In this study, we apply the perplexity metric to determine the degree to which the molecules generated by a CLM match the desired design objectives. This model-intrinsic score allows identifying and ranking the most promising molecular designs based on the probabilities learned by the CLM. Using perplexity to compare "greedy" (beam search) with "explorative" (multinomial sampling) methods for SMILES generation, certain advantages of multinomial sampling become apparent. Additionally, perplexity scoring is performed to identify undesired model biases introduced during model training and allows the development of a new ranking system to remove those undesired biases.
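As a rough illustration of the scoring idea in the abstract, the sketch below computes perplexity from the per-token probabilities a model assigned to each generated SMILES string and ranks candidates from lowest to highest perplexity. It assumes the standard definition (the exponential of the mean negative log-likelihood per token), which may differ in detail from the formulation used in the study. The function names and the probability values are hypothetical and are not taken from the paper.

```python
import math


def perplexity(token_probs):
    """Perplexity of one sequence, given the probability the model
    assigned to each of its tokens (standard definition, assumed here)."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    # Mean negative log-likelihood per token.
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)


def rank_by_perplexity(scored):
    """Sort (smiles, token_probs) pairs from lowest to highest perplexity.

    Lower perplexity means the model considers the string more likely.
    """
    pairs = [(smiles, perplexity(probs)) for smiles, probs in scored]
    return sorted(pairs, key=lambda pair: pair[1])


# Hypothetical per-token probabilities, for illustration only.
candidates = [
    ("c1ccccc1", [0.5, 0.4, 0.6, 0.5, 0.5, 0.4, 0.6, 0.5]),
    ("CCO", [0.9, 0.8, 0.9]),
]
ranking = rank_by_perplexity(candidates)
```

Because the score is normalized by sequence length, strings of different lengths can be compared directly, which a raw product of token probabilities would not allow.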
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8924923 | PMC |
http://dx.doi.org/10.1021/acs.jcim.2c00079 | DOI Listing |
Med Chem
January 2025
School of Pharmaceutical Sciences, Lovely Professional University, Phagwara, Punjab -14440, India.
Background: Diabetes mellitus and obesity are two of the most frequent health conditions in the world, prompting medical researchers to seek novel effective treatments. According to World Health Organization (WHO) regulations and several research studies, diabetes is regarded as a significant and leading health concern worldwide. The search for efficient and safe antidiabetic drugs has led to the study of pyridine derivatives, a family of molecules with a wide range of pharmacological characteristics.
Medicina (Kaunas)
January 2025
Neurology Department, Cooper University Hospital, Camden, NJ 08103, USA.
Myoclonus is already associated with a wide variety of drugs and systemic conditions. As new components are discovered, more drugs are suspected of causing this disabling abnormal involuntary movement. This systematic review aims to assess the medications associated with drug-induced myoclonus (DIM).
Genes (Basel)
December 2024
Genomic Medicine Laboratory UILDM, IRCCS Santa Lucia Foundation, 00179 Rome, Italy.
Background/objectives: Artificial intelligence and large language models like ChatGPT and Google's Gemini are promising tools with remarkable potential to assist healthcare professionals. This study explores ChatGPT and Gemini's potential utility in assisting clinicians during the first evaluation of patients with suspected neurogenetic disorders.
Methods: By analyzing the models' performance in identifying relevant clinical features, suggesting differential diagnoses, and providing insights into possible genetic testing, this research seeks to determine whether these AI tools could serve as a valuable adjunct in neurogenetic assessments.
J Magn Reson Imaging
January 2025
Department of Biomedical Engineering, University of Alberta, Edmonton, Alberta, Canada.
Background: MRI offers quantification of proton density fat fraction (PDFF) and tissue characteristics with T1 mapping. The influence of age, sex, and the potential confounding effects of fat on T1 values in skeletal muscle in healthy adults are insufficiently known.
Purpose: To determine the accuracy and repeatability of a saturation-recovery chemical-shift encoded multiparametric approach (SR-CSE) for quantification of T1 and muscle fat content, and establish normative values (age, sex) from a healthy cohort.
Proc Natl Acad Sci U S A
January 2025
Laboratory for Atomistic and Molecular Mechanics, Massachusetts Institute of Technology, Cambridge, MA 02139.
The design of new alloys is a multiscale problem that requires a holistic approach that involves retrieving relevant knowledge, applying advanced computational methods, conducting experimental validations, and analyzing the results, a process that is typically slow and reserved for human experts. Machine learning can help accelerate this process, for instance, through the use of deep surrogate models that connect structural and chemical features to material properties, or vice versa. However, existing data-driven models often target specific material objectives, offer limited flexibility to integrate out-of-domain knowledge, and cannot adapt to new, unforeseen challenges.