X-ray photoelectron spectroscopy (XPS) measures core-electron binding energies (CEBEs) to reveal element-specific insights into the chemical environment and bonding. Accurate theoretical CEBE prediction aids XPS interpretation but requires proper modeling of orbital relaxation and electron correlation upon core-ionization. This work systematically investigates basis set selection for extrapolation to the complete basis set limit of CEBEs from ΔMP2 and ΔCC energies across 94 K-edges in diverse organic molecules.
Biomolecular condensates help cells organise their content in space and time. Cells harbour a variety of condensate types with diverse composition and many are likely yet to be discovered. Here, we develop a methodology to predict the composition of biomolecular condensates.
We introduce the kernel-elastic autoencoder (KAE), a self-supervised generative model based on the transformer architecture with enhanced performance for molecular design. KAE employs two innovative loss functions: modified maximum mean discrepancy (m-MMD) and weighted reconstruction. The m-MMD loss has significantly improved the generative performance of KAE when compared to using the traditional Kullback-Leibler loss of VAE, or standard maximum mean discrepancy.
The incredible capabilities of generative artificial intelligence models have inevitably led to their application in the domain of drug discovery. Within this domain, the vastness of chemical space motivates the development of more efficient methods for identifying regions with molecules that exhibit desired characteristics. In this work, we present a computationally efficient active learning methodology and demonstrate its applicability to targeted molecular generation.
The incredible capabilities of generative artificial intelligence models have inevitably led to their application in the domain of drug discovery. Within this domain, the vastness of chemical space motivates the development of more efficient methods for identifying regions with molecules that exhibit desired characteristics. In this work, we present a computationally efficient active learning methodology that requires evaluation of only a subset of the generated data in the constructed sample space to successfully align a generative model with respect to a specified objective.