Publications by authors named "Kevin P Greenman"

Machine learning sequence-function models for proteins could enable significant advances in protein engineering, especially when paired with state-of-the-art methods to select new sequences for property optimization and/or model improvement. Such methods (Bayesian optimization and active learning) require calibrated estimations of model uncertainty. While studies have benchmarked a variety of deep learning uncertainty quantification (UQ) methods on standard and molecular machine-learning datasets, it is not clear if these results extend to protein datasets.

Article Synopsis
  • Deep learning is now widely used for predicting molecular properties, creating a demand for user-friendly software that non-experts can use.
  • Directed message-passing neural networks (D-MPNNs), as implemented in the Chemprop software, have shown strong performance in these prediction tasks, and Chemprop now includes new features such as multi-molecule properties and advanced uncertainty quantification.
  • The latest version of Chemprop offers improved tools for training D-MPNN models, achieving top-tier results on various datasets in molecular property prediction.

A closed-loop, autonomous molecular discovery platform driven by integrated machine learning tools was developed to accelerate the design of molecules with desired properties. We demonstrated two case studies on dye-like molecules, targeting absorption wavelength, lipophilicity, and photooxidative stability. In the first study, the platform experimentally realized 294 unreported molecules across three automatic iterations of molecular design-make-test-analyze cycles while exploring the structure-function space of four rarely reported scaffolds.


Optical properties are central to molecular design for many applications, including solar cells and biomedical imaging. A variety of ab initio and statistical methods have been developed for their prediction, each with a trade-off between accuracy, generality, and cost. Existing theoretical methods such as time-dependent density functional theory (TD-DFT) are generalizable across chemical space because of their robust physics-based foundations but still exhibit random and systematic errors with respect to experiment despite their high computational cost.
