Publications by authors named "Clemence Corminboeuf"

Article Synopsis
  • Large language models (LLMs) like GPT-J-6B, Llama-3.1-8B, and Mistral-7B can learn chemical properties effectively through fine-tuning without specialized features.
  • Fine-tuning these models often outperforms traditional machine learning methods in simple classification tasks, with potential success in more complex problems depending on dataset size and question type.
  • The ease of converting datasets for LLM training and the effectiveness of small datasets in generating predictive models suggest that LLMs could significantly streamline experimental processes in chemical research.
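
The dataset conversion mentioned in the synopsis can be pictured with a minimal sketch, assuming a binary classification task stored as (SMILES, label) pairs. The question template, the field names `prompt` and `completion`, and both function names are hypothetical choices for illustration, not the format used in the article.

```python
import json


def row_to_example(smiles, label, question="Is this molecule soluble in water?"):
    """Turn one tabular row into a prompt/completion pair (hypothetical template)."""
    return {
        "prompt": f"{question} Molecule: {smiles}",
        "completion": "yes" if label else "no",
    }


def to_jsonl(rows):
    """Serialize (smiles, label) rows as JSON Lines, one training example per line."""
    return "\n".join(json.dumps(row_to_example(s, y)) for s, y in rows)


if __name__ == "__main__":
    # Toy rows, made up for illustration only.
    print(to_jsonl([("CCO", True), ("c1ccccc1", False)]))
```

The point of the sketch is that no hand-crafted descriptors are computed: the molecule enters the training example as plain text.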

Chiral cyclopentadienyl (Cp) metal complexes are frequently used in asymmetric catalysis by virtue of their high reactivity and selectivity. Planar-chiral-only rhodium and iridium cyclopentadienyl complexes are particularly promising due to unrestricted chemical space for Cp ligand design while retaining structural simplicity. However, they are currently still niche because of a lack of efficient synthetic strategies that avoid lengthy chiral auxiliary routes or chiral preparatory HPLC resolution of the complexes.

Simulations of chemical reactivity in condensed phase systems represent an ongoing challenge in computational chemistry, where traditional quantum chemical approaches typically struggle with both the size of the system and the potential complexity of the reaction. Here, we introduce a workflow aimed at efficiently training neural network potentials (NNPs) to explore energy barriers in solution at the hybrid density functional theory level. The computational burden associated with training at the PBE0-D3(BJ) level is bypassed through the use of active and transfer learning techniques, whereas extensive sampling of the transition state region is accelerated by well-tempered metadynamics simulations using multiple time step integration.

Molecules with Hund's rule violations between low-lying singlet and triplet states may enable a new generation of fluorescent emitters. However, only a few classes of molecules are known with this property at the current time. Here, we use a high-throughput screening algorithm of the FORMED database to uncover a class of compounds where the first excited state violates Hund's rule.

Bayesian optimization (BO) is an efficient method for solving complex optimization problems, including those in chemical research, where it is gaining significant popularity. Although effective in guiding experimental design, BO does not account for experimentation costs: testing readily available reagents under different conditions could be more cost- and time-effective than synthesizing or buying additional ones. To address this issue, we present cost-informed BO (CIBO), an approach tailored for the rational planning of chemical experimentation that prioritizes the most cost-effective experiments.
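
The idea of weighing expected benefit against experimentation cost can be illustrated with a minimal sketch. It is not the CIBO algorithm itself: the scoring rule (acquisition value divided by cost), the `base_cost` term, and all names and numbers are assumptions made for illustration. Reagents already in the inventory are treated as adding no further cost.

```python
def experiment_cost(reagents, prices, inventory, base_cost=1.0):
    """Cost of running one experiment: a fixed base cost plus any reagent not yet in stock."""
    return base_cost + sum(prices[r] for r in reagents if r not in inventory)


def pick_next_experiment(candidates, prices, inventory):
    """Return the candidate with the highest acquisition value per unit cost.

    Each candidate is a dict with hypothetical keys "name", "acquisition"
    (value from any acquisition function) and "reagents" (list of names).
    """
    def score(candidate):
        cost = experiment_cost(candidate["reagents"], prices, inventory)
        return candidate["acquisition"] / cost

    return max(candidates, key=score)


if __name__ == "__main__":
    # Made-up prices and acquisition values, for illustration only.
    prices = {"ligand_A": 2.0, "ligand_B": 4.0}
    candidates = [
        {"name": "A at 80 C", "acquisition": 0.5, "reagents": ["ligand_A"]},
        {"name": "B at 80 C", "acquisition": 0.8, "reagents": ["ligand_B"]},
    ]
    print(pick_next_experiment(candidates, prices, inventory={"ligand_A"})["name"])
```

With `ligand_A` in stock, the cheaper experiment wins despite its lower acquisition value; once `ligand_B` has been purchased, the ranking flips.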

Molecules where the lowest excited singlet state is lower in energy than the lowest triplet are highly promising for a number of organic materials applications as efficiency limitations stemming from spin statistics are overcome. All molecules known to possess such singlet-triplet inversions exhibit a pattern of spatially alternating but nonoverlapping HOMO and LUMO orbitals, meaning the lowest excited states are of a local character. Here, we demonstrate that derivatives of the bicyclic hydrocarbon calicene exhibit Hund's rule violations in charge-transfer (CT) states between its rings.

Singlet fission has shown potential for boosting the efficiency of solar cells, but the scarcity of suitable molecular materials hinders its implementation. We introduce an uncertainty-controlled genetic algorithm (ucGA) based on ensemble machine learning predictions from different molecular representations that concurrently optimizes excited state energies, synthesizability, and exciton size for the discovery of singlet fission materials. The ucGA allows us to efficiently explore the chemical space spanned by the reFORMED fragment database, which consists of 45,000 cores and 5,000 substituents derived from crystallographic structures assembled in the FORMED repository.

Symmetry-adapted perturbation theory (SAPT) is a popular and versatile tool to compute and decompose noncovalent interaction energies between molecules. The intramolecular SAPT (ISAPT) variant provides a similar energy decomposition between two nonbonded fragments of the same molecule, covalently connected by a third fragment. In this work, we explore an alternative approach where the noncovalent interaction is singled out by a range separation of the Coulomb potential.

Exploiting crystallographic data repositories for large-scale quantum chemical computations requires the rapid and accurate extraction of the molecular structure, charge and spin from the crystallographic information file. Here, we develop a general approach to assign the ground state spin of transition metal complexes, in complement to our previous efforts on determining metal oxidation states and bond order within the software. Starting from a database of 31k transition metal complexes extracted from the Cambridge Structural Database with , we construct the TM-GSspin dataset, which contains 2063 mononuclear first row transition metal complexes and their computed ground state spins.

Geometric deep learning models, which incorporate the relevant molecular symmetries within the neural network architecture, have considerably improved the accuracy and data efficiency of predictions of molecular properties. Building on this success, we introduce 3DReact, a geometric deep learning model to predict reaction properties from three-dimensional structures of reactants and products. We demonstrate that the invariant version of the model is sufficient for existing reaction data sets.

The prediction of reaction selectivity is a challenging task for computational chemistry, not only because many molecules adopt multiple conformations but also due to the exponential relationship between effective activation energies and rate constants. To account for molecular flexibility, an increasing number of methods exist that generate conformational ensembles of transition state (TS) structures. Typically, these TS ensembles are Boltzmann weighted and used to compute selectivity assuming Curtin-Hammett conditions.
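
The Boltzmann weighting step can be sketched as follows, assuming relative free energies in kcal/mol for the transition state conformers leading to each product. Under Curtin-Hammett conditions the product ratio is taken as the ratio of the Boltzmann sums over the two TS ensembles. The function names, the 298.15 K default, and the example energies are illustrative assumptions, not values from the article.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def boltzmann_weights(energies, temperature=298.15):
    """Normalized Boltzmann weights for relative free energies in kcal/mol."""
    rt = R_KCAL * temperature
    e_min = min(energies)  # shift by the minimum for numerical stability
    factors = [math.exp(-(e - e_min) / rt) for e in energies]
    total = sum(factors)
    return [f / total for f in factors]


def selectivity_ratio(ts_major, ts_minor, temperature=298.15):
    """Product ratio as the ratio of Boltzmann sums over two TS ensembles."""
    rt = R_KCAL * temperature
    e_ref = min(ts_major + ts_minor)  # common reference energy for both ensembles
    q_major = sum(math.exp(-(e - e_ref) / rt) for e in ts_major)
    q_minor = sum(math.exp(-(e - e_ref) / rt) for e in ts_minor)
    return q_major / q_minor


if __name__ == "__main__":
    # Made-up conformer energies (kcal/mol), for illustration only.
    print(selectivity_ratio([0.0, 0.4, 1.2], [1.5, 1.9]))
```

Because the energies enter through an exponential, an error of about 1 kcal/mol in a single low-lying conformer changes the predicted ratio by roughly a factor of five at room temperature.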

Molecular volcano plots, which facilitate the rapid prediction of the activity and selectivity of prospective catalysts, have emerged as powerful tools for computational catalysis. Here, we integrate microkinetic modeling into the volcano plot framework to develop "microkinetic molecular volcano plots". The resulting unified computational framework allows the influence of important reaction parameters, including temperature, reaction time, and concentration, to be quickly incorporated and more complex situations, such as off-cycle resting states and coupled catalytic cycles, to be tackled.

Chiral ligands are important components in asymmetric homogeneous catalysis, but their synthesis and screening can be both time-consuming and resource-intensive. Data-driven approaches, in contrast to screening procedures based on intuition, have the potential to reduce the time and resources needed for reaction optimization by more rapidly identifying an ideal catalyst. These approaches, however, are often nontransferable and cannot be applied across different reactions.

Frustrated Lewis pairs (FLPs), featuring reactive combinations of Lewis acids and Lewis bases, have been utilized for myriad metal-free homogeneous catalytic processes. Immobilizing the active Lewis sites to a solid support, especially to porous scaffolds, has shown great potential to ameliorate FLP catalysis by circumventing some of its inherent drawbacks, such as poor product separation and catalyst recyclability. Nevertheless, designing immobilized Lewis pair active sites (LPASs) is challenging due to the requirement of placing the donor and acceptor centers in appropriate geometric arrangements while maintaining the necessary chemical environment to perform catalysis, and clear design rules have not yet been established.

In recent years, there has been a surge of interest in predicting computed activation barriers, to enable the acceleration of the automated exploration of reaction networks. Consequently, various predictive approaches have emerged, ranging from graph-based models to methods based on the three-dimensional structure of reactants and products. In tandem, many representations have been developed to predict experimental targets, which may hold promise for barrier prediction as well.

A catalyst possessing a broad substrate scope, in terms of both turnover and enantioselectivity, is sometimes called "general". Despite their great utility in asymmetric synthesis, truly general catalysts are difficult or expensive to discover via traditional high-throughput screening and are, therefore, rare. Existing computational tools accelerate the evaluation of reaction conditions from a pre-defined set of experiments to identify the most general ones, but cannot generate entirely new catalysts with enhanced substrate breadth.

Structurally and conformationally diverse databases are needed to train accurate neural networks or kernel-based potentials capable of exploring the complex free energy landscape of flexible functional organic molecules. Curating such databases for species beyond "simple" drug-like compounds or molecules composed of well-defined building blocks (e.g.

Inverted singlet-triplet gaps may lead to novel molecular emitters if a rational design approach can be achieved. We uncover a substituent strategy that enables tuning of the gap and succeed in inducing inversion in near-gapless molecules. Based on known inverted-gap emitters, we design substituted analogs with even more negative singlet-triplet gaps than in the parent systems.

Recently, we introduced a class of molecular representations for kernel-based regression methods, the spectrum of approximated Hamiltonian matrices (SPAM), that takes advantage of lightweight one-electron Hamiltonians traditionally used as a self-consistent field initial guess. The original SPAM variant is built from occupied-orbital energies (i.e.

In this account, we discuss the use of genetic algorithms in the inverse design process of homogeneous catalysts for chemical transformations. We describe the main components of evolutionary experiments, specifically the nature of the fitness function to optimize, the library of molecular fragments from which potential catalysts are assembled, and the settings of the genetic algorithm itself. While not exhaustive, this review summarizes the key challenges and characteristics of our own (i.

In this minireview, we overview a computational pipeline developed within the framework of NCCR Catalysis that can be used to successfully reproduce the enantiomeric ratios of homogeneous catalytic reactions. At the core of this pipeline is the SCINE Molassembler module, a graph-based software that provides algorithms for molecular construction of all periodic table elements. With this pipeline, we are able to simultaneously functionalize and generate ensembles of transition state conformers, which permits facile exploration of the influence of various substituents on the overall enantiomeric ratio.

The high-throughput exploration and screening of molecules for organic electronics involves either a 'top-down' curation and mining of existing repositories, or a 'bottom-up' assembly of user-defined fragments based on known synthetic templates. Both are time-consuming approaches requiring significant resources to compute electronic properties accurately. Here, 'top-down' is combined with 'bottom-up' through automatic assembly and statistical models, thus providing a platform for the fragment-based discovery of organic electronic materials.

Molecules where the first excited singlet state is lower in energy than the first excited triplet state have the potential to revolutionize OLEDs. This inverted singlet-triplet gap violates Hund's rule, and currently only a few molecules are known to have this property. Here, we screen the complete set of non-alternant hydrocarbons consisting of 5-, 6-, and 7-membered rings fused into two-, three- and four-ring polycyclic systems.

Electrohelicity arises in molecules such as allene and spiropentadiene when their symmetry is reduced and helical frontier molecular orbitals (MOs) appear. Such molecules are optically active and electrohelicity has been suggested as a possible design principle for increasing the chiroptical response. Here we examine the fundamental link between electrohelicity and optical activity by studying the origin of the electric and magnetic transition dipole moments of the π-π* transitions.

The stepwise catalytic reduction of carbon dioxide (CO₂) to formic acid, formaldehyde, and methanol opens non-fossil pathways to important platform chemicals. The present article aims at identifying molecular control parameters to steer the selectivity to the three distinct reduction levels using organometallic catalysts of earth-abundant first-row metals. A linear scaling relationship was developed to map the intrinsic reactivity of 3d transition metal pincer complexes to their activity and selectivity in CO₂ hydrosilylation.
