Feature selection is essential in the analysis of molecular systems and many other fields, but several uncertainties remain: What is the optimal number of features for a simplified, interpretable model that retains essential information? How should features with different units be aligned, and how should their relative importance be weighted? Here, we introduce the Differentiable Information Imbalance (DII), an automated method to rank information content between sets of features. Using distances in a ground truth feature space, DII identifies a low-dimensional subset of features that best preserves these relationships. Each feature is scaled by a weight, which is optimized by minimizing the DII through gradient descent.
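As a rough illustration of the weight optimization described above, the following Python sketch assumes a softmax-weighted form of the imbalance: for each point, the ground-truth distance ranks of all other points are averaged with weights exp(-d²/λ) computed in the weighted feature space. The abstract does not spell out this formula, so the functional form, the function names, the toy data, the value of λ, the finite-difference gradient and the backtracking step size are all illustrative assumptions, not the authors' implementation.

```python
import math
import random


def ranks_from_features(X):
    """ranks[i][j] = neighbour rank of j as seen from i (1 = nearest)."""
    n = len(X)
    ranks = [[0] * n for _ in range(n)]
    for i in range(n):
        d = sorted(
            (sum((a - b) ** 2 for a, b in zip(X[i], X[j])), j)
            for j in range(n) if j != i
        )
        for r, (_, j) in enumerate(d, start=1):
            ranks[i][j] = r
    return ranks


def soft_imbalance(X, w, ranks, lam):
    """Assumed DII-like loss: softmax-weighted mean target rank, times 2/N."""
    n = len(X)
    total = 0.0
    for i in range(n):
        d2 = {
            j: sum((wk * (a - b)) ** 2 for wk, a, b in zip(w, X[i], X[j]))
            for j in range(n) if j != i
        }
        dmin = min(d2.values())  # shift exponents for numerical stability
        e = {j: math.exp(-(v - dmin) / lam) for j, v in d2.items()}
        z = sum(e.values())
        total += sum(e[j] / z * ranks[i][j] for j in e)
    return 2.0 * total / n ** 2


def optimise_weights(X, ranks, w, lam, steps=30, lr=1.0, eps=1e-4):
    """Gradient descent with central finite differences and backtracking."""
    loss = soft_imbalance(X, w, ranks, lam)
    for _ in range(steps):
        grad = []
        for k in range(len(w)):
            up = w[:k] + [w[k] + eps] + w[k + 1:]
            dn = w[:k] + [w[k] - eps] + w[k + 1:]
            diff = soft_imbalance(X, up, ranks, lam) - soft_imbalance(X, dn, ranks, lam)
            grad.append(diff / (2 * eps))
        step = lr
        while step > 1e-6:
            trial = [wk - step * g for wk, g in zip(w, grad)]
            trial_loss = soft_imbalance(X, trial, ranks, lam)
            if trial_loss < loss:  # accept only steps that lower the loss
                w, loss = trial, trial_loss
                break
            step /= 2
        else:
            break  # no improving step found; stop early
    return w, loss


# Toy data (hypothetical): features 0 and 1 define the ground truth,
# feature 2 is pure noise.
rng = random.Random(0)
X = [[rng.random() for _ in range(3)] for _ in range(30)]
truth_ranks = ranks_from_features([row[:2] for row in X])

lam = 0.05  # softmax width, chosen by hand for this toy example
start = [1.0, 1.0, 1.0]
initial_loss = soft_imbalance(X, start, truth_ranks, lam)
weights, final_loss = optimise_weights(X, truth_ranks, start, lam)
print("initial loss:", initial_loss)
print("final loss:", final_loss)
print("weights:", weights)
```

In this toy setting the weight of the noise feature should shrink relative to the two informative ones. A practical implementation would more likely use analytic or automatic differentiation than finite differences, which are used here only to keep the sketch free of dependencies.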
Water adsorption and dissociation processes on pristine low-index TiO₂ interfaces are important but poorly understood outside the well-studied anatase (101) and rutile (110) surfaces. To understand these processes, we construct three sets of machine learning potentials that are simultaneously applicable to various TiO₂ surfaces, based on three density-functional-theory approximations. Here we show the water dissociation free energies on seven pristine TiO₂ surfaces and predict that anatase (100), anatase (110), rutile (001), and rutile (011) favor water dissociation, whereas anatase (101) and rutile (100) show mostly molecular adsorption. The simulations of rutile (110) depend sensitively on the slab thickness, with molecular adsorption preferred for thick slabs.
Physical catalysts often have multiple sites where reactions can take place. One prominent example is single-atom alloys, where the reactive dopant atoms can preferentially locate in the bulk or at different sites on the surface of the nanoparticle. However, ab initio modeling of catalysts usually only considers one site of the catalyst, neglecting the effects of multiple sites.