Solid catalyst development has traditionally relied on trial-and-error approaches, limiting the broader application of valuable insights across different catalyst families. To overcome this fragmentation, we introduce a framework that integrates high-throughput experimentation (HTE) and automatic feature engineering (AFE) with active learning to acquire comprehensive catalyst knowledge. The framework is demonstrated for oxidative coupling of methane (OCM), where active learning is continued until the machine learning model achieves robustness for each of the BaO-, CaO-, La₂O₃-, TiO₂-, and ZrO₂-supported catalysts, with 333 catalysts newly tested.
The empirical aspect of descriptor design in catalyst informatics, particularly when confronted with limited data, necessitates adequate prior knowledge for delving into unknown territories, thus presenting a logical contradiction. This study introduces a technique for automatic feature engineering (AFE) that works on small catalyst datasets, without reliance on specific assumptions or pre-existing knowledge about the target catalysis when designing descriptors and building machine-learning models. This technique generates numerous features through mathematical operations on general physicochemical features of catalytic components and extracts relevant features for the desired catalysis, essentially screening numerous hypotheses on a machine.
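The generate-then-extract idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the operation set, the feature names (`en`, `radius`), the toy data, and the use of a plain Pearson-correlation ranking in place of the study's own feature-selection procedure are all assumptions made for illustration.

```python
import math
from itertools import combinations

# Hypothetical set of unary operations applied to each primary feature.
UNARY_OPS = {
    "id": lambda x: x,
    "sq": lambda x: x * x,
    "sqrt": lambda x: math.sqrt(abs(x)),
    "log": lambda x: math.log(abs(x) + 1e-9),
    "inv": lambda x: 1.0 / x if x != 0 else 0.0,
}


def generate_features(rows, names):
    """Expand primary features into transformed features and their pairwise products."""
    first_order = {}
    for j, name in enumerate(names):
        for op_name, op in UNARY_OPS.items():
            first_order[f"{op_name}({name})"] = [op(row[j]) for row in rows]
    expanded = dict(first_order)
    for (a, col_a), (b, col_b) in combinations(first_order.items(), 2):
        expanded[f"{a}*{b}"] = [x * y for x, y in zip(col_a, col_b)]
    return expanded


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def select_features(features, target, k=3):
    """Keep the k generated features most strongly correlated with the target."""
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], target)),
        reverse=True,
    )
    return ranked[:k]


# Toy data (made up): two primary features per catalyst and a target
# that happens to equal their product.
names = ["en", "radius"]
rows = [(1.0, 2.0), (2.0, 1.5), (3.0, 4.0), (4.0, 0.5), (5.0, 3.0)]
target = [a * b for a, b in rows]

features = generate_features(rows, names)
selected = select_features(features, target, k=3)
print(len(features), selected)
```

Each generated feature acts as one candidate hypothesis about what drives the target, and the ranking step screens them all at once. On small datasets a cross-validated, regularized selector would be a more defensible choice than raw correlation, which is used here only to keep the sketch short.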
Designing high-performance catalysts for the oxidative coupling of methane (OCM) reaction is often hindered by inconsistent catalyst data, which makes it difficult to extract information such as the combinatorial effects of elements on catalyst performance and to reach yields beyond a particular threshold. To investigate C₂ yields more systematically, high-throughput experiments are conducted to mass-produce catalyst data with greater consistency and structure. Graph theory is applied to transform the high-throughput data into networks that visualize underlying trends, and these networks are then used to design new catalysts that potentially give high C₂ yields in the OCM reaction.
Identification of catalysts is a difficult matter as catalytic activities involve a vast number of complex features that each catalyst possesses. Here, catalysis gene expression profiling is proposed from unique features discovered in catalyst data collected by high-throughput experiments as an alternative way of representing the catalysts. Combining constructed catalyst gene sequences with hierarchical clustering results in catalyst gene expression profiling where natural language processing is used to identify similar catalysts based on edit distance.
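The edit-distance comparison mentioned above can be sketched as follows. This is an illustrative example only: the gene strings and catalyst names are invented, the Levenshtein distance is assumed as the edit-distance measure, and the hierarchical clustering step is omitted.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(
                prev[j] + 1,         # delete char_a
                curr[j - 1] + 1,     # insert char_b
                prev[j - 1] + cost,  # substitute or match
            ))
        prev = curr
    return prev[-1]


def most_similar(query: str, genes: dict) -> str:
    """Return the catalyst whose gene sequence is closest to the query's."""
    candidates = (name for name in genes if name != query)
    return min(candidates, key=lambda name: edit_distance(genes[query], genes[name]))


# Hypothetical catalyst "gene" sequences, one symbol per encoded feature.
genes = {
    "cat_A": "AABBCA",
    "cat_B": "AABBCC",
    "cat_C": "CCABBA",
}

print(most_similar("cat_A", genes))
```

A pairwise distance matrix built with `edit_distance` could then be passed to a hierarchical clustering routine to group catalysts with similar sequences.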