When existing experimental data are combined with machine learning (ML) to predict the performance of new materials, the bias in data acquisition determines both the usefulness of ML and its prediction accuracy. In this context, two conditions are very common: (i) constructing new unbiased data sets is too expensive, and a limited number of novel measurements does not effectively change the global knowledge; (ii) the performance of the material depends on a limited number of physical parameters, far fewer than the number of variables that can be changed, although these parameters are unknown or not measurable. To determine the usefulness of ML under these conditions, we introduce the concept of simulated research landscapes, which describe how datasets of arbitrary complexity evolve over time. Simulated research landscapes allow us to apply different discovery strategies and compare standard materials exploration with ML-guided exploration, i.e. we can quantitatively measure the benefit of using a specific ML model. We show that there is a window of opportunity to obtain a significant benefit from ML-guided strategies. ML can be adopted too soon (not enough information to find patterns) or too late (dense datasets allow only a negligible ML benefit), and in some cases adopting ML can even slow down the discovery process. We offer a qualitative guide on when ML can accelerate the discovery of new best-performing materials in a field under specific conditions. The answer in each case depends on factors such as data dimensionality, corrugation and data collection strategy. We consider how these factors may affect the ML prediction capabilities and discuss some general trends.
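The comparison described above can be illustrated with a toy example. The following is only a minimal sketch under stated assumptions, not the model used in the paper: the landscape function `performance`, the nearest-neighbour predictor `knn_predict`, the `explore` loop and all parameter values are hypothetical choices made for illustration. Materials are points in a five-dimensional variable space, while performance depends on a single hidden combination of two of those variables, mirroring condition (ii). Both strategies start from the same initial dataset and see the same candidate pools, so the only difference is how the next material is chosen.

```python
import random


def performance(x):
    """Hypothetical landscape: depends on one hidden combination of variables."""
    hidden = 0.7 * x[0] + 0.3 * x[1]  # assumed hidden physical parameter
    return 1.0 - abs(hidden - 0.8)    # peak performance 1.0 at hidden = 0.8


def knn_predict(x, known, k=3):
    """Predict performance as the mean of the k nearest measured materials."""
    nearest = sorted(
        known,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(x, item[0])),
    )[:k]
    return sum(y for _, y in nearest) / len(nearest)


def explore(strategy, n_initial=10, n_steps=30, dim=5, pool_size=20, seed=0):
    """Return the best performance found so far after each new measurement."""
    rng = random.Random(seed)

    def sample():
        return [rng.random() for _ in range(dim)]

    # Existing dataset available before the strategy is adopted.
    known = []
    for _ in range(n_initial):
        x = sample()
        known.append((x, performance(x)))
    best = max(y for _, y in known)

    history = []
    for _ in range(n_steps):
        pool = [sample() for _ in range(pool_size)]  # candidate materials
        if strategy == "ml":
            # ML-guided: measure the candidate with the highest prediction.
            x = max(pool, key=lambda c: knn_predict(c, known))
        else:
            # Standard exploration (here: an arbitrary candidate from the pool).
            x = pool[0]
        y = performance(x)
        known.append((x, y))
        best = max(best, y)
        history.append(best)
    return history


if __name__ == "__main__":
    random_best = explore("random")
    ml_best = explore("ml")
    print("random final best:", round(random_best[-1], 3))
    print("ml-guided final best:", round(ml_best[-1], 3))
```

Varying `n_initial`, `dim` or the shape of `performance` in this sketch is one way to probe how dataset density, dimensionality and corrugation change the gap between the two curves; whether the ML-guided run comes out ahead depends on those choices.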
DOI: http://dx.doi.org/10.1039/d1cp01761f