Property prediction accuracy has long been a key parameter of machine learning in materials informatics. Accordingly, advanced models showing state-of-the-art performance turn into highly parameterized black boxes missing interpretability. Here, we present an elegant way to make their reasoning transparent. Human-readable text-based descriptions automatically generated within a suite of open-source tools are proposed as materials representation. Transformer language models pretrained on 2 million peer-reviewed articles take as input well-known terms such as chemical composition, crystal symmetry, and site geometry. Our approach outperforms crystal graph networks by classifying four out of five analyzed properties if one considers all available reference data. Moreover, fine-tuned text-based models show high accuracy in the ultra-small data limit. Explanations of their internal machinery are produced using local interpretability techniques and are faithful and consistent with domain expert rationales. This language-centric framework makes accurate property predictions accessible to people without artificial-intelligence expertise.
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10591138 | PMC |
http://dx.doi.org/10.1016/j.patter.2023.100803 | DOI Listing |
Nucleic Acids Res
January 2025
London Institute for Mathematical Sciences Royal Institution, 21 Albemarle St, London W1S 4BS, UK.
Recent advancements in genomics, propelled by artificial intelligence, have unlocked unprecedented capabilities in interpreting genomic sequences, mitigating the need for exhaustive experimental analysis of complex, intertwined molecular processes inherent in DNA function. A significant challenge, however, resides in accurately decoding genomic sequences, which inherently involves comprehending rich contextual information dispersed across thousands of nucleotides. To address this need, we introduce GENA language model (GENA-LM), a suite of transformer-based foundational DNA language models capable of handling input lengths up to 36 000 base pairs.
View Article and Find Full Text PDFHealthc Technol Lett
January 2025
This study aimed to develop an advanced ensemble approach for automated classification of mental health disorders in social media posts. The research question was: can an ensemble of fine-tuned transformer models (XLNet, RoBERTa, and ELECTRA) with Bayesian hyperparameter optimization improve the accuracy of mental health disorder classification in social media text. Three transformer models (XLNet, RoBERTa, and ELECTRA) were fine-tuned on a dataset of social media posts labelled with 15 distinct mental health disorders.
View Article and Find Full Text PDFSisli Etfal Hastan Tip Bul
December 2024
Department of Endocrinology and Metabolic Diseases, Ankara Training and Research Hospital, Ankara, Türkiye.
Objectives: Type 2 diabetes mellitus is a disease with a rising prevalence worldwide. Person-centered treatment factors, including comorbidities and treatment goals, should be considered in determining the pharmacological treatment of type 2 diabetes. ChatGPT-4 (Generative Pre-trained Transformer), a large language model, holds the potential performance in various fields, including medicine.
View Article and Find Full Text PDFPhilos Trans A Math Phys Eng Sci
January 2025
Indian Institute of Technology Gandhinagar, Gandhinagar, Gujarat, India.
Modern language models such as bidirectional encoder representations from transformers have revolutionized natural language processing (NLP) tasks but are computationally intensive, limiting their deployment on edge devices. This paper presents an energy-efficient accelerator design tailored for encoder-based language models, enabling their integration into mobile and edge computing environments. A data-flow-aware hardware accelerator design for language models inspired by Simba, makes use of approximate fixed-point POSIT-based multipliers and uses high bandwidth memory (HBM) in achieving significant improvements in computational efficiency, power consumption, area and latency compared to the hardware-realized scalable accelerator Simba.
View Article and Find Full Text PDFPLOS Digit Health
January 2025
Rwanda Ministry of Health, Kigali, Rwanda.
Community isolation of patients with communicable infectious diseases limits spread of pathogens but our understanding of isolated patients' needs and challenges is incomplete. Rwanda deployed a digital health service nationally to assist public health clinicians to remotely monitor and support SARS-CoV-2 cases via their mobile phones using daily interactive short message service (SMS) check-ins. We aimed to assess the texting patterns and communicated topics to better understand patient experiences.
View Article and Find Full Text PDFEnter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!