In this study, we present Marie, a question answering (QA) system for chemistry that uses a text-to-text pretrained language model to attain accurate data retrieval. The underlying data store is "The World Avatar" (TWA), a general world model consisting of a knowledge graph that evolves over time. TWA includes information about chemical species, such as their chemical and physical properties, applications, and chemical classifications. Building upon our previous work on knowledge graph question answering (KGQA) for chemistry, this advanced version of Marie leverages a fine-tuned Flan-T5 model to translate natural language questions directly into SPARQL queries, with no separate components for entity and relation linking. The developed QA system provides accurate results for complex queries that involve many relation hops and balances correctness and speed for real-world usage. This new approach offers significant advantages over the prior implementation, which relied on knowledge graph embedding. Specifically, the updated system achieves high accuracy and accommodates changes to, and evolution of, the data stored in the knowledge graph without retraining. Our evaluation results underscore the efficacy of the improved system, highlighting its superior accuracy and its ability to answer complex questions compared to its predecessor.
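
The translation step described in the abstract can be pictured as a two-stage pipeline: a sequence-to-sequence model turns the question into a SPARQL string, and that string is then run against the knowledge graph. The sketch below is only an illustration under stated assumptions, not the authors' implementation. The function names, the prompt prefix, the `ex:` predicates, and the canned result row are all hypothetical. The `google/flan-t5-base` checkpoint is the public base model, which would not emit valid SPARQL without fine-tuning on question/query pairs.

```python
from typing import Callable, Dict, List

# A translator maps a natural language question to a SPARQL string;
# an executor runs a SPARQL string and returns result rows.
Translator = Callable[[str], str]
Executor = Callable[[str], List[Dict[str, str]]]


def make_flan_t5_translator(
    model_name: str = "google/flan-t5-base",       # assumption: stand-in for a fine-tuned checkpoint
    prefix: str = "translate to SPARQL: ",         # assumption: hypothetical prompt prefix
) -> Translator:
    """Build a translator backed by a Hugging Face seq2seq model."""
    # Imported lazily so the rest of the sketch runs without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    def translate(question: str) -> str:
        inputs = tokenizer(prefix + question, return_tensors="pt")
        output_ids = model.generate(**inputs, max_new_tokens=256)
        return tokenizer.decode(output_ids[0], skip_special_tokens=True)

    return translate


def answer(question: str, translate: Translator, execute: Executor) -> List[Dict[str, str]]:
    """Question -> SPARQL -> rows, with no separate entity or relation linking step."""
    sparql = translate(question)
    return execute(sparql)


# Stubs so the pipeline can be exercised offline. The query shape and the
# ex: vocabulary are invented for illustration and are not TWA's schema.
def stub_translate(question: str) -> str:
    return (
        "PREFIX ex: <http://example.org/> "
        'SELECT ?bp WHERE { ?s ex:label "benzene" ; ex:boilingPoint ?bp . }'
    )


def stub_execute(sparql: str) -> List[Dict[str, str]]:
    # Canned row standing in for a real SPARQL endpoint response.
    return [{"bp": "80.1 degC"}] if "ex:boilingPoint" in sparql else []


rows = answer("What is the boiling point of benzene?", stub_translate, stub_execute)
```

Because the model emits a query rather than an answer, the data in the graph can change without touching the model, which is consistent with the flexibility the abstract claims. Swapping `stub_translate` for `make_flan_t5_translator()` and `stub_execute` for a client of a real endpoint would leave `answer` unchanged.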


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10976360
DOI: http://dx.doi.org/10.1021/acsomega.3c08842


Similar Publications

Multivariate time series anomaly detection (MTSAD) can effectively identify and analyze anomalous behavior in complex systems, which is particularly important in fields such as financial monitoring, industrial equipment fault detection, and cybersecurity. The need to analyze temporal dependencies and inter-variable relationships simultaneously has prompted researchers to develop specialized deep learning models to detect anomalous patterns. In this paper, we present a structured and comprehensive overview of the latest deep learning techniques for multivariate time series anomaly detection.


Generation of Rational Drug-like Molecular Structures Through a Multiple-Objective Reinforcement Learning Framework.

Molecules

December 2024

Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, China.

As an appealing approach for discovering novel leads, the key advantage of de novo drug design lies in its ability to explore a much broader dimension of chemical space, without being confined to the knowledge of existing compounds. So far, many generative models have been described in the literature, which have completely redefined the concept of de novo drug design. However, many of them lack practical value for real-world drug discovery.


Uncertainty modeling for inductive knowledge graph embedding.

Neural Netw

January 2025

College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518060, China; Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen University, Shenzhen, 518060, China.

Article Synopsis
  • In refining Knowledge Graphs, new entities appear and old ones change, leading to a problem of distribution shift for entity features during representation learning.
  • Most current methods for embedding these graphs mainly focus on new entities and overlook the issues caused by this distribution shift.
  • The proposed model, EDSU, uses mean and variance reconstruction to address this shift by integrating both the characteristics of entity embeddings and their neighborhood structures, resulting in improved performance in inductive link prediction tasks compared to existing models.

The human brain connectome is characterized by the duality of highly modular structure and efficient integration, supporting information processing. Newborns with congenital heart disease (CHD), prematurity, or spina bifida aperta (SBA) constitute a population at risk for altered brain development and developmental delay (DD). We hypothesize that, independent of etiology, alterations of connectomic organization reflect neural circuitry impairments in cognitive DD.


The study of everyday life has garnered significant research attention in various disciplines. However, in the field of design history, the exploration of everyday life remains in its early stages. There is a need for further organization and analysis, as there is currently no comprehensive exposition on the overall research progress in this field.

