Introduction: Constructing an accurate and comprehensive knowledge graph of specific diseases is critical for practical clinical disease diagnosis and treatment, reasoning and decision support, rehabilitation, and health management. For knowledge graph construction tasks such as named entity recognition and relation extraction, classical BERT-based methods require a large amount of training data to ensure model performance. However, real-world medical annotation data, especially disease-specific annotation samples, are very limited. In addition, existing models perform poorly at recognizing out-of-distribution entities and relations that were not seen during training.
Method: In this study, we present a novel and practical pipeline for constructing a heart failure knowledge graph using large language models and medical expert refinement. We apply prompt engineering to the three phases of knowledge graph construction: schema design, information extraction, and knowledge completion. The best performance is achieved by combining task-specific prompt templates with the TwoStepChat approach.
Results: Experiments on two datasets show that the TwoStepChat method outperforms both the Vanilla prompt and the fine-tuned BERT-based baselines. Moreover, our method saves 65% of the time compared to manual annotation and is better suited to extracting out-of-distribution information in the real world.
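The abstract does not spell out how TwoStepChat works, so the following is only a minimal sketch under a stated assumption: that the first turn asks the model which schema types occur in a sentence, and the second turn asks for the mentions of each type found. The entity types, prompt wording, and the `fake_llm` stand-in are all hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict, List

# Hypothetical schema: these entity types are illustrative only.
ENTITY_TYPES = ["Disease", "Symptom", "Drug", "Examination"]


def step_one_prompt(text: str, entity_types: List[str]) -> str:
    """First turn: ask which entity types from the schema occur in the text."""
    return (
        f'Given the sentence: "{text}"\n'
        f"Which of these entity types appear in it? {', '.join(entity_types)}\n"
        "Answer with a comma-separated list of types, or 'none'."
    )


def step_two_prompt(text: str, entity_type: str) -> str:
    """Second turn: ask for the mentions of one specific entity type."""
    return (
        f'Given the sentence: "{text}"\n'
        f"List every mention of type {entity_type}, separated by semicolons."
    )


def two_step_extract(
    text: str,
    ask: Callable[[str], str],
    entity_types: List[str] = ENTITY_TYPES,
) -> Dict[str, List[str]]:
    """Run the assumed two-turn extraction with any prompt -> reply function."""
    reply = ask(step_one_prompt(text, entity_types))
    named = {part.strip().lower() for part in reply.split(",")}
    # Keep only types that belong to the schema, so stray output is ignored.
    found = [t for t in entity_types if t.lower() in named]

    result: Dict[str, List[str]] = {}
    for entity_type in found:
        answer = ask(step_two_prompt(text, entity_type))
        mentions = [m.strip() for m in answer.split(";") if m.strip()]
        if mentions:
            result[entity_type] = mentions
    return result


def fake_llm(prompt: str) -> str:
    """Stand-in for a real chat model, used only to make the sketch runnable."""
    if "Which of these entity types" in prompt:
        return "Disease, Drug"
    if "type Disease" in prompt:
        return "heart failure"
    if "type Drug" in prompt:
        return "furosemide; spironolactone"
    return ""
```

In practice `ask` would wrap a call to whichever chat model is used. Passing it in as a plain function keeps the prompting logic testable without network access.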
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11250484 | PMC |
| http://dx.doi.org/10.3389/fncom.2024.1389475 | DOI Listing |
Brief Bioinform
November 2024
Suzhou Key Lab of Multi-modal Data Fusion and Intelligent Healthcare, No. 1188 Wuzhong Avenue, Wuzhong District Suzhou, Suzhou 215004, China.
The automatic and accurate extraction of diverse biomedical relations from literature constitutes the core elements of medical knowledge graphs, which are indispensable for healthcare artificial intelligence. Currently, fine-tuning through stacking various neural networks on pre-trained language models (PLMs) represents a common framework for end-to-end resolution of the biomedical relation extraction (RE) problem. Nevertheless, sequence-based PLMs, to a certain extent, fail to fully exploit the connections between semantics and the topological features formed by these connections.
Entropy (Basel)
January 2025
School of Electronic and Information, Northwestern Polytechnical University, Xi'an 710129, China.
Artificial intelligence plays an indispensable role in improving productivity and promoting social development, and causal discovery is one of the most important research directions in this field. Directed acyclic graphs (DAGs) are the most commonly used tool in causal modeling because of their excellent interpretability and structural properties. However, when data are insufficient, the accuracy and efficiency of DAG learning are greatly reduced, resulting in a false perception of causality.
Entropy (Basel)
January 2025
National Key Laboratory of Fundamental Science on Synthetic Vision, Sichuan University, Chengdu 610065, China.
Graph anomaly detection is crucial in many high-impact applications across diverse fields. In anomaly detection tasks, collecting plenty of annotated data tends to be costly and laborious. As a result, few-shot learning has been explored to address the issue by requiring only a few labeled samples to achieve good performance.
F1000Res
January 2025
Department of Data Science, Prasanna School of Public Health, Manipal Academy of Higher Education, Manipal, Karnataka, 576104, India.
Introduction: Numerous studies have concluded that functional ingredients benefit human health. Likewise, recent years have seen exponential growth in functional foods in bakery product segments such as breads and biscuits. However, there is a lack of information on functional ingredients and their usefulness in developing functional bakery products.
BMC Biol
January 2025
Research Office, City University of Hong Kong (Dongguan), Dongguan, 523000, China.
Background: Recent advancements in single-cell RNA sequencing have greatly expanded our knowledge of the heterogeneous nature of tissues. However, robust and accurate cell type annotation continues to be a major challenge, hindered by issues such as marker specificity, batch effects, and a lack of comprehensive spatial and interaction data. Traditional annotation methods often fail to adequately address the complexity of cellular interactions and gene regulatory networks.