The rapidly growing volume of biomedical reports, each containing numerous entities and detailed information, is a rich resource for biomedical text-mining applications. These tools enable investigators to integrate, conceptualize, and translate these discoveries to uncover new insights into disease pathology and therapeutics. In this protocol, we present CaseOLAP LIFT, a new computational pipeline to investigate cellular components and their disease associations by extracting user-selected information from text datasets (e.g., biomedical literature). The software identifies sub-cellular proteins and their functional partners within disease-relevant documents. Additional disease-relevant documents are identified via the software's label imputation method. To contextualize the resulting protein-disease associations and to integrate information from multiple relevant biomedical resources, a knowledge graph is automatically constructed for further analyses. As a use case, we apply the method to a corpus of ~34 million text documents downloaded online to elucidate the role of mitochondrial proteins in distinct cardiovascular disease phenotypes. Furthermore, a deep learning model was applied to the resulting knowledge graph to predict previously unreported relationships between proteins and disease, yielding 1,583 associations with predicted probabilities >0.90; the model achieved an area under the receiver operating characteristic curve (AUROC) of 0.91 on the test set. The software features a highly customizable and automated workflow and supports a broad scope of raw data for analysis; with this method, protein-disease associations can therefore be identified within a text corpus with enhanced reliability.
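The two evaluation quantities mentioned in the abstract, an AUROC on a test set and a probability cutoff of 0.90 for reporting predicted associations, can be illustrated with a minimal sketch. This is not the CaseOLAP LIFT implementation: the function names, the toy labels and scores, and the placeholder protein and disease identifiers below are all assumptions made up for illustration.

```python
from typing import List, Tuple

Prediction = Tuple[str, str, float]  # (protein, disease, predicted probability)


def auroc(labels: List[int], scores: List[float]) -> float:
    """AUROC as the probability that a randomly chosen positive example
    is scored above a randomly chosen negative one (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def high_confidence(predictions: List[Prediction],
                    threshold: float = 0.90) -> List[Prediction]:
    """Keep only predictions whose probability is strictly above the threshold."""
    return [pred for pred in predictions if pred[2] > threshold]


# Toy test-set labels and scores (hypothetical, not from the paper).
toy_labels = [1, 1, 0, 0]
toy_scores = [0.95, 0.40, 0.60, 0.10]

# Toy predicted associations with placeholder identifiers (hypothetical).
toy_predictions: List[Prediction] = [
    ("PROT_A", "CVD_1", 0.97),
    ("PROT_B", "CVD_2", 0.90),
    ("PROT_C", "CVD_1", 0.42),
]

if __name__ == "__main__":
    print(auroc(toy_labels, toy_scores))
    print(high_confidence(toy_predictions))
```

The pairwise form of AUROC is quadratic in the number of examples and is used here only because it makes the definition explicit; for real test sets a rank-based routine from an established library would be the more practical choice.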
Source: http://dx.doi.org/10.3791/65084 (DOI Listing)
PLoS Comput Biol
January 2025
School of Mathematics, Harbin Institute of Technology, Harbin, China.
The rapid advance of large-scale, atlas-level single-cell RNA sequencing and single-cell chromatin accessibility data provides extraordinary avenues to broad and deep insight into complex biological mechanisms. Leveraging these datasets and transferring labels from scRNA-seq to scATAC-seq will empower the exploration of single-cell omics data. However, current label transfer methods have limited performance, largely because of their limited ability to preserve fine-grained cell populations and to handle intrinsic or extrinsic heterogeneity between datasets.
PLoS One
January 2025
Automation School, Guangdong University of Petrochemical Technology, Maoming, Guangdong, China.
Centrifugal compressors are widely used in the oil and natural gas industry for gas compression, reinjection, and transportation. Fault diagnosis and identification of centrifugal compressors are crucial. To promptly monitor abnormal changes in compressor data and trace the causes leading to these data anomalies, this paper proposes a security monitoring and root cause tracing method for compressor data anomalies.
PLoS One
January 2025
School of Literature, Huaiyin Normal University, Huaian, China.
The fine-grained mining and construction of semantic associations within multimodal intangible cultural heritage (ICH) resources are crucial for deepening our understanding of their knowledge content and ensuring their systematic protection and transmission in the digital and intelligent era. This paper addresses the urgent need for the digital preservation and transmission of ICH resources. Following a review of current research on Qingyang sachets and ICH, the study introduces an ontology-based approach to constructing a semantic description model for the multimodal digital resources related to Qingyang sachets.
Sci Rep
January 2025
Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, China.
Alzheimer's Disease (AD) severely undermines human dignity and quality of life. Although a newly approved amyloid immunotherapy has been reported, effective AD drugs remain to be identified. Here, we propose a novel AI-driven drug-repurposing method, DeepDrug, to identify a lead combination of approved drugs to treat AD patients.
MethodsX
June 2025
Fakulti Teknologi Maklumat dan Komunikasi, Universiti Teknikal Malaysia Melaka, 76100 Melaka, Malaysia.
This study explores the possibility of integrating and retrieving heterogeneous data across platforms by using ontology graph databases to enhance educational insights and enable advanced data-driven decision-making. Motivated by the ontologies of several well-known universities and other higher education institutions, the study refines existing entities and introduces new ones to address a new topic identified in a preliminary interview conducted for the study. The paper also proposes an innovative ontology, referred to as Student Performance and Course, to enhance the faculty's resource management and evaluation mechanisms for course, student, and MOOC performance.