Systematic identification of latent disease-gene associations from PubMed articles.

PLoS One

Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, United States of America.

Published: February 2018

AI Article Synopsis

  • Recent advances in biomedical science reveal complex connections between molecular and cellular processes and diseases, but extracting clear insights from large data sets remains a challenge due to noise and complexity.
  • This study presents a computational framework that analyzes over 146,000 disease-gene associations from more than 25 million PubMed articles to uncover hidden disease mechanisms.
  • Key findings indicate that the framework effectively categorizes diseases, reveals scale-free network properties in disease associations, identifies enriched patterns within these networks, and links genes to specific biological processes.
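
The scale-free claim can be made concrete with a small sketch. The code below is an illustration only, not the study's method: the edge list and the names `degree_counts` and `powerlaw_exponent` are hypothetical, and the exponent uses the standard discrete maximum-likelihood approximation for a power-law tail. A fit on a toy graph does not establish that a network is scale-free.

```python
import math
from collections import Counter

def degree_counts(edges):
    """Count how many edges touch each node in an undirected edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def powerlaw_exponent(degrees, k_min=1):
    """Approximate MLE of the exponent for P(k) ~ k^(-alpha), using k >= k_min.

    Uses the discrete approximation alpha = 1 + n / sum(ln(k / (k_min - 0.5))).
    """
    tail = [k for k in degrees if k >= k_min]
    return 1.0 + len(tail) / sum(math.log(k / (k_min - 0.5)) for k in tail)

# Hypothetical toy network: one hub plus a few low-degree nodes.
edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d"), ("a", "b")]
deg = degree_counts(edges)
alpha = powerlaw_exponent(deg.values())
print(dict(deg), round(alpha, 3))
```

On real data, the estimate would normally be paired with a goodness-of-fit test and a comparison against alternative distributions before calling the network scale-free.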

Article Abstract

Recent scientific advances have produced a tremendous amount of biomedical knowledge, providing novel insights into the relationships between molecular and cellular processes and diseases. Literature mining is one of the commonly used methods for retrieving and extracting information from scientific publications to understand these associations. However, because the data are voluminous and the associations are complex and noisy, interpreting such association data for semantic knowledge discovery is challenging. In this study, we describe an integrative computational framework that aims to expedite the discovery of latent disease mechanisms by dissecting 146,245 disease-gene associations from over 25 million PubMed-indexed articles. We use both Latent Dirichlet Allocation (LDA) modeling and network-based analysis for their respective abilities to detect latent associations and to reduce noise in large-volume data. Our results demonstrate that (1) LDA-based modeling is able to group similar diseases into disease topics; (2) the disease-specific association networks follow the scale-free network property; (3) certain subnetwork patterns are enriched in the disease-specific association networks; and (4) genes are enriched in topic-specific biological processes. Our approach offers promising opportunities for latent disease-gene knowledge discovery in biomedical research.
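
To illustrate how LDA can group diseases into topics, here is a minimal sketch that treats each disease as a "document" and its associated genes as "words". Everything in it is an assumption made for illustration: the toy data, the placeholder disease and gene names, the hyperparameters, and the choice of a collapsed Gibbs sampler. The abstract does not say which LDA implementation or settings the authors used.

```python
import random
from collections import defaultdict

def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampler for LDA.

    docs: dict mapping a disease name to a list of gene symbols, one entry
          per association mention (repeated genes act as counts).
    Returns (theta, phi): per-disease topic mixtures and per-topic gene weights.
    """
    rng = random.Random(seed)
    vocab = sorted({g for genes in docs.values() for g in genes})
    n_vocab = len(vocab)

    doc_topic = {d: [0] * n_topics for d in docs}             # tokens of d in topic k
    topic_word = [defaultdict(int) for _ in range(n_topics)]  # count of gene g in topic k
    topic_total = [0] * n_topics                              # tokens in topic k
    assign = {d: [] for d in docs}                            # topic of each token

    # Random initial topic assignment for every (disease, gene) token.
    for d, genes in docs.items():
        for g in genes:
            k = rng.randrange(n_topics)
            assign[d].append(k)
            doc_topic[d][k] += 1
            topic_word[k][g] += 1
            topic_total[k] += 1

    for _ in range(n_iter):
        for d, genes in docs.items():
            for i, g in enumerate(genes):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][g] -= 1
                topic_total[k] -= 1
                # Resample its topic from the collapsed conditional.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][g] + beta)
                    / (topic_total[t] + n_vocab * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][g] += 1
                topic_total[k] += 1

    theta = {
        d: [(c + alpha) / (len(docs[d]) + n_topics * alpha) for c in counts]
        for d, counts in doc_topic.items()
    }
    phi = [
        {g: (topic_word[k][g] + beta) / (topic_total[k] + n_vocab * beta) for g in vocab}
        for k in range(n_topics)
    ]
    return theta, phi

# Hypothetical toy input: two pairs of diseases that share genes.
toy = {
    "disease_A": ["g1", "g2", "g3", "g1", "g2"],
    "disease_B": ["g1", "g2", "g3", "g3"],
    "disease_C": ["g7", "g8", "g9", "g7"],
    "disease_D": ["g8", "g9", "g7", "g9"],
}
theta, phi = fit_lda(toy, n_topics=2)
for disease, mix in theta.items():
    print(disease, [round(p, 2) for p in mix])
```

Diseases whose topic mixtures in `theta` concentrate on the same topic would be read as belonging to the same disease topic. At the scale described in the abstract, an optimized library implementation would be a more practical choice than this pure-Python sampler.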

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5786305 (PMC)
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0191568 (PLOS)

Publication Analysis

Top Keywords

latent disease-gene: 8
disease-gene associations: 8
associations pubmed: 8
knowledge discovery: 8
disease-specific association: 8
association networks: 8
latent: 5
associations: 5
systematic identification: 4
identification latent: 4
