The PubMed database offers an extensive set of publication data that is useful yet inherently complex to use without automated computational techniques. Data repositories such as the Genomic Data Commons (GDC) and the Gene Expression Omnibus (GEO) offer experimental data storage and retrieval as well as curated gene expression profiles. Genetic interaction databases, including Reactome and Ingenuity Pathway Analysis, offer pathway and experiment data analysis using data curated from these publications and data repositories. We have created a method to generate and analyze consensus networks that infer potential gene interactions from large numbers of Bayesian networks generated by data mining publications in the PubMed database. Through the concept of network resolution, these consensus networks can be tailored to represent possible genetic interactions. We designed a set of experiments to confirm that our method is stable across variation in both sample and topological input sizes. Using gene product interactions from the KEGG pathway database and data mined from PubMed publication abstracts, we verify that, regardless of the network resolution or the inferred consensus network, our method is capable of inferring meaningful gene interactions through consensus Bayesian network generation with multiple, randomized topological orderings. Our method can not only confirm the existence of currently accepted interactions but also has the potential to hypothesize new ones. We show that our method confirms the existence of known gene interactions such as JAK-STAT-PI3K-AKT-mTOR, infers novel gene interactions such as RAS-Bcl-2 and RAS-AKT, and finds significant pathway-pathway interactions between the JAK-STAT signaling and Cardiac Muscle Contraction KEGG pathways.
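
The consensus step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes each Bayesian network has already been learned (for example, under a different randomized topological ordering) and is given as a set of directed (parent, child) edges. It also assumes, hypothetically, that "network resolution" acts as a cutoff on the fraction of networks in which an edge must appear; the abstract does not define the term, so this reading is only a guess. The function name `consensus_network` and the toy edge sets are illustrative, not taken from the paper.

```python
from collections import Counter


def consensus_network(networks, resolution):
    """Return directed edges found in at least `resolution` of the networks.

    networks   : list of edge sets; each edge is a (parent, child) tuple
    resolution : ASSUMPTION: a fraction in (0, 1] used as a frequency cutoff
    Returns a dict mapping each retained edge to its observed frequency.
    """
    if not networks:
        return {}
    if not 0 < resolution <= 1:
        raise ValueError("resolution must be in (0, 1]")
    # Count each edge at most once per network.
    counts = Counter(edge for net in networks for edge in set(net))
    total = len(networks)
    return {
        edge: n / total
        for edge, n in counts.items()
        if n >= resolution * total
    }


# Three hypothetical networks, standing in for structure-learning runs
# under different random orderings. These edges are toy data, not results.
networks = [
    {("JAK", "STAT"), ("PI3K", "AKT"), ("RAS", "AKT")},
    {("JAK", "STAT"), ("PI3K", "AKT")},
    {("JAK", "STAT"), ("AKT", "mTOR")},
]

strict = consensus_network(networks, resolution=1.0)  # edges in every network
loose = consensus_network(networks, resolution=0.5)   # edges in at least half
```

Under this reading, a high resolution keeps only edges that nearly every network agrees on, while a lower resolution admits less frequent edges that might correspond to candidate novel interactions.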

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5648141
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0186004

Publication Analysis

Top Keywords

gene interactions: 16
interactions: 9
data: 9
consensus bayesian: 8
bayesian networks: 8
pubmed database: 8
data repositories: 8
gene expression: 8
consensus networks: 8
data mining: 8

Similar Publications

Adeno-associated virus (AAV) is a versatile viral vector technology that can be engineered for specific functionality in vaccine and gene therapy applications. One of the major challenges in AAV production is the need for a GMP-ready platform-based approach to downstream processing, as this would lead to a standardized method for multiple products. Chromatography has huge potential in AAV purification, as it is a scalable method that would enable manufacturing to a high degree of purity, potency, and consistency.


The concept of genome-microbiome interactions, in which the microenvironment determined by host genetic polymorphisms regulates the local microbiota, is important in the pathogenesis of human disease. In otolaryngology, the resident bacterial microbiota is reportedly altered in non-infectious ear diseases, such as otitis media pearls and exudative otitis media. We hypothesized that a single-nucleotide polymorphism in the ATP-binding cassette sub-family C member 11 (ABCC11) gene, which determines earwax properties, regulates the ear canal microbiota.


Type IV pili (T4P) are important virulence factors that allow bacteria to adhere to and rapidly colonize their hosts. T4P are primarily composed of major pilins that undergo cycles of extension and retraction and minor pilins that initiate pilus assembly. Bacteriophages use T4P as receptors and exploit pilus dynamics to infect their hosts.


subsp. () possesses a -specific outer membrane protein XAC1347 (OMP) that plays a role in the expression of the type III secretion system for pathogenicity. In this study, we report that OMP is required for salt stress tolerance and cell membrane integrity, as well as for the expression of the genes for the production of extracellular polysaccharides.


Next-generation cancer phenomics, through deployment of multiple molecular endophenotypes coupled with high-throughput analyses of gene expression, offers veritable opportunities for triangulation of discovery findings in non-small cell lung cancer (NSCLC) research. This study reports differentially expressed genes in NSCLC using publicly available datasets (GSE18842 and GSE229253), uncovering 130 common genes that may potentially represent crucial molecular signatures of NSCLC. Additionally, network analyses by GeneMANIA and STRING revealed significant coexpression and interaction patterns among these genes, with four notable hub genes-, , and -identified as pivotal in NSCLC progression.

