A genetic ensemble approach for gene-gene interaction identification.

BMC Bioinformatics

School of Information Technologies, University of Sydney, NSW 2006, Australia.

Published: October 2010

Background: It is now clear that gene-gene and gene-environment interactions are ubiquitous and fundamental mechanisms in the development of complex diseases. Although considerable effort has gone into developing statistical models and algorithmic strategies for identifying such interactions, identifying them accurately has proven very challenging.

Methods: In this paper, we propose a new approach for identifying gene-gene and gene-environment interactions underlying complex diseases. The approach is a hybrid algorithm, called genetic ensemble, that combines a genetic algorithm (GA) with an ensemble of classifiers. It converts the original problem of single nucleotide polymorphism (SNP) interaction identification into a data mining problem of combinatorial feature selection. By collecting the various subsets of SNPs and environmental factors generated in multiple GA runs, patterns of gene-gene and gene-environment interactions can be extracted using a simple combinatorial ranking method. This study also considers combining the identification results obtained from multiple algorithms. A novel formula based on pairwise double fault is designed to quantify the degree of complementarity between them.

Conclusions: Our simulation study demonstrates that the proposed genetic ensemble algorithm has identification power comparable to Multifactor Dimensionality Reduction (MDR) and slightly better than Polymorphism Interaction Analysis (PIA), the two most popular methods for gene-gene interaction identification. More importantly, the identification results generated by our genetic ensemble algorithm are highly complementary to those obtained by PIA and MDR. Experimental results from our simulation studies and a real-world data application also confirm the effectiveness of the proposed genetic ensemble algorithm, as well as the potential benefits of combining identification results from different algorithms.
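The paper's own complementarity formula is not given in this abstract. As background, the standard pairwise double-fault measure on which it is said to be based can be sketched as the fraction of cases in which two methods both fail; a lower value suggests that the methods fail on different cases and are therefore more complementary. The function name and the outcome vectors below are hypothetical illustrations, not results from the study.

```python
def double_fault(correct_a, correct_b):
    """Standard pairwise double-fault measure.

    Each argument is a sequence of truthy/falsy outcomes, one per test case
    (for example, one per simulated dataset), marking whether a method
    identified the true interaction. Returns the fraction of cases in
    which both methods failed.
    """
    if len(correct_a) != len(correct_b):
        raise ValueError("outcome sequences must have the same length")
    if len(correct_a) == 0:
        raise ValueError("outcome sequences must not be empty")
    both_wrong = sum(1 for a, b in zip(correct_a, correct_b) if not a and not b)
    return both_wrong / len(correct_a)


# Hypothetical outcomes over eight simulated datasets (1 = success, 0 = failure).
method_a = [1, 1, 0, 1, 0, 1, 1, 0]
method_b = [1, 0, 1, 1, 0, 1, 0, 1]

df_ab = double_fault(method_a, method_b)
print(df_ab)
```

Here the two methods fail together on only one of eight datasets, so at least one of them succeeds on the other seven. That overlap pattern is what makes combining their results potentially useful.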

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2973963
DOI: http://dx.doi.org/10.1186/1471-2105-11-524

Publication Analysis

Top Keywords

genetic ensemble: 20
interaction identification: 12
gene-environment interactions: 12
ensemble algorithm: 12
ensemble approach: 8
gene-gene interaction: 8
identification: 8
complex diseases: 8
gene-gene gene-environment: 8
combining identification: 8

Similar Publications

Robust discrimination between closely related species of salmon based on DNA fragments.

Anal Bioanal Chem

January 2025

Statistical Engineering Division, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, MD, 20899-8980, USA.

Closely related species of Salmonidae, including Pacific and Atlantic salmon, can be distinguished from one another based on nucleotide sequences from the cytochrome c oxidase sub-unit 1 mitochondrial gene (COI), using ensembles of fragments aligned to genetic barcodes that serve as digital proxies for the relevant species. This is accomplished by exploiting both the nucleotide sequences and their quality scores recorded in a FASTQ file obtained via Next Generation (NextGen) Sequencing of mitochondrial DNA extracted from Coho salmon caught with hook and line in the Gulf of Alaska. The alignment is done using MUSCLE (Muscle 5.

Epigenetics in Learning and Memory.

Subcell Biochem

January 2025

Faculty of Medicine and Faculty of Life Sciences, Institute of Biomedical Sciences (ICB), Universidad Andres Bello, Santiago, Chile.

In animals, memory formation and recall are essential for their survival and for adaptations to a complex and often dynamically changing environment. During memory formation, experiences prompt the activation of a selected and sparse population of cells (engram cells) that undergo persistent physical and/or chemical changes allowing long-term memory formation, which can last for decades. Over the past few decades, important progress has been made on elucidating signaling mechanisms by which synaptic transmission leads to the induction of activity-dependent gene regulation programs during the different phases of learning (acquisition, consolidation, and recall).

Single nucleotide polymorphisms (SNPs) represent the prevailing form of genetic variations observed in the human population. Such variations could alter the encoded enzymes' activities. CYP3A4/5 enzymes are involved in metabolizing drugs, notably antivirals against SARS-CoV-2.

The hippocampus (HPC) has emerged as a critical player in the control of food intake, beyond its well-known role in memory. While previous studies have primarily associated the HPC with food intake inhibition, recent research suggests a role in appetitive processes. Here we identified spatially distinct neuronal populations within the dorsal HPC (dHPC) that respond to either fats or sugars, potent natural reinforcers that contribute to obesity development.

Accurate energy demand forecasting is critical for efficient energy management and planning. Recent advancements in computing power and the availability of large datasets have fueled the development of machine learning models. However, selecting the most appropriate features to enhance prediction accuracy and robustness remains a key challenge.
