Background: It has been hypothesized that multivariate analysis and systematic detection of epistatic interactions between explanatory genotyping variables may help resolve the problem of "missing heritability" currently observed in genome-wide association studies (GWAS). However, even the simplest bivariate analysis is still held back by significant statistical and computational challenges that are often addressed by reducing the set of analysed markers. Theoretically, it has been shown that combinations of loci may exist that show weak or no effects individually, but show significant (even complete) explanatory power over phenotype when combined. Reducing the set of analysed SNPs before bivariate analysis could easily omit such critical loci.
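The theoretical point about loci with no individual effect can be illustrated with a toy model. The sketch below assumes a hypothetical two-locus XOR model with binary carrier status (0/1) and equally frequent genotype combinations; the function names are illustrative and are not taken from the paper.

```python
from itertools import product

def xor_phenotype(a, b):
    """Toy model (assumption): an individual is a case iff exactly one locus is a carrier."""
    return a ^ b

def marginal_risk(locus_index):
    """P(case | genotype at one locus), averaging uniformly over the other locus."""
    risk = {}
    for g in (0, 1):
        combos = [gt for gt in product((0, 1), repeat=2) if gt[locus_index] == g]
        risk[g] = sum(xor_phenotype(a, b) for a, b in combos) / len(combos)
    return risk

def joint_risk():
    """P(case | genotypes at both loci): fully determined under the toy model."""
    return {(a, b): xor_phenotype(a, b) for a, b in product((0, 1), repeat=2)}

print(marginal_risk(0))  # {0: 0.5, 1: 0.5}: locus 1 alone is uninformative
print(marginal_risk(1))  # {0: 0.5, 1: 0.5}: locus 2 alone is uninformative
print(joint_risk())      # {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
```

Under this model, a univariate pre-filter would discard both loci even though the pair explains the phenotype completely.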

Results: We have developed an exhaustive bivariate GWAS analysis methodology that yields a manageable subset of candidate marker pairs for subsequent analysis using other, often more computationally expensive techniques. Our model-free filtering approach is based on classification using ROC curve analysis, an alternative to much slower regression-based modelling techniques. Exhaustive analysis of studies containing approximately 450,000 SNPs and 5,000 samples requires only 2 hours on a desktop CPU or 13 minutes on a GPU (Graphics Processing Unit). We validate our methodology by analysing simulated datasets as well as the seven Wellcome Trust Case-Control Consortium datasets, which represent a wide range of real-life GWAS challenges. We have identified SNP pairs that have considerably stronger association with disease than their individual component SNPs, which often show negligible effect univariately. When compared against results previously reported in the literature, our methods re-detect most significant SNP pairs and additionally detect many pairs absent from the literature that show strong association with disease. The high overlap suggests that our fast analysis could substitute for some slower alternatives.
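As an illustration of ROC-based filtering of a SNP pair, the sketch below ranks the joint genotype cells of a pair by case enrichment and integrates the resulting ROC curve with the trapezoid rule. The function name `pair_auc`, the genotype coding and the cell-ranking rule are assumptions made for this example; the abstract does not specify the exact statistics used by the authors' software.

```python
from collections import Counter

def pair_auc(snp_a, snp_b, is_case):
    """Area under the ROC curve for a classifier that ranks joint genotype cells.

    Illustrative sketch only: cells are ordered by their observed case fraction.
    """
    cases, controls = Counter(), Counter()
    for a, b, y in zip(snp_a, snp_b, is_case):
        (cases if y else controls)[(a, b)] += 1
    n_case, n_ctrl = sum(cases.values()), sum(controls.values())
    if n_case == 0 or n_ctrl == 0:
        raise ValueError("need at least one case and one control")

    # Rank observed cells from most to least case-enriched.
    cells = set(cases) | set(controls)
    ranked = sorted(cells,
                    key=lambda c: cases[c] / (cases[c] + controls[c]),
                    reverse=True)

    auc = tpr = fpr = 0.0
    for c in ranked:
        new_tpr = tpr + cases[c] / n_case
        new_fpr = fpr + controls[c] / n_ctrl
        auc += (new_fpr - fpr) * (tpr + new_tpr) / 2  # trapezoid segment
        tpr, fpr = new_tpr, new_fpr
    return auc

# Toy data following an XOR pattern (hypothetical, not from the study).
snp_a = [0, 0, 1, 1]
snp_b = [0, 1, 0, 1]
is_case = [0, 1, 1, 0]
constant = [0] * len(snp_a)  # dummy second locus, to score a single SNP

print(pair_auc(snp_a, snp_b, is_case))     # 1.0: the pair separates cases perfectly
print(pair_auc(snp_a, constant, is_case))  # 0.5: SNP A alone is no better than chance
```

An exhaustive scan would apply such a score to every pair and keep only those above a chosen cut-off for follow-up with slower methods.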

Conclusions: We demonstrate that the proposed methodology is robust, fast and capable of exhaustive search for epistatic interactions using a standard desktop computer. First, our implementation is significantly faster than the timings reported in the literature for comparable algorithms, especially as our method allows simultaneous use of multiple statistical filters with low computing time overhead. Second, for some diseases, we have identified hundreds of SNP pairs that pass formal multiple-testing (Bonferroni) correction and could form a rich source of hypotheses for follow-up analysis.
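To give a sense of scale for the Bonferroni correction and the reported run times, the following back-of-the-envelope calculation uses the approximate figures quoted in the abstract. The family-wise error rate of 0.05 is an assumption; the abstract does not state the level used.

```python
import math

n_snps = 450_000   # approximate SNP count quoted in the abstract
alpha = 0.05       # assumed family-wise error rate (not stated in the abstract)

# Number of unordered SNP pairs in an exhaustive bivariate scan.
n_pairs = math.comb(n_snps, 2)

# Bonferroni: divide the family-wise error rate by the number of tests.
bonferroni_threshold = alpha / n_pairs

# Implied throughput from the quoted run times (2 hours CPU, 13 minutes GPU).
cpu_pairs_per_second = n_pairs / (2 * 3600)
gpu_pairs_per_second = n_pairs / (13 * 60)

print(f"pairs tested:               {n_pairs:,}")
print(f"per-pair p-value threshold: {bonferroni_threshold:.2e}")
print(f"CPU throughput:             {cpu_pairs_per_second:.2e} pairs/s")
print(f"GPU throughput:             {gpu_pairs_per_second:.2e} pairs/s")
```

With roughly 10^11 pairs, a pair must reach a p-value on the order of 10^-13 to survive correction, which is why pairs passing this threshold are notable.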

Availability: A web-based version of the software used for this analysis is available at http://bioinformatics.research.nicta.com.au/gwis.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3665501
DOI: http://dx.doi.org/10.1186/1471-2164-14-S3-S10

Similar Publications

Fusarium stalk rot is the main factor reducing the quality of maize grain and leads to significant yield losses, which range from 20 to 100% depending on the degree of infection and weather conditions. Understanding its genetic mechanism is key to improving grain quality and ultimate yield. An experiment with 26 doubled haploid (DH) lines of maize was conducted in the northern part of the Lower Silesia Province in Poland over a ten-year period (2013-2022).

Genome-wide association study of rice (Oryza sativa L.) inflorescence architecture.

Plant Sci

January 2025

Department of Cell & Molecular Biology, Faculty of Life Sciences & Biotechnology, Shahid Beheshti University, Tehran, Iran.

Rice yield strongly depends on panicle size and architecture but the genetics underlying these traits and their coordination with environmental cues through various signaling pathways have remained elusive. A genome-wide association study (GWAS) was performed to pinpoint the underlying genetic determinants for rice panicle architecture by analyzing 20 panicle-related traits using a data set consisting of 44,100 SNPs. We defined QTL windows around significant SNPs by the rate of LD decay for each chromosome and used these windows to identify putative candidate genes associated with the trait.

Genetic interaction network of quantitative trait genes for heading date in rice.

J Genet Genomics

January 2025

Shanghai Key Laboratory of Plant Molecular Sciences, Shanghai Engineering Research Center of Plant Germplasm Resources, College of Life Sciences, Shanghai Normal University, Shanghai, 200234, China. Electronic address:

Several quantitative trait genes (QTGs) related to rice heading date, a key factor for crop development and yield, have been identified, along with complex interactions among genes. However, a comprehensive genetic interaction network for these QTGs has not yet been established. In this study, we use 18K-rice lines to identify QTGs and their epistatic interactions affecting rice heading date.

Epistatic hotspots organize antibody fitness landscape and boost evolvability.

Proc Natl Acad Sci U S A

January 2025

Department of Physics and Astronomy, University of California, Los Angeles, CA 90095.

The course of evolution is strongly shaped by interaction between mutations. Such epistasis can yield rugged sequence-function maps and constrain the availability of adaptive paths. While theoretical intuition is often built on global statistics of large, homogeneous model landscapes, mutagenesis measurements necessarily probe a limited neighborhood of a reference genotype.

Article Synopsis
  • Optimizing enzyme thermostability is crucial for protein science and industry, but combining multiple mutations can lead to inactivation, making traditional methods slow and inefficient.
  • Researchers developed an AI-driven method to enhance enzyme thermostability by efficiently recombining beneficial single-point mutations, using data from various mutant groups.
  • After two design rounds, the study achieved 50 combinatorial mutants with 100% success, including one exceptional mutant that significantly increased melting temperature and half-life, while also revealing complex interactions (epistasis) among mutations.