One fundamental but understudied mechanism of gene regulation in disease is allele-specific expression (ASE), the preferential expression of one allele. We leveraged RNA-sequencing data from human brain to assess ASE in autism spectrum disorder (ASD). When ASE is observed in ASD, the allele with lower population frequency (minor allele) is preferentially more highly expressed than the major allele, opposite to the canonical pattern.
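As a rough illustration of what "preferential expression of one allele" means at a single heterozygous site, the sketch below tests whether reads split evenly between the two alleles. It is not the method used in the study: the exact binomial test, the 0.5 null ratio, and the function names `allelic_ratio` and `binomial_two_sided_p` are all assumptions chosen for illustration, and the direct summation is only suitable for modest read depths.

```python
from math import comb


def allelic_ratio(minor_reads: int, major_reads: int) -> float:
    """Fraction of reads at a heterozygous site that carry the minor allele."""
    total = minor_reads + major_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return minor_reads / total


def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value for k successes in n trials.

    Sums the probabilities of all outcomes that are no more likely than
    the observed one. Assumes n is small enough (a few hundred reads)
    that the individual terms do not underflow.
    """
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small relative tolerance so ties with the observed outcome are included.
    tail = sum(q for q in probs if q <= observed * (1 + 1e-9))
    return min(1.0, tail)


if __name__ == "__main__":
    # Hypothetical counts: 30 reads with the minor allele, 10 with the major.
    minor, major = 30, 10
    print("minor-allele ratio:", allelic_ratio(minor, major))
    print("p-value vs. 50/50:", binomial_two_sided_p(minor, minor + major))
```

A ratio well above 0.5 with a small p-value would indicate that the minor allele is the more highly expressed one at that site, which is the direction the abstract reports for ASD.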
Over the past few years, genome-wide association studies have identified many trait-associated loci that have different effects on females and males, which increased attention to the genetic architecture differences between the sexes. The between-sex differences in genetic architectures can cause a variety of phenomena such as differences in the effect sizes at trait-associated loci, differences in the magnitudes of polygenic background effects, and differences in the phenotypic variances. However, current association testing approaches for dealing with sex, such as including sex as a covariate, cannot fully account for these phenomena and can be suboptimal in statistical power.
Background: Epistasis and gene-environment interactions are known to contribute significantly to variation of complex phenotypes in model organisms. However, their identification in human association studies remains challenging for myriad reasons. In the case of epistatic interactions, the large number of potential interacting sets of genes presents computational, multiple hypothesis correction, and other statistical power issues.
Motivation: There is recent interest in using gene expression data to contextualize findings from traditional genome-wide association studies (GWAS). Conditioned on a tissue, expression quantitative trait loci (eQTLs) are genetic variants associated with gene expression, and eGenes are genes whose expression levels are associated with genetic variants. eQTLs and eGenes provide great supporting evidence for GWAS hits and important insights into the regulatory pathways involved in many diseases.
Proc Natl Acad Sci U S A
September 2017
Prediction of human physical traits and demographic information from genomic data challenges privacy and data deidentification in personalized medicine. To explore the current capabilities of phenotype-based genomic identification, we applied whole-genome sequencing, detailed phenotyping, and statistical modeling to predict biometric traits in a cohort of 1,061 participants of diverse ancestry. Individually, for a large fraction of the traits, their predictive accuracy beyond ancestry and demographic information is limited.
Epilepsy has many causes and comorbidities affecting as many as 4% of people in their lifetime. Both idiopathic and symptomatic epilepsies are highly heritable, but genetic factors are difficult to characterize among humans due to complex disease etiologies. Rodent genetic studies have been critical to the discovery of seizure susceptibility loci, including mutations identified in both mouse and human cohorts.
A typical genome-wide association study tests correlation between a single phenotype and each genotype one at a time. However, single-phenotype analysis might miss unmeasured aspects of complex biological networks. Analyzing many phenotypes simultaneously may increase the power to capture these unmeasured aspects and detect more variants.
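The "one phenotype, one genotype at a time" scan described above can be sketched as a loop that correlates each variant with the phenotype independently. This is a toy illustration only: the dictionary layout, the 0/1/2 dosage coding, and the names `pearson_r` and `single_marker_scan` are assumptions, and a real analysis would add covariates, significance testing, and multiple-testing correction.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A monomorphic variant (or constant phenotype) carries no signal.
        return 0.0
    return sxy / sqrt(sxx * syy)


def single_marker_scan(genotypes, phenotype):
    """Correlate each variant with the phenotype, one variant at a time.

    genotypes: dict mapping a variant id to a list of allele dosages
               (0, 1, or 2), one entry per individual (assumed layout).
    phenotype: list of trait values in the same individual order.
    """
    return {snp: pearson_r(dosages, phenotype) for snp, dosages in genotypes.items()}


if __name__ == "__main__":
    # Hypothetical toy data for six individuals.
    genotypes = {
        "snp1": [0, 1, 2, 0, 1, 2],
        "snp2": [1, 1, 1, 1, 1, 1],
    }
    phenotype = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
    for snp, r in single_marker_scan(genotypes, phenotype).items():
        print(snp, round(r, 3))
```

Because each variant is scored against a single trait in isolation, information shared across correlated phenotypes is not used, which is the limitation the abstract points to.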
The study of the genetics of gene expression is of considerable importance to understanding the nature of common, complex diseases. The most widely applied approach to identifying relationships between genetic variation and gene expression is the expression quantitative trait loci (eQTL) approach. Here, we increased the computational power of eQTL with an alternative and complementary approach based on analyzing allele-specific expression (ASE).
Malaria transmission is dependent on the propensity of Anopheles mosquitoes to bite humans (anthropophily) instead of other dead-end hosts. Recent increases in the usage of Long Lasting Insecticide Treated Nets (LLINs) in Africa have been associated with reductions in highly anthropophilic and endophilic vectors such as Anopheles gambiae s.s.
Genome-wide association studies (GWASs) have been successful in detecting variants correlated with phenotypes of clinical interest. However, the power to detect these variants depends on the number of individuals whose phenotypes are collected, and for phenotypes that are difficult to collect, the sample size might be insufficient to achieve the desired statistical power. The phenotype of interest is often difficult to collect, whereas surrogate phenotypes or related phenotypes are easier to collect and have already been collected in very large samples.
Meta-analysis has become a popular tool for genetic association studies to combine different genetic studies. A key challenge in meta-analysis is heterogeneity, or the differences in effect sizes between studies. Heterogeneity complicates the interpretation of meta-analyses.
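One common way to quantify heterogeneity in effect sizes is Cochran's Q together with the I² statistic, computed around an inverse-variance-weighted pooled estimate. The sketch below shows that standard calculation. It is not necessarily the approach taken in the paper, and the function name `cochran_q_i2` and the example numbers are hypothetical.

```python
def cochran_q_i2(effects, std_errors):
    """Fixed-effect pooled estimate, Cochran's Q, and I^2 for a set of studies.

    effects:    per-study effect-size estimates (e.g. regression betas).
    std_errors: per-study standard errors, in the same order.
    """
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("need matching effects and standard errors for >= 2 studies")
    # Inverse-variance weights: more precise studies count for more.
    weights = [1.0 / se**2 for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, effects)) / sum(weights)
    # Q is the weighted sum of squared deviations from the pooled estimate.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, effects))
    df = len(effects) - 1
    # I^2 is the share of variation beyond what sampling error alone predicts.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, q, i2


if __name__ == "__main__":
    # Hypothetical effect sizes from three studies of the same variant.
    pooled, q, i2 = cochran_q_i2([0.10, 0.25, -0.05], [0.05, 0.06, 0.04])
    print(f"pooled={pooled:.3f}  Q={q:.2f}  I2={i2:.2f}")
```

An I² near 0 suggests the studies agree up to sampling noise, while values approaching 1 indicate that most of the observed spread reflects genuine between-study differences.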
Genetics provides a potentially powerful approach to dissect host-gut microbiota interactions. Toward this end, we profiled gut microbiota using 16S rRNA gene sequencing in a panel of 110 diverse inbred strains of mice. This panel has previously been studied for a wide range of metabolic traits and can be used for high-resolution association mapping.
Although genome-wide association studies have successfully identified thousands of risk loci for complex traits, only a handful of the biologically causal variants, responsible for association at these loci, have been successfully identified. Current statistical methods for identifying causal variants at risk loci either use the strength of the association signal in an iterative conditioning framework or estimate probabilities for variants to be causal. A main drawback of existing methods is that they rely on the simplifying assumption of a single causal variant at each risk locus, which is typically invalid at many risk loci.
Age-related hearing loss (AHL) is characterized by a symmetric sensorineural hearing loss primarily in high frequencies, and individuals have different levels of susceptibility to AHL. Heritability studies have shown that the sources of this variance are both genetic and environmental, with approximately half of the variance attributable to hereditary factors as reported by Huang and Tang (Eur Arch Otorhinolaryngol 267(8):1179-1191, 2010). Only a limited number of large-scale association studies for AHL have been undertaken in humans to date.
Identifying environmentally-specific genetic effects is a key challenge in understanding the structure of complex traits. Model organisms play a crucial role in the identification of such gene-by-environment interactions, as a result of the unique ability to observe genetically similar individuals across multiple distinct environments. Many model organism studies examine the same traits but under varying environmental conditions.
Motivation: Recently, investigators have proposed state-of-the-art identity-by-descent (IBD) mapping methods to detect IBD segments between purportedly unrelated individuals. The IBD information can then be used for association testing in genetic association studies. One approach for this IBD association testing strategy is to test for excessive IBD between pairs of cases ('pairwise method').
Applications of linear mixed models (LMMs) to problems in genomics include phenotype prediction, correction for confounding in genome-wide association studies, estimation of narrow sense heritability, and testing sets of variants (e.g., rare variants) for association.
Motivation: Approaches for testing sets of variants, such as a set of rare or common variants within a gene or pathway, for association with complex traits are important. In particular, set tests allow for aggregation of weak signal within a set, can capture interplay among variants and reduce the burden of multiple hypothesis testing. Until now, these approaches did not address confounding by family relatedness and population structure, a problem that is becoming more important as larger datasets are used to increase power.
We have developed an association-based approach using classical inbred strains of mice in which we correct for population structure, which is very extensive in mice, using an efficient mixed-model algorithm. Our approach includes inbred parental strains as well as recombinant inbred strains in order to capture loci with effect sizes typical of complex traits in mice (in the range of 5% of total trait variance). Over the last few years, we have typed the hybrid mouse diversity panel (HMDP) strains for a variety of clinical traits as well as intermediate phenotypes and have shown that the HMDP has sufficient power to map genes for highly complex traits with resolution that is in most cases less than a megabase.
Genetic studies in mouse models have played an integral role in the discovery of the mechanisms underlying many human diseases. The primary mode of discovery has been the application of linkage analysis to mouse crosses. This approach results in high power to identify regions that affect traits, but in low resolution, making it difficult to identify the precise genomic location harboring the causal variant.
The genetics of phenotypic variation in inbred mice has for nearly a century provided a primary weapon in the medical research arsenal. A catalog of the genetic variation among inbred mouse strains, however, is required to enable powerful positional cloning and association techniques. A recent whole-genome resequencing study of 15 inbred mouse strains captured a significant fraction of the genetic variation among a limited number of strains, yet the common use of hundreds of inbred strains in medical research motivates the need for a high-density variation map of a larger set of strains.
Inference of biological networks from high-throughput data is a central problem in bioinformatics. Particularly powerful for network reconstruction is data collected by recent studies that contain both genetic variation information and gene expression profiles from genetically distinct strains of an organism. Various statistical approaches have been applied to these data to tease out the underlying biological networks that govern how individual genetic variation mediates gene expression and how genes regulate and interact with each other.