Each human genome carries tens of thousands of coding variants. The extent to which this variation is functional and the mechanisms by which it exerts its influence remain largely unexplored. To address this gap, we leverage the ExAC database of 60,706 human exomes to investigate experimentally the impact of 2009 missense single nucleotide variants (SNVs) across 2185 protein-protein interactions, generating interaction profiles for 4797 SNV-interaction pairs, of which 421 SNVs segregate at > 1% allele frequency in human populations.
Genome-wide association studies (GWAS), based on testing one single nucleotide polymorphism (SNP) at a time, have revolutionized our understanding of the genetics of complex traits. In GWAS, there is a need to consider confounding effects, such as those due to population structure, and to take groups of SNPs into account simultaneously because of the "polygenic" nature of complex quantitative traits. In this paper, we propose a new approach, SGL-LMM, that combines the sparse group lasso (SGL) and the linear mixed model (LMM) for multivariate associations of quantitative traits.
Among the various statistical methods for identifying gene-gene interactions in qualitative genome-wide association studies (GWAS), gene-based methods have recently grown in popularity because they confer advantages in both statistical power and biological interpretability. However, most of these methods make strong assumptions about the form of the relationship between traits and single-nucleotide polymorphisms, which result in limited statistical power. In this paper, we propose a gene-based method based on the distance correlation coefficient called gene-based gene-gene interaction via distance correlation coefficient (GBDcor).
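For readers unfamiliar with the distance correlation coefficient mentioned above, the following is a minimal sketch of the plain sample statistic for two scalar variables. It is not the GBDcor method itself, whose gene-level construction is not described in this excerpt; the function names are hypothetical and chosen only for illustration.

```python
import math


def _centered_distances(values):
    """Return the double-centered pairwise distance matrix of a 1-D sample."""
    n = len(values)
    dist = [[abs(a - b) for b in values] for a in values]
    # The distance matrix is symmetric, so row means equal column means.
    row_means = [sum(row) / n for row in dist]
    grand_mean = sum(row_means) / n
    return [
        [dist[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation of two equal-length numeric sequences.

    Illustrative helper (an assumption, not part of GBDcor): returns a value
    in [0, 1], and 0.0 when either variable is constant.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2
    dvar2_x = sum(a[i][j] ** 2 for i in range(n) for j in range(n)) / n**2
    dvar2_y = sum(b[i][j] ** 2 for i in range(n) for j in range(n)) / n**2
    denom = math.sqrt(dvar2_x * dvar2_y)
    if denom == 0:
        return 0.0
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(dcov2, 0.0) / denom)
```

Unlike the Pearson coefficient, this statistic can pick up non-linear dependence: for x = [-2, -1, 0, 1, 2] and y = x squared, the Pearson correlation is zero while the distance correlation is clearly positive.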
Fatty acid desaturase (FADS) genes encode rate-limiting enzymes for the biosynthesis of omega-6 and omega-3 long-chain polyunsaturated fatty acids (LCPUFAs). This biosynthesis is essential for individuals subsisting on LCPUFA-poor diets (for example, plant-based). Positive selection on FADS genes has been reported in multiple populations, but its cause and pattern in Europeans remain unknown.
Earlier this year, we described an analysis of mitochondrial DNA (mtDNA) variants in myalgic encephalomyelitis (ME)/chronic fatigue syndrome (CFS) patients and healthy controls. We reported that there was no significant association of haplogroups or single nucleotide polymorphisms (SNPs) with disease status. Nevertheless, a commentary about our paper appeared (Finsterer and Zarrouk-Mahjoub.
Autoimmune diseases often share common susceptibility genes. Most genetic variants associated with susceptibility to systemic lupus erythematosus are also associated with other autoimmune diseases. The X-linked variant rs2294020 is positioned in exon 7 of the CCDC22 gene.
The advent of next-generation sequencing technology has allowed the collection of vast amounts of genetic variation data. A recurring discovery from studying larger and larger samples of individuals has been the extreme, previously unexpected, excess of very rare genetic variants, which has been shown to be mostly due to the recent explosive growth of human populations. Here, we review recent literature that inferred recent changes in population size in different human populations and with different methodologies, with many pointing to recent explosive growth, especially in European populations for which more data has been available.
Cochin Jews form a small and unique community on the Malabar coast in southwest India. While the arrival of any putative Jewish ancestors of the community has been speculated to have taken place as far back as biblical times (King Solomon's era), a Jewish community on the Malabar coast has been documented only since the 9th century CE. Here, we explore the genetic history of Cochin Jews by collecting and genotyping 21 community members and combining the data with that of 707 individuals from 72 other Indian, Jewish, and Pakistani populations, together with additional individuals from worldwide populations.
Long chain polyunsaturated fatty acids (LCPUFA) are bioactive components of membrane phospholipids and serve as substrates for signaling molecules. LCPUFA can be obtained directly from animal foods or synthesized endogenously from 18 carbon precursors via the FADS2 coded enzyme. Vegans rely almost exclusively on endogenous synthesis to generate LCPUFA and we hypothesized that an adaptive genetic polymorphism would confer advantage.
The gradual accumulation of mutations by any of a number of mutational processes is a major driving force of divergence and evolution. Here, we investigate a potentially novel mutational process that is based on the activity of members of the AID/APOBEC family of deaminases. This gene family has been recently shown to introduce, in multiple types of cancer, enzyme-induced clusters of co-occurring somatic mutations caused by cytosine deamination.
The Bene Israel Jewish community from West India is a unique population whose history before the 18th century remains largely unknown. Bene Israel members consider themselves as descendants of Jews, yet the identity of Jewish ancestors and their arrival time to India are unknown, with speculations on arrival time varying between the 8th century BCE and the 6th century CE. Here, we characterize the genetic history of Bene Israel by collecting and genotyping 18 Bene Israel individuals.
Background: Mitochondrial dysfunction has been hypothesized to occur in Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS), a disease characterized by fatigue, cognitive difficulties, pain, malaise, and exercise intolerance. We investigated whether haplogroup, single nucleotide polymorphisms (SNPs), or heteroplasmy of mitochondrial DNA (mtDNA) were associated with health status and/or symptoms.
Methods: Illumina sequencing of PCR-amplified mtDNA was performed to analyze sequence and extent of heteroplasmy of mtDNAs of 193 cases and 196 age- and gender-matched controls from DNA samples collected by the Chronic Fatigue Initiative.
An open question in the history of human migration is the identity of the earliest Eurasian populations that have left contemporary descendants. The Arabian Peninsula was the initial site of the out-of-Africa migrations that occurred between 125,000 and 60,000 yr ago, leading to the hypothesis that the first Eurasian populations were established on the Peninsula and that contemporary indigenous Arabs are direct descendants of these ancient peoples. To assess this hypothesis, we sequenced the entire genomes of 104 unrelated natives of the Arabian Peninsula at high coverage, including 56 of indigenous Arab ancestry.
In eutherian mammals, X-linked gene expression is normalized between XX females and XY males through the process of X chromosome inactivation (XCI). XCI results in silencing of transcription from one ChrX homolog per female cell. However, approximately 25% of human ChrX genes escape XCI to some extent and exhibit biallelic expression in females.
The site frequency spectrum (SFS) and other genetic summary statistics are at the heart of many population genetic studies. Previous studies have shown that human populations have undergone a recent epoch of fast growth in effective population size. These studies assumed that growth is exponential, and the ensuing models leave an excess amount of extremely rare variants.
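As a rough illustration of the summary statistic named above, the sketch below tabulates an unfolded SFS from per-site derived allele counts and compares it with the textbook constant-size neutral expectation, in which the expected number of sites with i derived copies is proportional to 1/i. The function names and input layout are assumptions made for this example and are not taken from the study.

```python
def site_frequency_spectrum(derived_counts, n_chromosomes):
    """Unfolded SFS: entry i-1 is the number of sites with i derived copies.

    derived_counts: one derived-allele count per site (hypothetical input).
    Sites that are monomorphic in the sample (count 0 or n) are skipped.
    """
    sfs = [0] * (n_chromosomes - 1)
    for count in derived_counts:
        if 0 < count < n_chromosomes:
            sfs[count - 1] += 1
    return sfs


def expected_neutral_sfs(n_segregating, n_chromosomes):
    """Expected SFS under the standard neutral, constant-size model.

    The expectation at frequency class i is proportional to 1/i; the result
    is scaled so that its entries sum to n_segregating.
    """
    harmonic = sum(1.0 / i for i in range(1, n_chromosomes))
    return [n_segregating / (i * harmonic) for i in range(1, n_chromosomes)]


def singleton_excess(derived_counts, n_chromosomes):
    """Observed singletons divided by the constant-size neutral expectation."""
    observed = site_frequency_spectrum(derived_counts, n_chromosomes)
    expected = expected_neutral_sfs(sum(observed), n_chromosomes)
    return observed[0] / expected[0] if expected[0] > 0 else float("nan")
```

A ratio from `singleton_excess` well above 1 corresponds to the kind of surplus of very rare variants that models of recent population growth are meant to explain, although this simple comparison ignores selection and sequencing error.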
Background: The composition of bacteria in and on the human body varies widely across human individuals, and has been associated with multiple health conditions. While microbial communities are influenced by environmental factors, some degree of genetic influence of the host on the microbiome is also expected. This study is part of an expanding effort to comprehensively profile the interactions between human genetic variation and the composition of this microbial ecosystem on a genome- and microbiome-wide scale.
Genome-wide association studies (GWASs) have identified hundreds of susceptibility genes, including shared associations across clinically distinct autoimmune diseases. We performed an inverse χ² meta-analysis across ten pediatric-age-of-onset autoimmune diseases (pAIDs) in a case-control study including more than 6,035 cases and 10,718 shared population-based controls. We identified 27 genome-wide significant loci associated with one or more pAIDs, mapping to in silico-replicated autoimmune-associated genes (including IL2RA) and new candidate loci with established immunoregulatory functions such as ADGRL2, TENM3, ANKRD30A, ADCY7 and CD40LG.
XWAS is a new software suite for the analysis of the X chromosome in association studies and similar genetic studies. The X chromosome plays an important role in human disease and traits of many species, especially those with sexually dimorphic characteristics. Special attention needs to be given to its analysis due to the unique inheritance pattern, which leads to analytical complications that have resulted in the majority of genome-wide association studies (GWAS) either not considering X or mishandling it with toolsets that had been designed for non-sex chromosomes.
Background: The X chromosome plays an important role in human diseases and traits. However, few X-linked associations have been reported in genome-wide association studies, partly due to analytical complications and low statistical power.
Results: In this study, we propose tests of X-linked association that capitalize on variance heterogeneity caused by various factors, predominantly the process of X-inactivation.
Whole genome analysis in large samples from a single population is needed to provide adequate power to assess relative strengths of natural selection across different functional components of the genome. In this study, we analyzed next-generation sequencing data from 962 European Americans, and found that, as expected, approximately 60% of the top 1% of positive selection signals lie in intergenic regions, 33% in intronic regions, and slightly over 1% in coding regions. Several detailed functional annotation categories in intergenic regions showed statistically significant enrichment in positively selected loci when compared to the null distribution of the genomic span of ENCODE categories.
Many complex human diseases are highly sexually dimorphic, suggesting a potential contribution of the X chromosome to disease risk. However, the X chromosome has been neglected or incorrectly analyzed in most genome-wide association studies (GWAS). We present tailored analytical methods and software that facilitate X-wide association studies (XWAS), which we further applied to reanalyze data from 16 GWAS of different autoimmune and related diseases (AID).
The age of polymorphic alleles in humans is often estimated from population genetic patterns in extant human populations, such as allele frequencies, linkage disequilibrium, and rate of mutations. Ancient DNA can improve the accuracy of such estimates, as well as facilitate testing the validity of demographic models underlying many population genetic methods. Specifically, the presence of an allele in a genome derived from an ancient sample testifies that the allele is at least as old as that sample.
While the importance of epistasis is well established, specific gene-gene interactions have rarely been identified in human genome-wide association studies (GWAS), mainly due to low power associated with such interaction tests. In this chapter, we integrate biological knowledge and human GWAS data to reveal epistatic interactions underlying quantitative lipid traits, which are major risk factors for coronary artery disease. To increase power to detect interactions, we only tested pairs of SNPs filtered by prior biological knowledge, including GWAS results, protein-protein interactions (PPIs), and pathway information.