Background: Single-nucleotide variants (SNVs) within gene coding sequences can significantly impact pre-mRNA splicing, bearing profound implications for pathogenic mechanisms and precision medicine. In this study, we aim to harness the well-established full-length gene splicing assay (FLGSA) in conjunction with SpliceAI to prospectively interpret the splicing effects of all potential coding SNVs within the four-exon SPINK1 gene, a gene associated with chronic pancreatitis.
Results: Our study began with a retrospective analysis of 27 SPINK1 coding SNVs previously assessed using FLGSA, proceeded with a prospective analysis of 35 new FLGSA-tested SPINK1 coding SNVs, followed by data extrapolation, and ended with further validation.
Current software packages for the analysis and simulation of rare variants are available only for binary and continuous traits. Ravages provides, in a single R package, tools to perform rare variant association tests for multicategory, binary and continuous phenotypes, to simulate datasets under different scenarios and to compute statistical power. Association tests can be run genome-wide thanks to the C++ implementation of most functions, using either RAVA-FIRST, a recently developed strategy to filter and analyse genome-wide rare variants, or user-defined candidate regions.
Rare variant association tests (RVAT) have been developed to study the contribution of rare variants, which are now widely accessible through high-throughput sequencing technologies. RVAT require aggregating rare variants into testing units and filtering variants to retain only the most likely causal ones. In the exome, genes are natural testing units and variants are usually filtered based on their functional consequences.
Next-generation sequencing technologies have opened up the possibility to sequence large samples of cases and controls to test for association with rare variants. To limit cost and increase sample sizes, data from controls could be used in multiple studies and might thus be generated on different sequencing platforms. This could pose some problems of comparability between cases and controls due to batch effects that could be confounding factors, leading to false-positive association signals.
Objective: The majority of patients with a familial cerebral small vessel disease (CSVD) referred for molecular screening do not show pathogenic variants in known genes. In this study, we aimed to identify novel CSVD causal genes.
Methods: We performed a gene-based collapsing test of rare protein-truncating variants identified in exome data of 258 unrelated CSVD patients, of an ethnically matched control cohort, and of 2 publicly available large-scale databases, gnomAD and TOPMed.
Rare genetic variants are expected to play an important role in disease, and several statistical methods have been developed to test for disease association with rare variants, including variance-component tests. These tests, however, deal only with binary or continuous phenotypes, so they cannot take advantage of a suspected heterogeneity between subgroups of patients. To address this issue, we extended the popular rare-variant association test SKAT to compare more than two groups of individuals.
Obesity is genetically heterogeneous with monogenic and complex polygenic forms. Using exome and targeted sequencing in 2,737 severely obese cases and 6,704 controls, we identified three genes (PHIP, DGKI, and ZMYM4) with an excess burden of very rare predicted deleterious variants in cases. In cells, we found that nuclear PHIP (pleckstrin homology domain interacting protein) directly enhances transcription of pro-opiomelanocortin (POMC), a neuropeptide that suppresses appetite.
Background: Systemic lupus erythematosus (SLE) is a rare immunological disorder and genetic factors are considered important in its causation. Monogenic lupus has been associated with around 30 genotypes in humans and 60 in mice, while genome-wide association studies have identified more than 90 risk loci. We aimed to analyse the contribution of rare and predicted pathogenic gene variants in a population of unselected cases of childhood-onset SLE.
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
Genetic association studies have provided new insights into the genetic variability of human complex traits with a focus mainly on continuous or binary traits. Methods have been proposed to take into account disease heterogeneity between subgroups of patients when studying common variants but none was specifically designed for rare variants. Because rare variants are expected to have stronger effects and to be more heterogeneously distributed among cases than common ones, subgroup analyses might be particularly attractive in this context.
Body-fat distribution is a risk factor for adverse cardiovascular health consequences. We analyzed the association of body-fat distribution, assessed by waist-to-hip ratio adjusted for body mass index, with 228,985 predicted coding and splice site variants available on exome arrays in up to 344,369 individuals from five major ancestries (discovery) and 132,177 European-ancestry individuals (validation). We identified 15 common (minor allele frequency, MAF ≥5%) and nine low-frequency or rare (MAF <5%) novel coding variants.
In the version of this article originally published, one of the two authors with the name Wei Zhao was omitted from the author list and the affiliations for both authors were assigned to the single Wei Zhao in the author list. In addition, the ORCID for Wei Zhao (Department of Biostatistics and Epidemiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA) was incorrectly assigned to author Wei Zhou. The errors have been corrected in the HTML and PDF versions of the article.
In the published version of this paper, the name of author Emanuele Di Angelantonio was misspelled. This error has now been corrected in the HTML and PDF versions of the article.
Genome-wide association studies (GWAS) have identified >250 loci for body mass index (BMI), implicating pathways related to neuronal biology. Most GWAS loci represent clusters of common, noncoding variants from which pinpointing causal genes remains challenging. Here we combined data from 718,734 individuals to discover rare and low-frequency (minor allele frequency (MAF) < 5%) coding variants associated with BMI.
Glycaemic traits such as fasting and post-challenge glucose and insulin measures, as well as glycated haemoglobin (HbA1c), are used to diagnose and monitor diabetes. These traits are risk factors for cardiovascular disease even below the diabetic threshold, and their study can additionally yield insights into the pathophysiology of type 2 diabetes. To date, a diverse set of genetic approaches have led to the discovery of over 97 loci influencing glycaemic traits.
Obesity is a genetically heterogeneous disorder. Using targeted and whole-exome sequencing, we studied 32 human and 87 rodent obesity genes in 2,548 severely obese children and 1,117 controls. We identified 52 variants contributing to obesity in 2% of cases including multiple novel variants in GNAS, which were sometimes found with accelerated growth rather than short stature as described previously.
Height is a highly heritable, classic polygenic trait with approximately 700 common associated variants identified through genome-wide association studies so far. Here, we report 83 height-associated coding variants with lower minor-allele frequencies (in the range of 0.1-4.
Regulatory authorities have indicated that new drugs to treat type 2 diabetes (T2D) should not be associated with an unacceptable increase in cardiovascular risk. Human genetics may be able to guide development of antidiabetic therapies by predicting cardiovascular and other health endpoints. We therefore investigated the association of variants in six genes that encode drug targets for obesity or T2D with a range of metabolic traits in up to 11,806 individuals by targeted exome sequencing and follow-up in 39,979 individuals by targeted genotyping, with additional in silico follow-up in consortia.
Mosaic loss of chromosome Y (mLOY) leading to gonosomal XY/XO commonly occurs during aging, particularly in smokers. We investigated whether mLOY was associated with non-hematological cancer in three prospective cohorts (8,679 cancer cases and 5,110 cancer-free controls) and genetic susceptibility to mLOY. Overall, mLOY was observed in 7% of men, and its prevalence increased with age (per-year odds ratio (OR) = 1.
The continuous advancement in genotyping technology has not been accompanied by the application of innovative statistical methods, such as multi-marker methods (MMM), to unravel genetic associations with complex traits. Although the performance of MMM has been widely explored in a prediction context, little is known about their behavior in quantitative trait loci (QTL) detection under complex genetic architectures. We shed light on this still open question by applying Bayes A (BA) and Bayesian LASSO (BL) to simulated and real data.