Publications by Kevin Caye

Sparse latent factor regression models for genome-wide and epigenome-wide association studies.

Basile Jumentier, Kevin Caye, Barbara Heude, Johanna Lepeule, Olivier François

Stat Appl Genet Mol Biol

March 2022

Article Synopsis

The article addresses the difficulty of linking phenotypes or exposures to genomic and epigenomic data when unobserved confounding is present, and introduces penalized latent factor regression models to deal with it.
These models use penalties to handle high-dimensional data and are shown to improve statistical performance, with sparse latent factor regression in particular comparing favorably to other methods.
The authors applied the models to a study of a flowering trait in plants and to a study of smoking status in pregnant women, obtaining results consistent with previous findings and identifying new relevant genes.

LFMM 2: Fast and Accurate Inference of Gene-Environment Associations in Genome-Wide Studies.

Kevin Caye, Basile Jumentier, Johanna Lepeule, Olivier François

Mol Biol Evol

April 2019

Gene-environment association (GEA) studies are essential to understand the past and ongoing adaptations of organisms to their environment, but those studies are complicated by confounding due to unobserved demographic factors. Although the confounding problem has recently received considerable attention, the proposed approaches do not scale with the high-dimensionality of genomic data. Here, we present a new estimation method for latent factor mixed models (LFMMs) implemented in an upgraded version of the corresponding computer program.

Naturalgwas: An R package for evaluating genomewide association methods with empirical data.

Olivier François, Kevin Caye

Mol Ecol Resour

July 2018

Association studies of polygenic traits are notoriously difficult when those studies are conducted at large geographic scales. The difficulty arises as genotype frequencies often vary in geographic space and across distinct environments. Those large-scale variations are known to yield false positives in standard association testing approaches.

Identifying outlier loci in admixed and in continuous populations using ancestral population differentiation statistics.

Helena Martins, Kevin Caye, Keurcien Luu, Michael G B Blum, Olivier François

Mol Ecol

October 2016

Finding genetic signatures of local adaptation is of great interest for many population genetic studies. Common approaches to sorting selective loci from their genomic background focus on the extreme values of the fixation index, FST, across loci. However, the computation of the fixation index becomes challenging when the population is genetically continuous, when predefining subpopulations is a difficult task, and in the presence of admixed individuals in the sample.

Controlling false discoveries in genome scans for selection.

Olivier François, Helena Martins, Kevin Caye, Sean D Schoville

Mol Ecol

January 2016

Population differentiation (PD) and ecological association (EA) tests have recently emerged as prominent statistical methods to investigate signatures of local adaptation using population genomic data. Based on statistical models, these genomewide testing procedures have attracted considerable attention as tools to identify loci potentially targeted by natural selection. An important issue with PD and EA tests is that incorrect model specification can generate large numbers of false-positive associations.

TESS3: fast inference of spatial population structure and genome scans for selection.

Kevin Caye, Timo M Deist, Helena Martins, Olivier Michel, Olivier François

Mol Ecol Resour

March 2016

Geography and landscape are important determinants of genetic variation in natural populations, and several ancestry estimation methods have been proposed to investigate population structure using genetic and geographic data simultaneously. Those approaches are often based on computer-intensive stochastic simulations and do not scale with the dimensions of the data sets generated by high-throughput sequencing technologies. There is a growing demand for faster algorithms able to analyse genomewide patterns of population genetic variation in their geographic context.
