Background: Female fertility is an important trait in dairy cattle. Identifying putative causal variants associated with fertility may help to improve the accuracy of genomic prediction of fertility. Combining expression data (eQTL) of genes, exons, gene splicing and allele specific expression is a promising approach to fine map QTL to get closer to the causal mutations.
The integration of genomic prediction with crop growth models enabled the estimation of missing environmental variables which improved the prediction accuracy of grain yield. Since the invention of whole-genome prediction (WGP) more than two decades ago, breeding programmes have established extensive reference populations that are cultivated under diverse environmental conditions. The introduction of the CGM-WGP model, which integrates crop growth models (CGM) with WGP, has expanded the applications of WGP to the prediction of unphenotyped traits in untested environments, including future climates.
Background: Meta-analysis describes a category of statistical methods that aim at combining the results of multiple studies to increase statistical power by exploiting summary statistics. Different industries that use genomic prediction do not share their raw data due to logistic or privacy restrictions, which can limit the size of their reference populations and create a need for a practical meta-analysis method.
Results: We developed a meta-analysis, named MetaGS, that duplicates the results of multi-trait best linear unbiased prediction (mBLUP) analysis without accessing raw data.
Background: Whole-genome sequence (WGS) data are increasingly available on large numbers of individuals in animal and plant breeding and in human genetics through second-generation resequencing technologies, 1000 genomes projects, and large-scale genotype imputation from lower marker densities. Here, we present a computationally fast implementation of a variable selection genomic prediction method that could handle WGS data on more than 35,000 individuals, test its accuracy for across-breed predictions, and assess its quantitative trait locus (QTL) mapping precision.
Methods: The Monte Carlo Markov chain (MCMC) variable selection model (Bayes GC) fits simultaneously a genomic best linear unbiased prediction (GBLUP) term, i.
Background: Sequence-based genome-wide association studies (GWAS) provide high statistical power to identify candidate causal mutations when a large number of individuals with both sequence variant genotypes and phenotypes is available. A meta-analysis combines summary statistics from multiple GWAS and increases the power to detect trait-associated variants without requiring access to data at the individual level of the GWAS mapping cohorts. Because linkage disequilibrium between adjacent markers is conserved only over short distances across breeds, a multi-breed meta-analysis can improve mapping precision.
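To make the idea of combining GWAS summary statistics concrete, here is a minimal sketch of a standard fixed-effect, inverse-variance-weighted meta-analysis for a single variant. It is a textbook technique and not necessarily the method used in the study above. The function name `ivw_meta` and the example numbers are hypothetical, and the sketch assumes every cohort reports the effect for the same allele.

```python
import math


def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance-weighted meta-analysis for one variant.

    betas: per-cohort effect estimates (same allele coding in every cohort)
    ses:   per-cohort standard errors of those estimates
    Returns (pooled_beta, pooled_se, z_score).
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and of equal length")
    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se * se) for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, pooled_beta / pooled_se


# Hypothetical summary statistics for one variant in three cohorts.
pooled_beta, pooled_se, z_score = ivw_meta([0.10, 0.14, 0.08], [0.05, 0.07, 0.04])
print(pooled_beta, pooled_se, z_score)
```

The pooled standard error is smaller than that of any single cohort, which is the source of the gain in power; only the per-cohort estimates and standard errors are needed, not the individual-level data.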
Background: Two distinct populations have been extensively studied in Atlantic cod (Gadus morhua L.): the Northeast Arctic cod (NEAC) population and the coastal cod (CC) population. The objectives of the current study were to identify genomic islands of divergence and to propose an approach to quantify the strength of selection pressures using whole-genome single nucleotide polymorphism (SNP) data.
Background: Topological association domains (TADs) are chromosomal domains characterised by frequent internal DNA-DNA interactions. The transcription factor CTCF binds to conserved DNA sequence patterns called CTCF binding motifs to either prohibit or facilitate chromosomal interactions. TADs and CTCF binding motifs control gene expression, but they are not yet well defined in the bovine genome.
Stature is affected by many polymorphisms of small effect in humans. In contrast, variation in dogs, even within breeds, has been suggested to be largely due to variants in a small number of genes. Here we use data from cattle to compare the genetic architecture of stature to that in humans and dogs.
Background: The increasing availability of whole-genome sequence data is expected to increase the accuracy of genomic prediction. However, results from simulation studies and analysis of real data do not always show an increase in accuracy from sequence data compared to high-density (HD) single nucleotide polymorphism (SNP) chip genotypes. In addition, the sheer number of variants makes analysis of all variants and accurate estimation of all effects computationally challenging.
Background: The apparent effect of a single nucleotide polymorphism (SNP) on phenotype depends on the linkage disequilibrium (LD) between the SNP and a quantitative trait locus (QTL). However, the phase of LD between a SNP and a QTL may differ between Bos indicus and Bos taurus because they diverged at least one hundred thousand years ago. Here, we test the hypothesis that the apparent effect of a SNP on a quantitative trait depends on whether the SNP allele is inherited from a Bos taurus or Bos indicus ancestor.
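The role of LD phase can be illustrated with a small sketch that computes the LD coefficient D and the correlation r between a SNP and a QTL from phased haplotypes. The function `ld_stats` and both toy haplotype samples are hypothetical illustrations, not data or code from the study. The point is only that the sign of r (the phase) can flip between populations, which would reverse the apparent effect of the SNP allele.

```python
import math


def ld_stats(haplotypes):
    """Return (D, r) for two biallelic loci.

    haplotypes: list of (snp_allele, qtl_allele) tuples, alleles coded 0 or 1.
    D = p_11 - p_snp * p_qtl
    r = D / sqrt(p_snp * (1 - p_snp) * p_qtl * (1 - p_qtl))
    """
    n = len(haplotypes)
    p_snp = sum(s for s, _ in haplotypes) / n
    p_qtl = sum(q for _, q in haplotypes) / n
    p_11 = sum(1 for s, q in haplotypes if s == 1 and q == 1) / n
    d = p_11 - p_snp * p_qtl
    denom = math.sqrt(p_snp * (1 - p_snp) * p_qtl * (1 - p_qtl))
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d, d / denom


# Hypothetical samples: SNP allele 1 mostly travels with QTL allele 1 in the
# first population and with QTL allele 0 in the second.
population_a = [(1, 1)] * 4 + [(0, 0)] * 4 + [(1, 0)] + [(0, 1)]
population_b = [(1, 0)] * 4 + [(0, 1)] * 4 + [(1, 1)] + [(0, 0)]

d_a, r_a = ld_stats(population_a)
d_b, r_b = ld_stats(population_b)
print(r_a, r_b)  # same magnitude, opposite sign
```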
Whole-genome sequence is potentially the richest source of genetic data for inferring ancestral demography. However, full sequence also presents significant challenges to fully utilize such large data sets and to ensure that sequencing errors do not introduce bias into the inferred demography. Using whole-genome sequence data from two Holstein cattle, we demonstrate a new method to correct for bias caused by hidden errors and then infer stepwise changes in ancestral demography up to present.
Three recent breakthroughs have resulted in the current widespread use of DNA information: the genomic selection (GS) methodology, which is a form of marker-assisted selection on a genome-wide scale; the discovery of large numbers of single-nucleotide markers; and cost-effective methods to genotype them. GS estimates the effect of thousands of DNA markers simultaneously. Nonlinear estimation methods yield higher accuracy, especially for traits with major genes.
The genetic architecture of complex traits in cattle includes very large numbers of loci affecting any given trait. Most of these loci have small effects but occasionally there are loci with moderate-to-large effects segregating due to recent selection for the mutant allele. Genomic markers capture most but not all of the additive genetic variance for traits, probably because there are causal mutations with low allele frequency and therefore in incomplete linkage disequilibrium with the markers.
Related individuals share potentially long chromosome segments that trace to a common ancestor. We describe a phasing algorithm (ChromoPhase) that utilizes this characteristic of finite populations to phase large sections of a chromosome. In addition to phasing, our method imputes missing genotypes in individuals genotyped at lower marker density when more densely genotyped relatives are available.
Results from genome-wide association studies in livestock and humans have led to the conclusion that the effects of individual quantitative trait loci (QTL) on complex traits, such as yield, are likely to be small; therefore, a large number of QTL are necessary to explain genetic variation in these traits. Given this genetic architecture, gains from marker-assisted selection (MAS) programs using only a small number of DNA markers to trace a limited number of QTL are likely to be small. This has led to the development of an alternative technology for using the available dense single nucleotide polymorphism (SNP) information, called genomic selection.
Prediction of genetic merit using dense SNP genotypes can be used for estimation of breeding values for selection of livestock, crops, and forage species; for prediction of disease risk; and for forensics. The accuracy of these genomic predictions depends in part on the genetic architecture of the trait, in particular the number of loci affecting the trait and the distribution of their effects. Here we investigate the difference among three traits in distribution of effects and the consequences for the accuracy of genomic predictions.
The increased transcription of the Cyp6g1 gene of Drosophila melanogaster, and consequent resistance to insecticides such as DDT, is a widely cited example of adaptation mediated by cis-regulatory change. A fragment of an Accord transposable element inserted upstream of the Cyp6g1 gene is causally associated with resistance and has spread to high frequencies in populations around the world since the 1940s. Here we report the existence of a natural allelic series at this locus of D.
A novel method, called linkage disequilibrium multilocus iterative peeling (LDMIP), for the imputation of phase and missing genotypes is developed. LDMIP performs an iterative peeling step for every locus, which accounts for the family data, and uses a forward-backward algorithm to accumulate information across loci. Marker similarity between haplotype pairs is used to impute possible missing genotypes and phases, which relies on the linkage disequilibrium between closely linked markers.
Whole-genome resequencing technology has improved rapidly during recent years and is expected to improve further such that the sequencing of an entire human genome sequence for $1000 is within reach. Our main aim here is to use whole-genome sequence data for the prediction of genetic values of individuals for complex traits and to explore the accuracy of such predictions. This is relevant for the fields of plant and animal breeding and, in human genetics, for the prediction of an individual's risk for complex diseases.
We used a least absolute shrinkage and selection operator (LASSO) approach to estimate marker effects for genomic selection. The least angle regression (LARS) algorithm and cross-validation were used to define the best subset of markers to include in the model. The LASSO-LARS approach was tested on two data sets: a simulated data set with 5865 individuals and 6000 Single Nucleotide Polymorphisms (SNPs); and a mouse data set with 1885 individuals genotyped for 10 656 SNPs and phenotyped for a number of quantitative traits.
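As an illustration of how a LASSO penalty selects a subset of markers, the sketch below fits LASSO marker effects by cyclic coordinate descent. This is a deliberately swapped-in solver: it is not the LARS algorithm used in the study, and the penalty `lam` is fixed by hand here rather than chosen by cross-validation. The function names and the tiny orthogonal marker matrix are hypothetical.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; return 0.0 if |value| <= lam."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=100):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent.

    X: list of rows (individuals) of centred marker codes; y: centred phenotypes.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residuals y - X beta, with beta starting at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # monomorphic marker carries no information
            # Covariance of marker j with the residual that excludes marker j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


# Hypothetical toy data: 4 individuals, 3 mutually orthogonal centred markers.
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
y = [2.05, 1.95, -1.95, -2.05]  # marker 0 has a large effect, marker 1 a tiny one
effects = lasso_cd(X, y, lam=0.1)
print(effects)
```

With this penalty, the large effect of the first marker is shrunk but retained, while the two markers with little or no signal are set exactly to zero, which is how the method defines a subset of markers.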
Background: Two key findings from genomic selection experiments are 1) the reference population used must be very large to subsequently predict accurate genomic estimated breeding values (GEBV), and 2) prediction equations derived in one breed do not predict accurate GEBV when applied to other breeds. Both findings are a problem for breeds where the number of individuals in the reference population is limited. A multi-breed reference population is a potential solution, and here we investigate the accuracies of GEBV in Holstein dairy cattle and Jersey dairy cattle when the reference population is single breed or multi-breed.
Continued production of food in areas predicted to be most affected by climate change, such as dairy farming regions of Australia, will be a major challenge in coming decades. Along with rising temperatures and water shortages, scarcity of inputs such as high energy feeds is predicted. With the motivation of selecting cattle adapted to these changing environments, we conducted a genome wide association study to detect DNA markers (single nucleotide polymorphisms) associated with the sensitivity of milk production to environmental conditions.
Background: Identifying recent positive selection signatures in domesticated animals could provide information on genome response to strong directional selection from domestication and artificial selection. With the completion of the cattle genome, private companies are now providing large numbers of polymorphic markers for probing variation in domestic cattle (Bos taurus). We analysed over 7,500 polymorphic single nucleotide polymorphisms (SNP) in beef (Angus) and dairy (Holstein) cattle and outgroup species Bison, Yak and Banteng in an indirect test of inbreeding and positive selection in domestic cattle.
Background: If mutation within the coding region of the genome is largely not adaptive, the ratio of nonsynonymous (dN) to synonymous substitutions (dS) per site (dN/dS) should be approximately equal among closely related species. Furthermore, dN/dS in divergence between species should be equivalent to dN/dS in polymorphisms. This hypothesis is of particular interest in closely related members of the Bovini tribe, because domestication has promoted rapid phenotypic divergence through strong artificial selection of some species while others remain undomesticated.
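The comparison described above can be sketched as a simple calculation from counts. This is a minimal illustration that assumes raw counts of changes and sites with no correction for multiple substitutions at a site; the function `dn_ds` and all counts are hypothetical and are not results from the study.

```python
def dn_ds(nonsyn_changes, syn_changes, nonsyn_sites, syn_sites):
    """Return dN/dS from raw counts.

    dN = nonsynonymous changes per nonsynonymous site
    dS = synonymous changes per synonymous site
    """
    if syn_changes == 0:
        raise ValueError("dS is zero, so dN/dS is undefined")
    dn = nonsyn_changes / nonsyn_sites
    ds = syn_changes / syn_sites
    return dn / ds


# Hypothetical counts over the same set of coding sites.
NONSYN_SITES, SYN_SITES = 3000, 1000
divergence_ratio = dn_ds(30, 90, NONSYN_SITES, SYN_SITES)    # fixed differences between species
polymorphism_ratio = dn_ds(20, 40, NONSYN_SITES, SYN_SITES)  # variants segregating within a species

# Under the hypothesis of largely non-adaptive change the two ratios should be similar.
print(divergence_ratio, polymorphism_ratio, divergence_ratio / polymorphism_ratio)
```

A quotient of the two ratios far from 1 would be one sign that divergence and polymorphism are not behaving equivalently, although a formal test would be needed before drawing that conclusion.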