Background: Genotype imputation is a cost-efficient alternative to the use of high-density genotypes for implementing genomic selection. The objective of this study was to investigate variables affecting imputation accuracy from low-density tagSNP sets in swine (average distance between tagSNP from 100 kb to 1 Mb), selected using LD information, physical location, or accuracy for genotype imputation. We compared imputation accuracy across several low-density tagSNP sets of varying density, each selected using one of these three methods. In addition, we assessed the effect of varying the size and composition of the reference panel of haplotypes used for imputation.
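
As an illustration of selecting tagSNP by physical location, the following is a minimal sketch that picks the SNP nearest to each point of an evenly spaced grid along one chromosome. The abstract does not describe the selection procedure the study used, so the function name `select_evenly_spaced`, its parameters, and the nearest-to-grid rule are all assumptions made for this example.

```python
from bisect import bisect_left


def select_evenly_spaced(positions, spacing_bp):
    """Hypothetical helper: for each grid point spaced `spacing_bp` apart,
    keep the SNP whose base-pair position is closest to that grid point."""
    positions = sorted(positions)
    if not positions:
        return []
    chosen = []
    target = positions[0]
    while target <= positions[-1]:
        i = bisect_left(positions, target)
        # Compare the SNP at or after the target with the one just before it.
        candidates = positions[max(i - 1, 0):i + 1]
        best = min(candidates, key=lambda p: abs(p - target))
        # Avoid selecting the same SNP twice when SNP are sparse.
        if not chosen or best != chosen[-1]:
            chosen.append(best)
        target += spacing_bp
    return chosen


# Toy positions in base pairs (illustrative values, not data from the study).
toy_positions = [0, 90, 210, 300, 390, 610]
toy_selection = select_evenly_spaced(toy_positions, spacing_bp=200)
```

With a real marker map, `spacing_bp` would be set to the target average distance, for example 340000 for one tagSNP per 340 kb.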

Results: A density of at least 1 tagSNP per 340 kb (~7000 tagSNP), selected using pairwise LD information, was necessary to achieve an average imputation accuracy higher than 0.95. A commercial low-density (9K) tagSNP set for swine was developed concurrently with this study, and its average imputation accuracy was estimated at 0.951. Construction of a haplotype reference panel was most efficient when the haplotypes were obtained from randomly sampled individuals. Increasing the size of the original reference haplotype panel (128 haplotypes sampled from 32 sire/dam/offspring trios phased in a previous study) increased overall imputation accuracy (IA = 0.97 with 512 haplotypes), and was especially useful for SNP with MAF below 0.1 and for SNP located at the chromosomal extremes (within 5% of a chromosome end).
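
Two quick checks on the figures above: 7000 tagSNP at one per 340 kb span roughly 7000 × 340 kb ≈ 2.38 Gb, and 128 haplotypes from 32 trios is consistent with four parental haplotypes per trio (32 × 2 parents × 2). The abstract does not say how imputation accuracy was computed, so the sketch below assumes one common choice, the Pearson correlation between true and imputed allele counts (0/1/2) per SNP, averaged separately for SNP below and at or above a MAF threshold. All function names are hypothetical.

```python
from statistics import mean


def maf(genotypes):
    """Minor allele frequency from a list of 0/1/2 allele counts."""
    freq = sum(genotypes) / (2 * len(genotypes))
    return min(freq, 1 - freq)


def pearson(x, y):
    """Pearson correlation; returns NaN if either input has no variance."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / (sxx * syy) ** 0.5


def accuracy_by_maf(true_by_snp, imputed_by_snp, threshold=0.1):
    """Average per-SNP correlation, split at the MAF threshold (assumed metric)."""
    low, high = [], []
    for snp, true in true_by_snp.items():
        r = pearson(true, imputed_by_snp[snp])
        if r != r:  # skip NaN from monomorphic SNP
            continue
        (low if maf(true) < threshold else high).append(r)
    return {
        "maf_below": mean(low) if low else None,
        "maf_at_or_above": mean(high) if high else None,
    }


# Toy genotypes (illustrative only, not data from the study).
true_genotypes = {"snp_a": [0, 1, 2, 1], "snp_b": [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]}
imputed_genotypes = {"snp_a": [0, 1, 2, 1], "snp_b": [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]}
summary = accuracy_by_maf(true_genotypes, imputed_genotypes)
```

Splitting the average by MAF class mirrors the comparison reported above, where low-MAF SNP benefited most from a larger reference panel.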

Conclusion: The new commercially available 9K tagSNP set can be used to obtain imputed genotypes with high accuracy, even when imputation is based on a comparatively small panel of reference haplotypes (128 haplotypes). Average imputation accuracy can be further increased by adding haplotypes to the reference panel. In addition, our results show that randomly sampling individuals to genotype for the construction of a reference haplotype panel is more cost-efficient than specifically sampling older animals or trios, with no observed loss in imputation accuracy. Based on the observed accuracy and on reported results in dairy cattle, where genomic evaluation of some individuals is based on genotypes imputed with the same accuracy as in our Yorkshire population, we expect that the use of imputed genotypes in swine breeding will yield highly accurate predictions of GEBV.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3734000
DOI: http://dx.doi.org/10.1186/1471-2156-14-8

Publication Analysis

Top Keywords

imputation accuracy: 32
density tagsnp: 16
accuracy: 13
imputation: 12
low density: 12
reference panel: 12
tagsnp: 9
variables imputation: 8
genotype imputation: 8
cost efficient: 8

Similar Publications

Recently, deep latent variable models have made significant progress in dealing with missing data problems, benefiting from their ability to capture intricate and non-linear relationships within the data. In this work, we further investigate the potential of Variational Autoencoders (VAEs) in addressing the uncertainty associated with missing data via a multiple importance sampling strategy. We propose a Missing data Multiple Importance Sampling Variational Auto-Encoder (MMISVAE) method to effectively model incomplete data.

Genomic Landscape and Prediction of Udder Traits in Saanen Dairy Goats.

Animals (Basel)

January 2025

Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling 712100, China.

Goats are essential to the dairy industry in Shaanxi, China, with udder traits playing a critical role in determining milk production and economic value for breeding programs. However, the direct measurement of these traits in dairy goats is challenging and resource-intensive. This study leveraged genotyping imputation to explore the genetic parameters and architecture of udder traits and assess the efficiency of genomic prediction methods.

Machine-learning-assisted Preoperative Prediction of Pediatric Appendicitis Severity.

J Pediatr Surg

January 2025

McGill University Faculty of Medicine and Health Sciences, Canada; Harvey E. Beardmore Division of Pediatric Surgery, The Montreal Children's Hospital, McGill University Health Centre, Montreal, Qc, Canada.

Purpose: This study evaluates the effectiveness of machine learning (ML) algorithms for improving the preoperative diagnosis of acute appendicitis in children, focusing on the accurate prediction of the severity of disease.

Methods: An anonymized clinical and operative dataset was retrieved from the medical records of children undergoing emergency appendectomy between 2014 and 2021. We developed an ML pipeline that pre-processed the dataset and developed algorithms to predict 5 appendicitis grades (1 - non-perforated, 2 - localized perforation, 3 - abscess, 4 - generalized peritonitis, and 5 - generalized peritonitis with abscess).

Single-cell RNA sequencing (scRNA-seq) is a cutting-edge technique in molecular biology and genomics, revealing the cellular heterogeneity. However, scRNA-seq data often suffer from dropout events, meaning that certain genes exhibit very low or even zero expression levels due to technical limitations. Existing imputation methods for dropout events lack comprehensive evaluations in downstream analyses and do not demonstrate robustness across various scenarios.

Using feedback in pooled experiments augmented with imputation for high genotyping accuracy at reduced cost.

G3 (Bethesda)

January 2025

Division of Scientific Computing, Department of Information Technology, Uppsala University, SE-751 05 Uppsala, Sweden.

Conducting genomic selection in plant breeding programs can substantially speed up the development of new varieties. Genomic selection provides more reliable insights when it is based on dense marker data, in which the rare variants can be particularly informative. Despite the availability of new technologies, the cost of large-scale genotyping remains a major limitation to the implementation of genomic selection.
