Background: Very low-coverage (0.1 to 1×) whole genome sequencing (WGS) has become a promising and affordable approach for discovering genomic variants in human populations for genome-wide association studies (GWAS). To support genetic screening using preimplantation genetic testing (PGT) in a large population, sequencing coverage drops below 0.1× to an ultra-low level. However, the feasibility and effectiveness of ultra-low-coverage WGS (ulcWGS) for GWAS remain undetermined.

Methods: We built a pipeline to analyze ulcWGS data for GWAS. To examine its effectiveness, we benchmarked the accuracy of genotype imputation across combinations of coverages below 0.1× and sample sizes from 2000 to 16,000, using 17,844 embryo PGT samples with approximately 0.04× average coverage and the standard Chinese sample HG005 with known genotypes. We then applied the imputed genotypes of 1744 transferred embryos with recorded gestational ages and complete follow-up records to GWAS.
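
As a rough illustration of what benchmarking imputation against a sample with known genotypes can involve, the sketch below computes two commonly used accuracy measures: hard-call concordance and the squared correlation between true genotypes and imputed dosages. The abstract does not say which metrics the authors used, so the choice of metrics, the function names (`concordance`, `dosage_r2`), and the toy data are all assumptions made for illustration only.

```python
# Minimal sketch, NOT the authors' pipeline: two generic imputation-accuracy
# metrics on toy data. Genotypes are coded as alternate-allele counts (0, 1, 2).

def concordance(truth, imputed):
    """Fraction of comparable sites where the imputed hard call equals the truth."""
    pairs = [(t, g) for t, g in zip(truth, imputed)
             if t is not None and g is not None]  # skip missing calls
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(t == g for t, g in pairs) / len(pairs)


def dosage_r2(truth, dosage):
    """Squared Pearson correlation between true genotypes and imputed dosages."""
    n = len(truth)
    if n == 0 or n != len(dosage):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_t = sum(truth) / n
    mean_d = sum(dosage) / n
    cov = sum((t - mean_t) * (d - mean_d) for t, d in zip(truth, dosage))
    var_t = sum((t - mean_t) ** 2 for t in truth)
    var_d = sum((d - mean_d) ** 2 for d in dosage)
    if var_t == 0 or var_d == 0:
        return float("nan")  # correlation is undefined without variation
    return (cov * cov) / (var_t * var_d)


# Toy data (hypothetical): known genotypes vs. imputed calls and dosages.
truth = [0, 1, 2, 1, 0, 2]
imputed = [0, 1, 2, 0, 0, 2]
dosage = [0.1, 0.9, 1.8, 0.4, 0.0, 2.0]

print(f"concordance = {concordance(truth, imputed):.3f}")
print(f"dosage r2   = {dosage_r2(truth, dosage):.3f}")
```

In a benchmark of the kind described above, such metrics would be recomputed for each combination of coverage and sample size to see how accuracy changes.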

Results: The accuracy of genotype imputation under ultra-low coverage can be improved by increasing the sample size and applying a set of filters. From the 1744 transferred embryos that resulted in births, we identified 11 genomic risk loci associated with gestational age and 166 genes mapped to these loci by positional, expression quantitative trait locus, and chromatin interaction strategies. Among these mapped genes, CRHBP, ICAM1, and OXTR have been more frequently reported as related to preterm birth. Through joint analysis of gene expression data from previous studies, we constructed the interrelationships of mainly CRHBP, ICAM1, PLAGL1, DNMT1, CNTLN, DKK1, and EGR2 with preterm birth, infant disease, and breast cancer.

Conclusions: This study not only demonstrates that ulcWGS can achieve relatively high, adequate accuracy of genotype imputation and can support GWAS, but also provides insights into the associations between gestational age and genetic variation in embryos from a Chinese population.

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9926832 (PMC)
http://dx.doi.org/10.1186/s13073-023-01158-7 (DOI)

Similar Publications

Improvement of the accuracy of breeding value prediction for egg production traits in Muscovy duck using low-coverage whole-genome sequence data.

Poult Sci

January 2025

Department of Animal Genetics, Breeding and Reproduction, College of Animal Science, South China Agricultural University, Guangzhou, China; Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding and Key Lab of Chicken Genetics, Breeding and Reproduction, Ministry of Agriculture, Guangzhou, China.

Low-coverage whole genome sequencing (lcWGS) is an effective low-cost genotyping technology when combined with genotype imputation approaches. It facilitates cost-effective genomic selection (GS) programs in agricultural animal populations. GS based on lcWGS data has been successfully applied to livestock such as pigs and donkeys.

Transferability of Single- and Cross-Tissue Transcriptome Imputation Models Across Ancestry Groups.

Genet Epidemiol

January 2025

Centre for Genetics and Genomics Versus Arthritis, Centre for Musculoskeletal Research, Division of Musculoskeletal and Dermatological Sciences, The University of Manchester, Manchester, UK.

Transcriptome-wide association studies (TWAS) investigate the links between genetically regulated gene expression and complex traits. TWAS involves imputing gene expression using expression quantitative trait loci (eQTL) as predictors and testing the association between the imputed expression and the trait. The effectiveness of TWAS depends on the accuracy of these imputation models, which require genotype and gene expression data from the same samples.

One of the major challenges in genomic data sharing is protecting participants' privacy in collaborative studies and when genomic data is outsourced to perform analysis tasks, e.g., genotype imputation services and federated collaborations genomic analysis.

Genomic selection is a widely used quantitative method of determining the genetic value of an individual from genomic information and phenotypic data. In this study, we used a large, multi-year training population of 3248 individuals from the University of Florida strawberry (Fragaria × ananassa Duchesne) breeding program. We coupled this training population with a test population of 1460 individuals derived from 20 biparental families.

Epidemiological and genetic factors affecting severe epizootic hemorrhagic disease in Spanish Holstein cattle during the Southern Europe outbreak of 2023.

J Dairy Sci

January 2025

Confederación de Asociaciones de Frisona Española (CONAFE), Ctra. de Andalucía km 23600 Valdemoro, 28340 Madrid, Spain.

Epizootic hemorrhagic disease (EHD) is a non-contagious viral infection that can cause important economic losses in dairy farms. This study aimed to identify epidemiological and genetic factors influencing the susceptibility and severity of EHD in Holstein dairy cattle during the 2023 outbreak in Spain. Data from 2852 animals in 7 affected farms from 5 Spanish provinces were used.
