Deep phenotyping can enhance the power of genetic analysis, including genome-wide association studies (GWAS), but missing phenotypes compromise the potential of such resources. Although many phenotypic imputation methods have been developed, accurately imputing phenotypes for millions of individuals remains challenging. In the present study, we develop a multi-phenotype imputation method based on mixed fast random forest (PIXANT) by leveraging efficient machine learning (ML)-based algorithms. We demonstrate through extensive simulations that PIXANT is reliable, robust and highly resource-efficient. We then apply PIXANT to UK Biobank (UKB) data comprising 277,301 unrelated White British citizens and 425 traits. When GWAS is subsequently performed on the imputed phenotypes, 18.4% more GWAS loci are identified than before imputation (8,710 vs 7,355). The increased statistical power of GWAS identified additional candidate genes affecting heart rate, such as RNF220, SCN10A, and RGS6, suggesting that imputed phenotype data from a large cohort may lead to the discovery of additional candidate genes for complex traits.
Download full-text PDF | Source |
---|---|
http://dx.doi.org/10.1038/s41467-024-55496-0 | DOI Listing |
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11700122 | PMC |
Nat Commun
January 2025
Key Laboratory of Healthy Mariculture for the East China Sea, Ministry of Agriculture and Rural Affairs & Fisheries college, Jimei University, Xiamen, Fujian, People's Republic of China.
medRxiv
June 2023
Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD.
Glycated hemoglobin, fasting glucose, glycated albumin, and fructosamine are biomarkers that reflect different aspects of the glycemic process. Genetic studies of these glycemic biomarkers can shed light on unknown aspects of type 2 diabetes genetics and biology. While several GWAS of glycated hemoglobin and fasting glucose exist, very few have focused on glycated albumin or fructosamine.
Genomics
July 2021
DSI-NRF Centre of Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa.
Despite decades of research and advancements in diagnostics and treatment, tuberculosis remains a major public health concern. New computational methods are needed to interrogate the intersection of host and bacterial genomes. Paired host genotype data and infecting bacterial isolate information were analysed for associations using a multinomial logistic regression framework implemented in SNPTest.
PLoS One
January 2019
Génétique Animale et Biologie Intégrative, INRA, AgroParisTech, Université Paris-Saclay, Jouy-en-Josas, France.
Using GWAS to identify candidate genes associated with cattle morphology traits at a functional level is challenging. The main difficulty in identifying candidate genes and gene interactions associated with such complex traits is the long-range linkage disequilibrium (LD) widely reported in dairy cattle. Systems biology approaches, such as combining the Association Weight Matrix (AWM) with the Partial Correlation and Information Theory (PCIT) algorithm, can help overcome this LD.
Eur J Hum Genet
August 2017
Department of Genomics of Common Disease, Imperial College London, London, UK.
Genome-wide association studies have facilitated the discovery of thousands of loci for hundreds of phenotypes. However, the issue of missing heritability remains unsolved for most complex traits. Locus discovery could be enhanced with both improved power through multi-phenotype analysis (MPA) and use of a wider allele frequency range, including rare variants (RVs).