Haplotype and population structure inference using neural networks in whole-genome sequencing data.

Genome Res

Department of Biology, Bioinformatics Center, University of Copenhagen, DK-2200 Copenhagen, Denmark.

Published: August 2022

Accurate inference of population structure is important in many studies of population genetics. Here we present HaploNet, a method for performing dimensionality reduction and clustering of genetic data. The method uses neural networks to cluster phased haplotypes locally, using either whole-genome sequencing or dense genotype data. By using Gaussian mixtures in a variational autoencoder framework, we are able to learn a low-dimensional latent space in which we cluster haplotypes along the genome in a highly scalable manner. By exploiting the generative properties of our framework, we show that haplotype clusters in the latent space can be used to infer global population structure. Based on fitted neural networks and their latent haplotype clusters, we can perform principal component analysis and estimate ancestry proportions in a maximum likelihood framework. Using sequencing data from simulations and closely related human populations, we show that our approach is better at distinguishing closely related populations than standard admixture and principal component analysis software. We further show that HaploNet is fast and highly scalable by applying it to genotype array data of the UK Biobank.
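
As a rough illustration of how per-window haplotype cluster assignments could feed a principal component analysis, the sketch below one-hot encodes cluster labels, sums them over the two haplotypes of each individual, centers the resulting count matrix, and takes its SVD. It is not HaploNet's implementation: the function names (`simulate_assignments`, `cluster_count_pca`), the use of hard cluster labels, and the simulated two-population data are all assumptions made for this example, and the neural network that would produce the clusters is omitted entirely.

```python
import numpy as np


def simulate_assignments(rng, n_per_pop=50, n_windows=200, n_clusters=4):
    """Simulate hard haplotype-cluster labels for two hypothetical populations.

    Returns labels of shape (2 * n_per_pop, 2, n_windows) and a population
    vector of length 2 * n_per_pop. All values are made up for illustration.
    """
    # Population-specific cluster frequencies per window:
    # shape (2, n_windows, n_clusters), each row sums to 1.
    freqs = rng.dirichlet(np.ones(n_clusters), size=(2, n_windows))
    pops = []
    for p in range(2):
        # Draw a label for both haplotypes of every individual, window by window.
        labels = np.stack(
            [
                rng.choice(n_clusters, size=(n_per_pop, 2), p=freqs[p, w])
                for w in range(n_windows)
            ],
            axis=-1,
        )
        pops.append(labels)
    assignments = np.concatenate(pops, axis=0)
    population = np.repeat([0, 1], n_per_pop)
    return assignments, population


def cluster_count_pca(assignments, n_clusters, n_components=2):
    """PCA on per-individual haplotype-cluster counts (assumed encoding)."""
    n_ind, _, n_win = assignments.shape
    # One-hot encode labels: (n_ind, 2, n_win, n_clusters).
    one_hot = np.eye(n_clusters)[assignments]
    # Sum the two haplotypes so each entry is a count in {0, 1, 2}.
    counts = one_hot.sum(axis=1).reshape(n_ind, n_win * n_clusters)
    # Center each column, then project onto the top singular vectors.
    centered = counts - counts.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


rng = np.random.default_rng(0)
assignments, population = simulate_assignments(rng)
pcs = cluster_count_pca(assignments, n_clusters=4)
```

Here `pcs` holds one row of principal component scores per simulated individual. Hard labels are the simplest encoding to demonstrate; a probabilistic clustering would more naturally contribute cluster probabilities instead of 0/1 indicators.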


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9435741
DOI: http://dx.doi.org/10.1101/gr.276813.122


