Fast admixture analysis and population tree estimation for SNP and NGS data.

Bioinformatics

Departments of Integrative Biology and Statistics, University of California, Berkeley, Berkeley, CA, USA.

Published: July 2017

AI Article Synopsis

  • The article introduces a new optimization algorithm for the STRUCTURE model, which is widely used in population genetics to assign individuals fractionally to discrete ancestry components.
  • On real data, the algorithm finds solutions with higher likelihoods than the state-of-the-art method in the same computational time. It also applies to genotype-likelihood models, which account for the uncertainty in genotype calling associated with Next Generation Sequencing (NGS) data.
  • The new methods, including a technique for estimating population trees from ancestry components using a Gaussian approximation, are implemented in the software package Ohana, which is publicly available online with source code, installation instructions, and example workflows.

Article Abstract

Motivation: Structure methods are widely used in population genetics to assign individuals in a sample fractionally to discrete ancestry components.

Contribution: We introduce a new optimization algorithm for the classical STRUCTURE model in a maximum likelihood framework. Using analyses of real data, we show that the new method finds solutions with higher likelihoods than the state-of-the-art method in the same computational time. The optimization algorithm is also applicable to models based on genotype likelihoods, which can account for the uncertainty in genotype calling associated with Next Generation Sequencing (NGS) data. We also present a new method for estimating population trees from ancestry components using a Gaussian approximation. Using coalescence simulations of diverging populations, we explore the adequacy of the STRUCTURE-style models and the Gaussian assumption for identifying ancestry components correctly and for inferring the correct tree. In most cases, ancestry components are inferred correctly, although sample sizes and times since admixture can influence the results. We show that the popular Gaussian approximation tends to perform poorly under extreme divergence scenarios, for example with very long branch lengths, but the topologies of the population trees are accurately inferred in all scenarios explored. The new methods are implemented, together with appropriate visualization tools, in the software package Ohana.
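
For orientation, the quantity that a maximum likelihood STRUCTURE-style method maximizes can be sketched as follows. This is a minimal illustration, assuming the standard binomial admixture likelihood with an admixture matrix Q (individuals by components, rows summing to 1) and an allele-frequency matrix F (components by loci); the abstract does not spell out the model, and it is not Ohana's code or its optimization algorithm. The function names and matrix layouts are hypothetical. The second function shows the genotype-likelihood variant, which sums over the three possible genotypes instead of trusting a single call.

```python
import math

def individual_allele_freq(q_row, F, j):
    """Allele frequency at locus j for one individual: sum_k q_k * f_kj."""
    return sum(q_k * F[k][j] for k, q_k in enumerate(q_row))

def loglik_called(G, Q, F):
    """Log-likelihood for called genotypes G[i][j] in {0, 1, 2}.

    Assumed model: g_ij ~ Binomial(2, p_ij) with p_ij = sum_k Q[i][k] * F[k][j].
    The constant binomial coefficient is dropped.
    """
    total = 0.0
    for i, q_row in enumerate(Q):
        for j, g in enumerate(G[i]):
            p = individual_allele_freq(q_row, F, j)
            total += g * math.log(p) + (2 - g) * math.log(1.0 - p)
    return total

def loglik_genotype_likelihoods(GL, Q, F):
    """Log-likelihood when GL[i][j] = (P(data|g=0), P(data|g=1), P(data|g=2)).

    Each genotype is weighted by its binomial probability given p_ij,
    so uncertain calls contribute a mixture rather than a single term.
    """
    total = 0.0
    for i, q_row in enumerate(Q):
        for j, gl in enumerate(GL[i]):
            p = individual_allele_freq(q_row, F, j)
            prior = ((1.0 - p) ** 2, 2.0 * p * (1.0 - p), p ** 2)
            total += math.log(sum(l * w for l, w in zip(gl, prior)))
    return total

# Toy data (made up for illustration): 2 individuals, 2 components, 2 loci.
Q = [[1.0, 0.0], [0.5, 0.5]]
F = [[0.1, 0.8], [0.9, 0.2]]
G = [[0, 2], [1, 1]]
# One-hot genotype likelihoods encode the same calls with full certainty.
GL = [[(1, 0, 0), (0, 0, 1)], [(0, 1, 0), (0, 1, 0)]]

ll_called = loglik_called(G, Q, F)
ll_gl = loglik_genotype_likelihoods(GL, Q, F)
```

With fully certain genotype likelihoods the two functions differ only by the dropped binomial coefficients (here log 2 for each heterozygous call), which is one way to sanity-check an implementation of either form.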

Availability And Implementation: Ohana is publicly available at https://github.com/jade-cheng/ohana. In addition to source code and installation instructions, we also provide example workflows on the project wiki.

Contact: jade.cheng@birc.au.dk.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6543773
DOI: http://dx.doi.org/10.1093/bioinformatics/btx098
