Model-based (likelihood and Bayesian) and non-model-based (PCA and K-means clustering) methods have been developed to identify populations and assign individuals to them using marker genotype data. Model-based methods are favoured because they rest on a probabilistic model of population genetics with biologically meaningful parameters, and thus produce results that are easy to interpret and apply. Furthermore, they often yield more accurate structure inferences than non-model-based methods. However, current model-based methods are either computationally demanding, and thus applicable to small problems only, or use simplified admixture models that can yield inaccurate results in difficult situations such as unbalanced sampling. In this study, I propose new likelihood methods for fast and accurate population admixture inference using genotype data ranging from a few multiallelic microsatellites to millions of diallelic SNPs. The methods first conduct a clustering analysis of coarse-grained population structure using the mixture model and a simulated annealing algorithm, and then an admixture analysis of fine-grained population structure using the clustering results as the starting point of an expectation maximisation algorithm. Extensive analyses of both simulated and empirical data show that the new methods compare favourably with existing methods in both accuracy and running speed. They can analyse small datasets with just a few multiallelic microsatellites, and can also handle, in parallel, terabytes of data with millions of markers and millions of individuals. In difficult situations, such as many and/or weakly differentiated populations or unbalanced or very small samples of individuals, the new methods are substantially more accurate than other methods.
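
The second stage described in the abstract, an expectation maximisation (EM) admixture analysis started from clustering results, can be illustrated with a minimal sketch. This is not the paper's implementation: it is a textbook EM update for a basic admixture model on diallelic SNPs, with no missing data, no parallelism, and a fixed iteration count. The function name `em_admixture`, the `init_labels` argument (a stand-in for the output of the clustering stage), and the 0.9/0.1 soft start are all assumptions made for illustration.

```python
def em_admixture(genotypes, init_labels, k, n_iter=50, eps=1e-6):
    """EM for a basic admixture model on diallelic SNPs (illustrative sketch).

    genotypes[i][l] is the number of reference-allele copies (0, 1 or 2)
    carried by individual i at locus l. init_labels[i] is a hard cluster
    label in range(k), assumed to come from a prior clustering step.
    Returns (q, f): q[i][c] is the admixture proportion of individual i
    from population c; f[c][l] is the reference-allele frequency of
    population c at locus l.
    """
    n, m = len(genotypes), len(genotypes[0])

    # Soft start: put most of the weight on the assigned cluster.
    q = [[0.1 / k + (0.9 if c == init_labels[i] else 0.0) for c in range(k)]
         for i in range(n)]

    # Initial allele frequencies from cluster allele counts, with a pseudocount.
    f = []
    for c in range(k):
        members = [i for i in range(n) if init_labels[i] == c]
        f.append([(sum(genotypes[i][l] for i in members) + 1.0)
                  / (2.0 * len(members) + 2.0) for l in range(m)])

    for _ in range(n_iter):
        ref = [[0.0] * m for _ in range(k)]  # expected reference copies from pop c
        alt = [[0.0] * m for _ in range(k)]  # expected alternative copies from pop c
        q_sum = [[0.0] * k for _ in range(n)]

        # E-step: split each allele copy across its possible source populations.
        for i in range(n):
            for l in range(m):
                g = genotypes[i][l]
                w_ref = [q[i][c] * f[c][l] for c in range(k)]
                w_alt = [q[i][c] * (1.0 - f[c][l]) for c in range(k)]
                s_ref, s_alt = sum(w_ref), sum(w_alt)
                for c in range(k):
                    e_ref = g * w_ref[c] / s_ref
                    e_alt = (2 - g) * w_alt[c] / s_alt
                    ref[c][l] += e_ref
                    alt[c][l] += e_alt
                    q_sum[i][c] += e_ref + e_alt

        # M-step: renormalise the expected counts.
        q = [[v / (2.0 * m) for v in row] for row in q_sum]
        for c in range(k):
            for l in range(m):
                total = ref[c][l] + alt[c][l]
                if total > 0.0:
                    # Clamp away from 0 and 1 so later E-steps never divide by zero.
                    f[c][l] = min(1.0 - eps, max(eps, ref[c][l] / total))
    return q, f
```

Called on a genotype matrix with hard labels from any clustering step, the function returns one row of admixture proportions per individual, each summing to one. A good starting point matters because EM only climbs to a local maximum of the likelihood, which is presumably why the abstract pairs it with a preceding clustering analysis.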

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9338324
DOI: http://dx.doi.org/10.1038/s41437-022-00535-z

Similar Publications

When haemoglobin genotyping was implemented in the early 1960s to investigate population genetic structure in Atlantic cod (Gadus morhua), it became one of the first molecular genetic markers deployed in fisheries research worldwide. However, its suitability was questioned due to its potential for selection. While the issue of neutrality concerned the first population geneticists, markers under selection are now routinely used to study population genetic structure.

Microorganisms, though crucial for environmental equilibrium, can be destructive, causing detrimental pathophysiology in the human host. Moreover, with the emergence of antibiotic resistance (ABR), microbial communities pose one of the century's largest public health challenges in terms of effective treatment strategies. Furthermore, given the large diversity and number of known bacterial strains, describing treatment choices for infected patients using experimental methodologies is time-consuming.

The largest risk factor for dementia is age. Heterochronic blood exchange studies have uncovered age-related blood factors that demonstrate 'pro-aging' or 'pro-youthful' effects on the mouse brain. The clinical relevance and combined effects of these factors for humans are unclear.

Los olvidados: Non-BRCA variants associated with Hereditary breast cancer in Mexican population.

Breast Cancer Res

January 2025

Servicio de Oncología, Centro Universitario Contra el Cáncer (CUCC), Hospital Universitario "Dr. José Eleuterio González", Universidad Autónoma de Nuevo León, 66451, Monterrey, Nuevo León, México.

Background: Hereditary breast and ovarian cancer syndrome (HBOC) is a pathological condition with increased risk of cancer, including breast cancer (BC), ovarian cancer (OC), and others. HBOC is caused mainly by germline pathogenic variants (GPV) in the BRCA1 and BRCA2 genes. However, other genes are also relevant to the diagnosis, prognosis, and treatment of this syndrome, including TP53, PALB2, CHEK2, ATM, etc.

A rare haplotype of the GJD3 gene segregating in familial Meniere's disease interferes with connexin assembly.

Genome Med

January 2025

Otology & Neurotology Group CTS495, Instituto de Investigación Biosanitario, Ibs.GRANADA, Universidad de Granada, 18071, Granada, Spain.

Background: Familial Meniere's disease (FMD) is a rare polygenic disorder of the inner ear. Mutations in the connexin gene family, which encodes gap junction proteins, can also cause hearing loss, but their role in FMD is largely unknown.

Methods: We retrieved exome sequencing data from 94 individuals in 70 Meniere's disease (MD) families.
