We present a geometry-based interpretation of the f-statistics framework, commonly used in population genetics to estimate phylogenetic relationships from genomic data. The focus is on determining the mixing coefficients in population admixture events subject to post-admixture drift. The interpretation takes advantage of the high dimension of the dataset and treats the problem as one of dimensionality reduction. We show that the f-statistics technique can be thought of as an implicit transformation of the genomic data from a phase space into a subspace where the mapped data structure is more similar to the ancestral admixture configuration. The estimation of the 2-way mixing coefficient is, in fact, carried out implicitly in this subspace. In addition, we propose that the admixture test be evaluated in the subspace, because comparing it with the conventional test provides an important assessment of the admixture model. The overarching geometric framework provides slightly more general formulas than the f-formalism by using a different rationale as a starting point. We explicitly address 2- and 3-way admixtures. The mixture proportions are provided by suitable linear fits, in 2 or 3 dimensions, that can be easily visualized. The difficulties encountered with introgression and gene flow are also addressed. The developments and findings are illustrated with numerical simulations and real-world cases.
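As a rough illustration of the geometric idea, the sketch below estimates a 2-way mixing coefficient by a one-dimensional linear fit of allele frequencies. It is not the formula from the paper: the function names `simulate` and `mixing_coefficient`, the uniform source frequencies, and the modelling of post-admixture drift as independent Gaussian noise are all assumptions made for this toy example. With many SNPs, the drift is nearly orthogonal to the line joining the two sources, so projecting onto that line approximately recovers the mixing coefficient.

```python
import numpy as np


def simulate(alpha=0.3, n_snps=50_000, drift_sd=0.03, seed=0):
    """Toy data: allele frequencies for sources A, B and an admixed X.

    Assumption: post-admixture drift is modelled as independent
    Gaussian noise per SNP, which is a simplification.
    """
    rng = np.random.default_rng(seed)
    a = rng.uniform(0.2, 0.8, n_snps)  # source population A
    b = rng.uniform(0.2, 0.8, n_snps)  # source population B
    drift = rng.normal(0.0, drift_sd, n_snps)
    x = alpha * a + (1.0 - alpha) * b + drift
    return a, b, x


def mixing_coefficient(a, b, x):
    """Least-squares fit of x - b = alpha * (a - b).

    Geometrically, this projects X onto the line through A and B
    in the high-dimensional allele-frequency space.
    """
    d = a - b
    return float(np.dot(x - b, d) / np.dot(d, d))


if __name__ == "__main__":
    a, b, x = simulate(alpha=0.3)
    print(f"estimated alpha: {mixing_coefficient(a, b, x):.4f}")
```

Both the numerator and the denominator are sums of products of allele-frequency differences, which is the kind of quantity f-statistics are built from; the denominator, divided by the number of SNPs, is an uncorrected estimate of f2(A, B). Sampling-bias corrections and outgroup populations, which real analyses generally need, are omitted here.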


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11639143
DOI: http://dx.doi.org/10.1093/genetics/iyae134

Similar Publications

Marek's Disease (MD), which can result in neurological damage and tumour formation, has large effects on the economy and animal welfare of the poultry industry worldwide. Previously, we mapped autosomal MD QTL regions (QTLRs) by individual genotyping of an F population from a full-sib advanced intercross line. We further mapped MD QTLRs on the chicken Z chromosome (GGZ) using the same F population, and by selective DNA pooling (SDP) of 8 elite egg production lines.


A Multiplex High-Resolution Melting (HRM) assay to differentiate Fusarium graminearum chemotypes.

Sci Rep

December 2024

Cereal Disease Laboratory, Agricultural Research Service, US Department of Agriculture, St. Paul, MN, 55108, USA.

Fusarium graminearum is a primary cause of Fusarium head blight (FHB) on wheat and barley. The fungus produces trichothecene mycotoxins that render grain unsuitable for food, feed, or malt. Isolates of F.


Warfarin is the most widely used oral anticoagulant in clinical practice. The cytochrome P450 2C9 (CYP2C9), vitamin K epoxide reductase complex 1 (VKORC1), and cytochrome P450 4F2 (CYP4F2) genotypes are associated with warfarin dose requirements in China. Accurate genotyping is vital for obtaining reliable genotype-guided warfarin dosing information.


Genetic diversity can influence fitness components such as survival and reproductive success. Yet the association between genetic diversity and fitness based on neutral loci is sometimes very weak and inconsistent, with relationships varying among taxa due to confounding effects of population demography and life history. Fitness-diversity relationships are likely to be stronger and more consistent for genes known to influence phenotypic traits, such as immunity-related genes, and may also depend on the genetic differences between breeding partners.


Quantifying the regulatory potential of genetic variants via a hybrid sequence-oriented model with SVEN.

Nat Commun

December 2024

State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Biomedical Pioneering Innovative Center (BIOPIC) and Beijing Advanced Innovation Center for Genomics (ICG), Center for Bioinformatics (CBI), Peking University, 100871, Beijing, China.

Deciphering how noncoding DNA determines gene expression is critical for decoding the functional genome. Understanding the transcription effects of noncoding genetic variants remains a major unsolved problem, and solving it is critical for downstream applications in human genetics and precision medicine. Here, we integrate regulatory-specific neural networks and tissue-specific gradient-boosting trees to build SVEN: a hybrid sequence-oriented architecture that can accurately predict tissue-specific gene expression levels and quantify the tissue-specific transcriptomic impacts of structural variants across more than 350 tissues and cell lines.

