We present a geometry-based interpretation of the f-statistics framework, commonly used in population genetics to estimate phylogenetic relationships from genomic data. The focus is on determining the mixing coefficients in population admixture events subject to post-admixture drift. The interpretation takes advantage of the high dimension of the dataset and treats the problem as one of dimensionality reduction. We show that the f-statistics technique can be viewed as an implicit transformation of the genomic data from a phase space into a subspace where the structure of the mapped data is more similar to the ancestral admixture configuration. The estimation of the 2-way mixing coefficient is, in fact, carried out implicitly in this subspace. In addition, we propose that the admixture test be evaluated in the subspace, because comparison with the conventional test provides an important assessment of the admixture model. The overarching geometric framework yields slightly more general formulas than the f-formalism by starting from a different rationale. Both 2- and 3-way admixtures are addressed explicitly. The mixture proportions are obtained from suitable linear fits, in 2 or 3 dimensions, that can be easily visualized. The difficulties encountered with introgression and gene flow are also addressed. The developments and findings are illustrated with numerical simulations and real-world cases.
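
As an illustration of the geometric reading described above, the following minimal sketch uses the standard f4-ratio estimator for a 2-way admixture, not the more general formulas of the paper. It treats each population as a point in SNP space and shows that the f4 ratio coincides with a ratio of projections onto a single axis. The toy tree ((A, B), C) with outgroup O, the Gaussian drift model, the population labels, and all function names and parameter values are assumptions made for this example only.

```python
import numpy as np


def f4(p1, p2, p3, p4):
    """f4 statistic: mean over SNPs of (p1 - p2) * (p3 - p4)."""
    return float(np.mean((p1 - p2) * (p3 - p4)))


def simulate(alpha, n_snps=200_000, drift_sd=0.05, seed=0):
    """Toy allele frequencies on the tree ((A, B), C) with outgroup O.

    Assumption: drift is modelled as additive Gaussian noise, which is
    only a rough stand-in for real genetic drift.
    """
    rng = np.random.default_rng(seed)

    def drift(p):
        return p + rng.normal(0.0, drift_sd, size=n_snps)

    root = rng.uniform(0.2, 0.8, size=n_snps)
    out = drift(root)          # outgroup O
    abc = drift(root)          # ancestor of A, B and C
    c_anc = drift(abc)         # C-side source of the admixture
    ab = drift(abc)            # ancestor of A and B
    a = drift(ab)
    b_anc = drift(ab)          # B-side source of the admixture
    b = drift(b_anc)
    c = drift(c_anc)
    # X is a 2-way mixture of the two sources, followed by its own drift.
    x = drift(alpha * b_anc + (1.0 - alpha) * c_anc)
    return {"A": a, "B": b, "C": c, "O": out, "X": x}


def f4_ratio(pops):
    """Mixing coefficient as f4(A, O; X, C) / f4(A, O; B, C)."""
    num = f4(pops["A"], pops["O"], pops["X"], pops["C"])
    den = f4(pops["A"], pops["O"], pops["B"], pops["C"])
    return num / den


def projection_ratio(pops):
    """Same quantity, written as a ratio of projections onto the axis A - O."""
    axis = pops["A"] - pops["O"]
    return float(np.dot(pops["X"] - pops["C"], axis)
                 / np.dot(pops["B"] - pops["C"], axis))


if __name__ == "__main__":
    pops = simulate(alpha=0.3)
    print("f4 ratio:        ", f4_ratio(pops))
    print("projection ratio:", projection_ratio(pops))
```

In this toy setting the post-admixture drift of X is independent of the axis A - O, so its contribution to the projection averages out over many SNPs. This is one way to picture why the coefficient can be read off in a low-dimensional subspace.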
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11639143 | PMC |
http://dx.doi.org/10.1093/genetics/iyae134 | DOI Listing |
Sci Rep
December 2024
Hy-Line International, 2583 240th St, PO Box 310, Dallas Center, 50063, IA, USA.
Marek's Disease (MD), which can result in neurological damage and tumour formation, has large effects on the economy and animal welfare of the poultry industry worldwide. Previously, we mapped autosomal MD QTL regions (QTLRs) by individual genotyping of an F population from a full-sib advanced intercross line. We further mapped MD QTLRs on the chicken Z chromosome (GGZ) using the same F population, and by selective DNA pooling (SDP) of 8 elite egg production lines.
Sci Rep
December 2024
Cereal Disease Laboratory, Agricultural Research Service, US Department of Agriculture, St. Paul, MN, 55108, USA.
Fusarium graminearum is a primary cause of Fusarium head blight (FHB) on wheat and barley. The fungus produces trichothecene mycotoxins that render grain unsuitable for food, feed, or malt. Isolates of F.
Sci Rep
December 2024
Laboratory Medicine, First Affiliated Hospital of Gannan Medical University, Ganzhou, 341000, China.
Warfarin is the most widely used oral anticoagulant in clinical practice. The cytochrome P450 2C9 (CYP2C9), vitamin K epoxide reductase complex 1 (VKORC1), and cytochrome P450 4F2 (CYP4F2) genotypes are associated with warfarin dose requirements in China. Accurate genotyping is vital for obtaining reliable genotype-guided warfarin dosing information.
Sci Rep
December 2024
Department of Biology, Queen's University, Kingston, ON, Canada.
Genetic diversity can influence fitness components such as survival and reproductive success. Yet the association between genetic diversity and fitness based on neutral loci is sometimes very weak and inconsistent, with relationships varying among taxa due to confounding effects of population demography and life history. Fitness-diversity relationships are likely to be stronger and more consistent for genes known to influence phenotypic traits, such as immunity-related genes, and may also depend on the genetic differences between breeding partners.
Nat Commun
December 2024
State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Biomedical Pioneering Innovative Center (BIOPIC) and Beijing Advanced Innovation Center for Genomics (ICG), Center for Bioinformatics (CBI), Peking University, 100871, Beijing, China.
Deciphering how noncoding DNA determines gene expression is critical for decoding the functional genome. Understanding the transcriptional effects of noncoding genetic variants remains a major unsolved problem and is critical for downstream applications in human genetics and precision medicine. Here, we integrate regulatory-specific neural networks and tissue-specific gradient-boosting trees to build SVEN: a hybrid sequence-oriented architecture that can accurately predict tissue-specific gene expression levels and quantify the tissue-specific transcriptomic impacts of structural variants across more than 350 tissues and cell lines.