Training set size is an important determinant of genomic prediction accuracy. Plant breeding programs are characterized by a high degree of structuring, particularly into populations. This hampers the establishment of large training sets for each population. Pooling populations increases training set size but ignores unique genetic characteristics of each. A possible solution is partial pooling with multilevel models, which allows estimating population-specific marker effects while still leveraging information across populations. We developed a Bayesian multilevel whole-genome regression model and compared its performance with that of the popular BayesA model applied to each population separately (no pooling) and to the joined data set (complete pooling). As an example, we analyzed a wide array of traits from the nested association mapping maize population. There we show that for small population sizes (e.g., <50), partial pooling increased prediction accuracy over no or complete pooling for populations represented in the training set. When populations were large, however, no pooling was superior. In another example data set of interconnected biparental maize populations, either partial or complete pooling was superior, depending on the trait. A simulation showed that no pooling is superior when differences in genetic effects among populations are large, and partial pooling is superior when they are intermediate. With small differences, partial and complete pooling achieved equally high accuracy. For prediction of new populations, partial and complete pooling had very similar accuracy in all cases. We conclude that partial pooling with multilevel models can maximize the potential of pooling by making optimal use of information in pooled training sets.
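The contrast between no, complete, and partial pooling can be illustrated with a minimal sketch. This is not the Bayesian multilevel model developed in the paper, nor BayesA. It is a simple ridge-penalized analogue in which the marker effects of each population are modeled as a shared effect plus a population-specific deviation. The function name `fit_partial_pooling` and the penalty parameters `lam_shared` and `lam_dev` are hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_partial_pooling(X_list, y_list, lam_shared=1.0, lam_dev=1.0):
    """Ridge-type sketch (an assumption, not the paper's model).

    Marker effects in population p are modeled as shared + deviation_p.
    X_list: one (n_p, m) marker matrix per population.
    y_list: one (n_p,) phenotype vector per population.
    Returns an (n_pop, m) array of population-specific marker effects.
    """
    n_pop = len(X_list)
    m = X_list[0].shape[1]
    n_total = sum(X.shape[0] for X in X_list)

    # Design matrix: first m columns carry the shared effects, followed by
    # one block of m columns per population for its deviations.
    Z = np.zeros((n_total, m * (n_pop + 1)))
    row = 0
    for p, X in enumerate(X_list):
        n = X.shape[0]
        Z[row:row + n, :m] = X
        Z[row:row + n, m * (p + 1):m * (p + 2)] = X
        row += n
    y = np.concatenate(y_list)

    # Separate ridge penalties for shared effects and deviations.
    penalty = np.concatenate([np.full(m, lam_shared),
                              np.full(m * n_pop, lam_dev)])
    coef = np.linalg.solve(Z.T @ Z + np.diag(penalty), Z.T @ y)

    shared = coef[:m]
    deviations = coef[m:].reshape(n_pop, m)
    return shared + deviations
```

In this sketch, a very large `lam_dev` shrinks all deviations to zero and approaches complete pooling, a very large `lam_shared` removes the shared component and approaches separate per-population fits (no pooling), and intermediate values give partial pooling. The paper instead estimates the degree of pooling within a Bayesian model rather than fixing penalties by hand.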
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4528317 | PMC |
http://dx.doi.org/10.1534/g3.115.019299 | DOI Listing |
J Infect
January 2025
Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, United States.
Background: Pneumococcal conjugate vaccines (PCVs) introduced in childhood national immunization programs lowered vaccine-type invasive pneumococcal disease (IPD), but replacement with non-vaccine-types persisted throughout the PCV10/13 follow-up period. We assessed PCV10/13 impact on pneumococcal meningitis incidence globally.
Methods: The number of cases with serotyped pneumococci detected in cerebrospinal fluid and population denominators were obtained from surveillance sites globally.
Ecol Evol
January 2025
Department of Zoology, Fisheries, Hydrobiology and Apiculture, Faculty of Agronomy, Mendel University in Brno, Brno, Czech Republic.
This study evaluates the response of ground beetle (Coleoptera: Carabidae) assemblages to forest management practices by integrating species composition, body traits, wing morphology and developmental instability. Traditional approaches that rely on averaged identity-based descriptors often overlook phenotypic plasticity and functional trait variability, potentially masking species-specific responses to environmental changes. To address this gap, we applied a three-layered analytical approach, utilising ground beetle occurrence and morphological trait data from Podyjí National Park, Czech Republic.
Behav Res Methods
January 2025
Methods Center, Eberhard Karls University of Tübingen, Haußerstr. 11, 72076, Tübingen, Germany.
Due to the increased availability of intensive longitudinal data, researchers have been able to specify increasingly complex dynamic latent variable models. However, these models present challenges related to overfitting, hierarchical features, non-linearity, and sample size requirements. There are further limitations to be addressed regarding the finite sample performance of priors, including bias, accuracy, and type I error inflation.
Br J Math Stat Psychol
January 2025
Department of Community Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada.
Recent technological advancements have enabled the collection of intensive longitudinal data (ILD), consisting of repeated measurements from the same individual. The threshold autoregressive (TAR) model is often used to capture the dynamic outcome process in ILD, with autoregressive parameters varying based on outcome variable levels. For ILD from multiple individuals, multilevel TAR (ML-TAR) models have been proposed, with Bayesian approaches typically used for parameter estimation.