The field of population genomics has grown rapidly in response to the recent advent of affordable, large-scale sequencing technologies. As opposed to the situation during the majority of the 20th century, in which the development of theoretical and statistical population genetic insights outpaced the generation of data to which they could be applied, genomic data are now being produced at a far greater rate than they can be meaningfully analyzed and interpreted. With this wealth of data has come a tendency to focus on fitting specific (and often rather idiosyncratic) models to data, at the expense of a careful exploration of the range of possible underlying evolutionary processes. For example, the approach of directly investigating models of adaptive evolution in each newly sequenced population or species often neglects the fact that a thorough characterization of ubiquitous nonadaptive processes is a prerequisite for accurate inference. We here describe the perils of these tendencies, present our consensus views on current best practices in population genomic data analysis, and highlight areas of statistical inference and theory that are in need of further attention. Thereby, we argue for the importance of defining a biologically relevant baseline model tuned to the details of each new analysis, of skepticism and scrutiny in interpreting model fitting results, and of carefully defining addressable hypotheses and underlying uncertainties.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9154105
DOI: http://dx.doi.org/10.1371/journal.pbio.3001669
