Understanding natural selection in humans and other species is a major focus for the use of machine learning in population genetics. Existing methods rely on simulated training data that are computationally expensive to generate. Unlike demographic inference, which can use efficient neutral coalescent simulations, realistic simulations of selection typically require slow forward simulations. Because there are many possible modes of selection, a high-dimensional parameter space must be explored, with no guarantee that the simulated models are close to the real processes. Mismatches between simulated training data and real test data can lead to incorrect inference. Finally, trained neural networks are difficult to interpret, so it is often unclear which features contribute to classification. Here we develop a new approach to detect selection that requires relatively few selection simulations during training. We use a Generative Adversarial Network (GAN) trained to simulate realistic neutral data. The resulting GAN consists of a generator (a fitted demographic model) and a discriminator (a convolutional neural network). For a genomic region, the discriminator predicts whether it is "real" or "fake", where "fake" means that it could have been simulated by the generator. Because the "real" training data include regions that experienced selection and the generator cannot produce such regions, regions with a high predicted probability of being real are likely to have experienced selection. To further incentivize this behavior, we "fine-tune" the discriminator with a small number of selection simulations. We show that this approach has high power to detect selection in simulations, and that it finds regions under selection identified by state-of-the-art population genetic methods in three human populations. Finally, we show how to interpret the trained networks by clustering hidden units of the discriminator based on their correlation patterns with known summary statistics.
In summary, our approach is a novel, efficient, and powerful way to use machine learning to detect natural selection.
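
As a rough illustration of the scoring idea described above, the sketch below ranks genomic regions by a discriminator's predicted probability of being "real" and flags those that exceed a high quantile of the scores obtained on neutral simulations. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `flag_candidate_regions`, the quantile-based threshold, and the toy score distributions are all hypothetical, and the abstract does not specify how candidate regions are actually called.

```python
import numpy as np


def flag_candidate_regions(real_scores, neutral_scores, quantile=0.99):
    """Flag real regions whose discriminator score is unusually high.

    real_scores:    predicted P("real") for each real genomic region.
    neutral_scores: predicted P("real") for regions simulated under the
                    fitted neutral model (the generator).
    quantile:       hypothetical cutoff; the threshold is this quantile
                    of the neutral scores.

    Returns (indices, threshold), with indices sorted by descending score.
    """
    real_scores = np.asarray(real_scores, dtype=float)
    neutral_scores = np.asarray(neutral_scores, dtype=float)

    # Threshold taken from the neutral score distribution (an assumption).
    threshold = np.quantile(neutral_scores, quantile)

    # Regions scoring strictly above the threshold are candidates.
    candidates = np.flatnonzero(real_scores > threshold)

    # Rank candidates from highest to lowest score.
    order = np.argsort(-real_scores[candidates])
    return candidates[order], threshold


# Toy usage with made-up scores (not real data).
rng = np.random.default_rng(0)
neutral = rng.beta(2, 5, size=1000)            # stand-in neutral scores
real = np.concatenate([rng.beta(2, 5, size=95),
                       np.full(5, 0.999)])     # five artificially high scores
idx, thr = flag_candidate_regions(real, neutral)
print(f"threshold = {thr:.3f}, flagged regions = {idx.tolist()}")
```

Calibrating the cutoff on neutral simulations is only one possible design choice; it bounds the fraction of neutral-looking regions that would be flagged, but it relies on the fitted neutral model being a good match to the real data.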


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10028936
DOI: http://dx.doi.org/10.1101/2023.03.07.531546

