Microbial source tracking leverages a wide range of approaches designed to trace the origins of fecal contamination in aquatic environments. Although source tracking methods are typically employed within the laboratory setting, computational techniques can be leveraged to advance microbial source tracking methodology. Herein, we present a logic regression-based supervised learning approach for the discovery of source-informative genetic markers within intergenic regions across the genome that can be used for source tracking. With just single intergenic loci, logic regression was able to identify highly source-specific (i.e., exceeding 97.00%) biomarkers for a wide range of host and niche sources, with sensitivities reaching as high as 30.00%-50.00% for certain source categories, including pig, sheep, mouse, and wastewater, depending on the specific intergenic locus analyzed. Restricting the source range to reflect the most prominent zoonotic sources of transmission (i.e., bovine, chicken, human, and pig) allowed for the generation of informative biomarkers for all host categories, with specificities of at least 90.00% and sensitivities between 12.50% and 70.00%, using the sequence data from key intergenic regions, including , and , that appear to be involved in antibiotic resistance. Remarkably, we were able to use this approach to classify 48 out of 113 river water isolates collected in Northwestern Sweden as either beaver, human, or reindeer in origin with a high degree of consensus, thus highlighting the potential of logic regression modeling as a novel approach for augmenting current source tracking efforts.

IMPORTANCE: The presence of microbial contaminants, particularly from fecal sources, within water poses a serious risk to public health. The health and economic burden of waterborne pathogens can be substantial; as such, the ability to detect and identify the sources of fecal contamination in environmental waters is crucial for the control of waterborne diseases. This can be accomplished through microbial source tracking, which involves the use of various laboratory techniques to trace the origins of microbial pollution in the environment. Building on current source tracking methodology, we describe a novel workflow that uses logic regression, a supervised machine learning method, to discover genetic markers in , a common fecal indicator bacterium, that can be used for source tracking efforts. Importantly, our research provides an example of how the rise in prominence of machine learning algorithms can be applied to improve upon current microbial source tracking methodology.
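
To make the reported specificity and sensitivity figures concrete, the following is a minimal sketch of how a single Boolean marker of the kind logic regression produces could be scored against isolates of known origin. It is not the authors' workflow: the feature names `snp_a` and `snp_b`, the toy isolates, and the helper `evaluate_marker` are all hypothetical and exist only for illustration.

```python
from typing import Callable, Dict, List, Tuple

# One isolate = presence/absence of sequence variants at an intergenic locus.
# The variant names used below are hypothetical placeholders.
Isolate = Dict[str, bool]


def evaluate_marker(
    rule: Callable[[Isolate], bool],
    isolates: List[Isolate],
    labels: List[str],
    target: str,
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of a Boolean rule for one source."""
    tp = fp = tn = fn = 0
    for features, label in zip(isolates, labels):
        predicted = rule(features)   # does the marker fire on this isolate?
        actual = label == target     # is the isolate truly from the target source?
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif not predicted and actual:
            fn += 1
        else:
            tn += 1
    # Sensitivity: fraction of target-source isolates the marker detects.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of non-target isolates the marker correctly rejects.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Toy data (invented for illustration, not taken from the study).
ISOLATES: List[Isolate] = [
    {"snp_a": True, "snp_b": False},
    {"snp_a": True, "snp_b": True},
    {"snp_a": False, "snp_b": False},
    {"snp_a": True, "snp_b": False},
    {"snp_a": False, "snp_b": True},
    {"snp_a": False, "snp_b": False},
]
LABELS: List[str] = ["pig", "pig", "human", "human", "bovine", "chicken"]


def pig_rule(features: Isolate) -> bool:
    """Hypothetical logic rule: variant A present AND variant B absent."""
    return features["snp_a"] and not features["snp_b"]


SENSITIVITY, SPECIFICITY = evaluate_marker(pig_rule, ISOLATES, LABELS, "pig")
```

On this toy set the rule detects one of the two pig isolates and fires on one of the four non-pig isolates. A marker with high specificity but modest sensitivity, as described in the abstract, rarely mislabels other sources yet misses many isolates from its own source.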


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11267920
DOI: http://dx.doi.org/10.1128/aem.00227-24


