Disentangling cobionts and contamination in long-read genomic data using sequence composition.

G3 (Bethesda)

Tree of Life, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK.

Published: November 2024

The recent acceleration in genome sequencing targeting previously unexplored parts of the tree of life presents computational challenges. Samples collected from the wild often contain sequences from several organisms, including the target, its cobionts, and contaminants. Effective methods are therefore needed to separate these sequences by origin. Though advances in sequencing technology make this task easier, it remains difficult to taxonomically assign sequences from eukaryotic taxa that are not well represented in databases. Therefore, reference-based methods alone are insufficient. Here, I examine how we can take advantage of differences in sequence composition between organisms to identify symbionts, parasites, and contaminants in samples, with minimal reliance on reference data. To this end, I explore data from the Darwin Tree of Life project, including hundreds of high-quality HiFi read sets from insects. Visualizing two-dimensional representations of read tetranucleotide composition learned by a variational autoencoder can reveal distinct components of a sample. Annotating the embeddings with additional information, such as coding density, estimated coverage, or taxonomic labels, allows rapid assessment of the contents of a dataset. The approach scales to millions of sequences, making it possible to explore unassembled read sets, even for large genomes. Combined with interactive visualization tools, it allows a large fraction of cobionts reported by reference-based screening to be identified. Crucially, it also facilitates retrieving genomes for which suitable reference data are absent.
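
The input feature named in the abstract, read tetranucleotide composition, can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function name `tetranucleotide_profile` is hypothetical, counting is strand-specific (the published method may well collapse reverse complements), and windows containing ambiguous bases are simply skipped. The variational autoencoder that would map these 256-dimensional vectors to two dimensions is not shown.

```python
from itertools import product

# All 256 tetranucleotides in lexicographic order; the list position of a
# tetranucleotide is its feature index. (Illustrative layout, an assumption.)
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]
KMER_INDEX = {kmer: i for i, kmer in enumerate(KMERS)}


def tetranucleotide_profile(seq):
    """Return a 256-element list of tetranucleotide frequencies for one read.

    Hypothetical helper: strand-specific counts, normalized to sum to 1.
    Windows containing N or other ambiguity codes are ignored.
    """
    seq = seq.upper()
    counts = [0] * len(KMERS)
    for i in range(len(seq) - 3):
        idx = KMER_INDEX.get(seq[i:i + 4])
        if idx is not None:
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        # Read too short or entirely ambiguous: return an all-zero profile.
        return [0.0] * len(KMERS)
    return [c / total for c in counts]


# Usage: build one feature row per read (toy sequences, not real data).
reads = {"read_1": "ACGTACGTTTGA", "read_2": "GGGGCCCCGGGG"}
features = {name: tetranucleotide_profile(s) for name, s in reads.items()}
```

Each read becomes one fixed-length row regardless of its length, which is what lets composition-based methods handle millions of unassembled reads without reference data.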

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11540323
http://dx.doi.org/10.1093/g3journal/jkae187
