Plasmids are mobile genetic elements that carry important accessory genes. Cataloging plasmids is a fundamental step in elucidating their roles in promoting horizontal gene transfer between bacteria. Next-generation sequencing (NGS) is the main source for discovering new plasmids today. However, NGS assembly programs tend to return contigs rather than complete sequences, making plasmid detection difficult. This problem is particularly severe for metagenomic assemblies, which contain short contigs of heterogeneous origins. Available tools for plasmid contig detection still have limitations. In particular, alignment-based tools tend to miss diverged plasmids, while learning-based tools often have lower precision. In this work, we develop a plasmid detection tool, PLASMe, that capitalizes on the strengths of both alignment-based and learning-based methods. Closely related plasmids can be easily identified using the alignment component in PLASMe, while diverged plasmids can be predicted using order-specific Transformer models. By encoding plasmid sequences as a language defined on a protein cluster-based token set, the Transformer can learn the importance of proteins and their correlations through positional token embedding and the attention mechanism. We compared PLASMe with other tools on detecting complete plasmids, plasmid contigs, and contigs assembled from CAMI2 simulated data. PLASMe achieved the highest F1-score. After validating PLASMe on data with known labels, we also tested it on real metagenomic and plasmidome data. An examination of some commonly used marker genes shows that PLASMe performs more reliably than other tools.
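The hybrid idea described in the abstract can be illustrated with a minimal Python sketch. This is not PLASMe's code: the function names, the thresholds, the reserved token ids, and the `PC_*` cluster labels are all hypothetical, chosen only to show how a contig might be encoded as protein-cluster tokens, how an alignment-first decision could fall back to a learned score, and how the F1-score used in the comparison is computed.

```python
from typing import Optional, Sequence

# Hypothetical thresholds for illustration only; these are NOT PLASMe's settings.
MIN_IDENTITY = 0.9
MIN_COVERAGE = 0.8
MIN_MODEL_SCORE = 0.5

PAD, UNK = 0, 1  # reserved token ids (an assumed convention)


def encode_contig(protein_clusters: Sequence[Optional[str]],
                  vocab: dict[str, int],
                  max_len: int = 8) -> list[int]:
    """Map each protein's cluster label, in genomic order, to a token id.

    Proteins with no cluster (None) or an unseen cluster become UNK. The
    sequence is truncated or right-padded with PAD to exactly max_len.
    """
    tokens = [UNK if c is None else vocab.get(c, UNK) for c in protein_clusters]
    tokens = tokens[:max_len]
    return tokens + [PAD] * (max_len - len(tokens))


def is_plasmid(identity: float, coverage: float, model_score: float) -> bool:
    """Alignment first; fall back to the learned score for diverged contigs."""
    if identity >= MIN_IDENTITY and coverage >= MIN_COVERAGE:
        return True  # close match to a known plasmid
    return model_score >= MIN_MODEL_SCORE  # score from a learned classifier


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall; 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In this sketch the token sequence returned by `encode_contig` is what a Transformer classifier would consume, and `model_score` stands in for that classifier's output probability.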
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10450166 | PMC |
http://dx.doi.org/10.1093/nar/gkad578 | DOI Listing |
Environ Microbiol
January 2025
Department of Civil, Environmental, and Geo-Engineering, University of Minnesota, Minneapolis, Minnesota, USA.
Shotgun and proximity-ligation metagenomic sequencing were used to generate thousands of metagenome-assembled genomes (MAGs) from the untreated wastewater, activated sludge bioreactors, and anaerobic digesters of two full-scale municipal wastewater treatment facilities. Analysis of the antibiotic resistance genes (ARGs) in the pool of contigs from the shotgun metagenomic sequences revealed significantly different relative abundances and types of ARGs in the untreated wastewater compared to the activated sludge bioreactors or the anaerobic digesters (p < 0.05).
NAR Genom Bioinform
March 2025
Departments of Medicine and Pediatrics, Division of Infectious Diseases and Global Health, University of California San Francisco School of Medicine, 550 16th Street, 4th Floor Mission Hall, San Francisco, CA, 94158, USA.
Whole genome sequencing (WGS) is pivotal for the molecular characterization of Chlamydia trachomatis, the leading bacterial cause of sexually transmitted infections and infectious blindness worldwide. WGS can inform epidemiologic, public health, and outbreak investigations of these human-restricted pathogens. However, challenges persist in generating high-quality genomes for downstream analyses, given the organism's obligate intracellular nature and the difficulty of propagating it.
Life (Basel)
December 2024
Department of Biomedical Sciences and Biomedical Engineering, Faculty of Medicine, Prince of Songkla University, Songkhla 90110, Thailand.
strains S3W10 and SS15, isolated from shrimp ponds, exhibit potential probiotic benefits for aquaculture. In this study, the genomic features of S3W10 and SS15 were thoroughly characterized to evaluate their probiotic properties and safety for aquaculture use. The genomes of S3W10 and SS15 consist of 130 and 74 contigs, with sizes of 4.
Access Microbiol
November 2024
Biosciences, University of Exeter, Exeter, UK.
This Technical Resource describes genome sequencing data for 61 isolates of the bacterial pathogen pv. collected from and crops between 2010 and 2021 in Serbia. We present the raw sequencing reads and annotated contig-level genome assemblies and determine the races of ten isolates.
J Hazard Mater
December 2024
Key Laboratory of Northwest Water Resource, Environment and Ecology, MOE, Xi'an University of Architecture and Technology, Xi'an 710055, China; Shaanxi Key Laboratory of Environmental Engineering, Xi'an University of Architecture and Technology, Xi'an 710055, China.
The proliferation and dissemination of antibiotic resistance genes (ARGs) in source water reservoirs may pose a threat to human health. This study investigated the antibiotic resistance in stratified reservoirs in China across different seasons and spatial locations. In total, 120 ARG subtypes belonging to 15 ARG types were detected with an abundance ranging from 171.