Motivation: Sanger sequencing of taxonomic marker genes (e.g. 16S/18S/ITS/rpoB/cpn60) represents the leading method for identifying a wide range of microorganisms including bacteria, archaea, and fungi. However, the manual processing of sequence data and limitations associated with conventional BLAST searches impede the efficient generation of strain libraries essential for cataloging microbial diversity and discovering novel species.

Results: isolateR addresses these challenges by implementing a standardized and scalable three-step pipeline that includes: (1) automated batch processing of Sanger sequence files, (2) taxonomic classification via global alignment to type strain databases in accordance with the latest international nomenclature standards, and (3) straightforward creation of strain libraries and handling of clonal isolates, with the ability to set customizable sequence dereplication thresholds and combine data from multiple sequencing runs into a single library. The tool's user-friendly design also features interactive HTML outputs that simplify data exploration and analysis. Additionally, in silico benchmarking on two comprehensive human gut genome catalogues (IMGG and Hadza hunter-gatherer populations) showcases the proficiency of isolateR in uncovering and cataloging microbial diversity, and supports a more targeted and granular exploration within individual hosts to achieve the highest possible strain-level resolution when generating culture collections.

Availability and Implementation: isolateR is available at https://github.com/bdaisley/isolateR.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11254302
DOI: http://dx.doi.org/10.1093/bioinformatics/btae448

Similar Publications

High clinical utility of long-read sequencing for precise diagnosis of congenital adrenal hyperplasia in 322 probands.

Hum Genomics

January 2025

Department of Endocrine and Metabolic Diseases, Children's Hospital of Chongqing Medical University, National Clinical Research Center for Child Health and Disorders, Ministry of Education Key Laboratory of Child Development and Disorders, Chongqing, China.

Background: The molecular genetic diagnosis of congenital adrenal hyperplasia (CAH) is very challenging due to the high homology between the CYP21A2 gene and its pseudogene CYP21A1P.

Methodology: This study aims to assess the clinical efficacy of targeted long-read sequencing (T-LRS) by comparing it with a control method based on the combined assay (NGS, Multiplex ligation-dependent probe amplification and Sanger sequencing) and to introduce T-LRS as a first-tier diagnostic test for suspected CAH patients to improve the precise diagnosis of CAH.

Results: A large cohort of 562 participants, including 322 probands and 240 family members, was enrolled for the retrospective (96 probands) and prospective (226 probands) studies.

Background: Head and neck squamous cell carcinoma (HNSCC), a highly invasive malignancy with a poor prognosis, is one of the most common cancers globally. Circular RNAs (circRNAs) have become key regulators of human malignancies, but further studies are necessary to fully understand their functions and possible causes in HNSCC.

Methods: CircCCT2 expression levels in HNSCC tissues and cells were measured via qPCR.

Background: Clear cell renal cell carcinoma (ccRCC) is a prevalent and aggressive malignancy, with the von Hippel-Lindau (VHL) gene playing a critical role in its pathogenesis. However, the association between VHL gene variants and sporadic ccRCC risk remains unexplored in the Indian population. This study aimed to investigate the somatic and germline variants of the VHL gene in sporadic ccRCC patients from West Bengal, India, and their association with disease risk and clinicopathological parameters.

Study of the Prevalence of Human Papillomavirus Genotypes in Jeddah, Saudi Arabia.

J Epidemiol Glob Health

January 2025

Special Infectious Agents Unit-BSL3, King Fahd Medical Research Center, King Abdulaziz University, Jeddah, 21589, Saudi Arabia.

Human papillomavirus (HPV), a common sexually transmitted infection, includes over 200 types, some linked to genital warts and various cancers, including cervical, anal, penile, and oropharyngeal cancers. In Saudi Arabia, an estimated 10.7 million women aged 15 years and older are at risk of HPV-related cervical cancer.

Background: Familial hypercholesterolemia (FH), also referred to as familial hyperlipidemia, is an autosomal genetic disorder that includes a heterozygous form (HeFH). HeFH is mainly caused by mutations in the LDLR, APOB, and PCSK9 genes and is characterized by elevated plasma low-density lipoprotein cholesterol levels.
