High-throughput sequencing can produce hundreds of thousands of 16S rRNA sequence reads corresponding to the different organisms present in environmental samples. Typically, bioinformatic analysis of microbial diversity starts with pre-processing, followed by clustering the 16S rRNA reads into a relatively small number of operational taxonomic units (OTUs). OTUs are reliable indicators of microbial diversity and greatly reduce downstream analysis time. However, existing hierarchical clustering algorithms, which are generally more accurate than greedy heuristic algorithms, struggle with large sequence datasets. To keep pace with the rapid rise in sequencing data, we present CLUSTOM-CLOUD, the first distributed sequence clustering program based on In-Memory Data Grid (IMDG) technology, a distributed data structure that stores all data in the main memory of multiple computing nodes. IMDG technology lets CLUSTOM-CLOUD handle larger datasets and scale computationally better than its predecessor, CLUSTOM, while maintaining high accuracy. The clustering speed of CLUSTOM-CLOUD was evaluated on published 16S rRNA human microbiome sequence datasets using a small laboratory cluster (10 nodes) and the Amazon EC2 cloud-computing environment. In the laboratory environment, it required only ~3 hours to process a dataset of 200 K reads, regardless of the complexity of the human microbiome data. On Amazon EC2, one million reads were processed in approximately 20, 14, and 11 hours using 20, 30, and 40 nodes, respectively. The running time evaluation indicates that CLUSTOM-CLOUD can handle much larger sequence datasets than CLUSTOM and is a scalable distributed processing system. A comparative accuracy test using 16S rRNA pyrosequences of a mock community shows that CLUSTOM-CLOUD achieves higher accuracy than DOTUR, mothur, ESPRIT-Tree, UCLUST and Swarm.
CLUSTOM-CLOUD is written in JAVA and is freely available at http://clustomcloud.kopri.re.kr.
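The scalability claim can be checked with simple arithmetic on the running times reported above for one million reads on Amazon EC2. The following is a minimal sketch, not part of CLUSTOM-CLOUD: the names `reported_hours` and `scaling_table` are illustrative assumptions, and because the abstract gives only approximate hours, the resulting figures are rough.

```python
# Approximate running times from the abstract for one million reads on Amazon EC2.
# Keys are node counts, values are hours (rounded figures as reported).
reported_hours = {20: 20.0, 30: 14.0, 40: 11.0}


def scaling_table(hours_by_nodes):
    """Return (nodes, speedup, ideal_speedup, efficiency) rows.

    Speedup is measured relative to the smallest node count in the input,
    and efficiency is the observed speedup divided by the ideal (linear) one.
    """
    base_nodes = min(hours_by_nodes)
    base_hours = hours_by_nodes[base_nodes]
    rows = []
    for nodes in sorted(hours_by_nodes):
        speedup = base_hours / hours_by_nodes[nodes]
        ideal = nodes / base_nodes
        rows.append((nodes, speedup, ideal, speedup / ideal))
    return rows


for nodes, speedup, ideal, efficiency in scaling_table(reported_hours):
    print(f"{nodes} nodes: speedup {speedup:.2f}x "
          f"(ideal {ideal:.2f}x), efficiency {efficiency:.0%}")
```

Relative to the 20-node run, the reported times correspond to roughly 95% of ideal linear speedup at 30 nodes and roughly 91% at 40 nodes, which is consistent with the abstract's description of the system as scalable.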
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4783016 | PMC |
| http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0151064 | PLOS |
Biomed Res Int
January 2025
Center for Personalized Nanomedicine, Australian Institute for Bioengineering & Nanotechnology (AIBN), The University of Queensland, Brisbane, Queensland, Australia.
Environmental pollution has been a significant concern for the last few years. The leather industry contributes significantly to the economy but is one of Bangladesh's most prominent polluting industries. It is also responsible for several severe diseases among leather workers, such as cancer, lung disease, and heart disease, because the bleaching agents and other chemicals they use have numerous adverse effects on human health.
J Inflamm Res
January 2025
Department of Geriatric Respiratory and Critical Care, The First Affiliated Hospital of Anhui Medical University, Anhui Geriatric Institute, Hefei, Anhui, People's Republic of China.
Aim: We sought to investigate the impact of CpG oligodeoxynucleotides (CpG-ODN) administration on the lung and gut microbiota in asthmatic mice, specifically focusing on changes in composition, diversity, and abundance, and to elucidate the microbial mechanisms underlying the therapeutic effects of CpG-ODN and identify potential beneficial bacteria indicative of its efficacy.
Methods: HE staining was used to analyze inflammation in lung, colon and small intestine tissues. High-throughput sequencing technology targeting 16S rRNA was employed to analyze the composition, diversity, and correlation of the microbiome in the lung, colon and small intestine of the control, model and CpG-ODN administration groups.
Biodivers Data J
January 2025
Dynafor, INRAE, INP, ENSAT, 31326 Castanet Tolosan, France.
Background: DNA barcoding and metabarcoding are now powerful tools for studying biodiversity, especially for the accurate identification of large sample collections belonging to diverse taxonomic groups. Their success depends largely on the taxonomic resolution of the DNA sequences used as barcodes and on the reliability of the reference databases. For wild bees, barcode sequence coverage is steadily growing in volume, but some incorrect species annotations need to be addressed.
Cancer Manag Res
January 2025
Lung Cancer Center, West China Hospital, Sichuan University, Chengdu, People's Republic of China.
Objective: Our research has pinpointed the gut microbiome's role in the progression of various pathological types of non-small cell lung cancer (NSCLC). Nonetheless, the characteristics of the gut microbiome and its metabolites across different clinical stages of NSCLC are yet to be fully understood. The current study seeks to explore the distinctive gut flora and metabolite profiles of NSCLC patients across varying TNM stages.
Front Antibiot
February 2024
Department of Chemistry, Bioscience and Environmental Engineering, Faculty of Science and Technology, University of Stavanger, Stavanger, Norway.
Wastewater treatment plants receive low concentrations of antibiotics. Residual concentrations of antibiotics in the effluent may accelerate the development of antibiotic resistance in the receiving environments. Monitoring of antimicrobial resistance genes (ARGs) in countries with strict regulation of antibiotic use is important in gaining knowledge of how effective these policies are in preventing the emergence of ARGs or whether other strategies are required, for example, at-source treatment of hospital effluents.