Background: Current methods used for annotating metagenomics shotgun sequencing (MGS) data rely on a computationally intensive and low-stringency approach of mapping each read to a generic database of proteins or reference microbial genomes.
Results: We developed MGS-Fast, an analysis approach for shotgun whole-genome metagenomic data that uses Bowtie2 DNA-DNA alignment of reads to the integrated catalog of reference genes database, a collection of well-annotated genes compiled from human microbiome data. This method is rapid and provides high-stringency matches (>90% DNA sequence identity) of the metagenomics reads to genes with annotated functions.
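The >90% identity criterion can be illustrated with a minimal sketch that filters SAM records such as those Bowtie2 writes. This is not the MGS-Fast implementation: the function names `percent_identity` and `passes_identity` are hypothetical, and identity is assumed here to be one minus the edit distance (the `NM:i` tag) divided by the number of alignment columns, which is only one of several common definitions.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def percent_identity(sam_line):
    """Return percent identity for one SAM record, or None if it cannot be computed.

    Assumption: identity = (alignment columns - NM) / alignment columns,
    where alignment columns count M, =, X, I and D operations.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag, cigar = int(fields[1]), fields[5]
    # Flag bit 0x4 marks an unaligned read; such records carry no CIGAR.
    if flag & 0x4 or cigar == "*":
        return None
    columns = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "M=XID")
    # NM:i holds the edit distance (mismatches plus inserted and deleted bases).
    nm = None
    for tag in fields[11:]:
        if tag.startswith("NM:i:"):
            nm = int(tag[5:])
            break
    if nm is None or columns == 0:
        return None
    return 100.0 * (columns - nm) / columns


def passes_identity(sam_line, threshold=90.0):
    """True if the record aligns with identity strictly above the threshold."""
    identity = percent_identity(sam_line)
    return identity is not None and identity > threshold
```

Applied line by line to the non-header records of a SAM file, `passes_identity` would keep only reads whose alignment to a catalog gene exceeds the chosen threshold; the default of 90.0 mirrors the figure quoted above.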
The availability of low-cost small-factor sequencers, such as the Illumina MiSeq, MiniSeq, or iSeq, has paved the way for democratizing genomics sequencing, providing researchers in minority universities with access to technology that was previously affordable only to institutions with large core facilities. However, these instruments are not bundled with software for performing bioinformatics data analysis, and the data analysis can be the main bottleneck for independent laboratories or even small clinical facilities that consider adopting genomic sequencing for medical applications. To address this issue, we have developed miCloud, a bioinformatics platform that enables genomic data analysis through a fully featured data analysis cloud, which seamlessly integrates with genome sequencers over the local network.
Processing of next-generation sequencing (NGS) data requires significant technical skills, involving installation, configuration, and execution of bioinformatics data pipelines, in addition to specialized postanalysis visualization and data mining software. In order to address some of these challenges, developers have leveraged virtualization containers toward seamless deployment of preconfigured bioinformatics software and pipelines on any computational platform. We present an approach for abstracting the complex data operations of multistep, bioinformatics pipelines for NGS data analysis.