iSeq: an integrated tool to fetch public sequencing data.

Haoyu Chao, Zhuojin Li, Dijun Chen, Ming Chen

Bioinformatics

Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou 310058, China.

Published: November 2024

Motivation: High-throughput sequencing technologies [next-generation sequencing (NGS)] are increasingly used to address diverse biological questions. Despite the rich information in NGS data, particularly with the growing datasets from repositories like the Genome Sequence Archive (GSA) at NGDC, programmatic access to public sequencing data and metadata remains limited.

Results: We developed iSeq to enable quick and straightforward retrieval of metadata and NGS data from multiple databases via the command-line interface. iSeq supports simultaneous retrieval from GSA, SRA, ENA, and DDBJ databases. It handles over 25 different accession formats, supports Aspera downloads, parallel downloads, multi-threaded processes, FASTQ file merging, and integrity verification, simplifying data acquisition and enhancing the capacity for reanalyzing NGS data.
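As an illustration of what handling multiple accession formats involves, the following minimal Python sketch routes an accession to its likely source archive by prefix. It is not iSeq's implementation: the function name `classify_accession` and the pattern table are assumptions made for this example, and the table covers only a subset of the well-known project, sample, study, experiment, and run prefixes used by SRA, ENA, DDBJ, and GSA.

```python
import re

# Illustrative subset of accession prefixes per archive (assumed table,
# not taken from iSeq). Each pattern is a known prefix followed by digits.
_PATTERNS = [
    (re.compile(r"^(PRJNA|SAMN|SR[APXSR])\d+$"), "SRA"),
    (re.compile(r"^(PRJEB|SAMEA|ER[APXSR])\d+$"), "ENA"),
    (re.compile(r"^(PRJDB|SAMD|DR[APXSR])\d+$"), "DDBJ"),
    (re.compile(r"^(PRJCA|SAMC|CR[AXR])\d+$"), "GSA"),
]


def classify_accession(accession: str) -> str:
    """Return the archive name suggested by the accession prefix.

    Raises ValueError when the prefix is not in the illustrative table.
    """
    normalized = accession.strip().upper()
    for pattern, archive in _PATTERNS:
        if pattern.match(normalized):
            return archive
    raise ValueError(f"Unrecognized accession format: {accession!r}")


if __name__ == "__main__":
    # Accessions below only illustrate the formats.
    for acc in ["SRR1234567", "ERX000001", "DRP000001", "CRR311377"]:
        print(acc, "->", classify_accession(acc))
```

A prefix lookup like this only identifies the originating archive. Resolving a project or sample accession to its individual runs would still require querying that archive's metadata service.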

Availability and implementation: iSeq is freely available on Bioconda (https://anaconda.org/bioconda/iseq) and GitHub (https://github.com/BioOmics/iSeq).

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11561040
DOI: http://dx.doi.org/10.1093/bioinformatics/btae641
