Reproducible acquisition, management and meta-analysis of nucleotide sequence (meta)data using q2-fondue.

Bioinformatics

Laboratory of Food Systems Biotechnology, Institute of Food, Nutrition, and Health, ETH Zürich, Zürich 8092, Switzerland.

Published: November 2022

Motivation: The volume of public nucleotide sequence data has blossomed over the past two decades and is ripe for re- and meta-analyses to enable novel discoveries. However, reproducible re-use and management of sequence datasets and associated metadata remain critical challenges. We created the open-source Python package q2-fondue to enable user-friendly acquisition, re-use and management of public sequence (meta)data while adhering to open data principles.

Results: q2-fondue allows fully provenance-tracked programmatic access to and management of data from the NCBI Sequence Read Archive (SRA). Unlike other packages allowing download of sequence data from the SRA, q2-fondue enables full data provenance tracking from data download to final visualization, integrates with the QIIME 2 ecosystem, prevents data loss upon space exhaustion and allows download of (meta)data given a publication library. To highlight its manifold capabilities, we present executable demonstrations using publicly available amplicon, whole genome and metagenome datasets.
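As a rough illustration of the preparation step that typically precedes a programmatic SRA download, the sketch below screens a list of accession IDs and writes them to a one-column, tab-separated table. It does not call q2-fondue itself. The accession IDs are placeholders, the helper names (`split_accessions`, `to_id_table`) are hypothetical, and both the `id` column header and the accepted accession patterns are assumptions that should be checked against the q2-fondue usage tutorials.

```python
import csv
import io
import re

# Assumed patterns: run-level IDs (SRR/ERR/DRR) and BioProject IDs
# (PRJNA/PRJEB/PRJDB). Other accession types are deliberately not covered here.
ACCESSION_PATTERN = re.compile(r"^(?:[SED]RR\d+|PRJ(?:NA|EB|DB)\d+)$")


def split_accessions(candidates):
    """Return (valid, invalid) lists, de-duplicated with input order preserved."""
    valid, invalid = [], []
    seen = set()
    for raw in candidates:
        acc = raw.strip()
        if not acc or acc in seen:
            continue
        seen.add(acc)
        if ACCESSION_PATTERN.match(acc):
            valid.append(acc)
        else:
            invalid.append(acc)
    return valid, invalid


def to_id_table(accessions):
    """Serialize accessions as a one-column TSV with an assumed 'id' header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["id"])
    for acc in accessions:
        writer.writerow([acc])
    return buffer.getvalue()


# Placeholder accession IDs, for illustration only.
valid, invalid = split_accessions(
    ["SRR1234567", "PRJEB12345", " ERR7654321 ", "SRR1234567", "not-an-id"]
)
table = to_id_table(valid)
```

Screening the IDs before any download keeps malformed entries out of the request and leaves a plain-text record of exactly which accessions were asked for, which fits the provenance-tracking goal described above.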

Availability and Implementation: q2-fondue is available as an open-source BSD-3-licensed Python package at https://github.com/bokulich-lab/q2-fondue. Usage tutorials are available in the same repository. All Jupyter notebooks used in this article are available at https://github.com/bokulich-lab/q2-fondue-examples.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9665871
DOI: http://dx.doi.org/10.1093/bioinformatics/btac639


