Background: Quantification of gene expression from RNA-seq data is a prerequisite for transcriptome analyses such as differential gene expression analysis and gene co-expression network construction. Individual RNA-seq experiments are growing larger, and combining multiple experiments from sequence repositories can yield datasets with thousands of samples. Processing hundreds to thousands of RNA-seq samples raises challenges related to data management, access to sufficient computational resources, navigation of high-performance computing (HPC) systems, installation of required software dependencies, and reproducibility. Processing of larger and deeper RNA-seq experiments will become more common as sequencing technology matures.

Results: GEMmaker is an nf-core compliant Nextflow workflow that quantifies gene expression from small to massive RNA-seq datasets. GEMmaker ensures that results are highly reproducible through the use of versioned, containerized software that can be executed on a single workstation, an institutional compute cluster, a Kubernetes platform, or the cloud. GEMmaker supports popular alignment and quantification tools and provides results in raw and normalized formats. GEMmaker is unique in that it can scale to process thousands of locally or remotely stored samples without exceeding available data storage.

Conclusions: Workflows that quantify gene expression are not new, and many already address issues of portability, reusability, and scale in terms of access to CPUs. GEMmaker provides these benefits and adds the ability to scale when data storage infrastructure is limited. This allows users to process hundreds to thousands of RNA-seq samples even when storage resources are constrained. GEMmaker is freely available and fully documented with step-by-step setup and execution instructions.
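
The abstract does not describe how storage use is kept bounded, so the following Python sketch shows only one general way such a constraint could be met: samples are handled in small batches, and each batch's intermediate files are deleted before the next batch starts. It is not GEMmaker's implementation. The names process_samples, fake_quantify, and batch_size are hypothetical, and fake_quantify is a stand-in for the real download, alignment, and counting steps.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Dict, Iterable


def process_samples(
    sample_ids: Iterable[str],
    quantify: Callable[[str, Path], float],
    results_dir: Path,
    batch_size: int = 2,
) -> Dict[str, Path]:
    """Quantify samples batch by batch, removing intermediates after each batch.

    Hypothetical sketch: at most `batch_size` samples have intermediate
    files on disk at any time; only the small result files persist.
    """
    results_dir.mkdir(parents=True, exist_ok=True)
    outputs: Dict[str, Path] = {}
    ids = list(sample_ids)
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        # Scratch space that lives only as long as this batch.
        scratch = Path(tempfile.mkdtemp(prefix="batch_"))
        try:
            for sample_id in batch:
                work_dir = scratch / sample_id
                work_dir.mkdir()
                value = quantify(sample_id, work_dir)
                out_file = results_dir / f"{sample_id}.txt"
                out_file.write_text(f"{sample_id}\t{value}\n")
                outputs[sample_id] = out_file
        finally:
            # Free the storage used by this batch before starting the next.
            shutil.rmtree(scratch, ignore_errors=True)
    return outputs


def fake_quantify(sample_id: str, work_dir: Path) -> float:
    """Placeholder for real processing; writes a throwaway intermediate file."""
    (work_dir / "reads.tmp").write_bytes(b"N" * 1024)
    return float(len(sample_id))
```

Calling process_samples(["S1", "S2", "S3"], fake_quantify, Path("results")) would write one small result file per sample while never holding intermediates for more than two samples at once. Peak scratch usage then depends on the batch size rather than on the total number of samples.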

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9063052 (PMC)
http://dx.doi.org/10.1186/s12859-022-04629-7 (DOI)

Similar Publications

Sequence analysis of the 5' region of the chymotrypsin C (CTRC) gene in chronic pancreatitis.

Pancreatology

January 2025

Center for Gastroenterology, Department of Medicine, Albert Szent-Györgyi Medical School, University of Szeged, Szeged, Hungary; Hungarian Centre of Excellence for Molecular Medicine - University of Szeged, Translational Pancreatology Research Group, Szeged, Hungary.

Background/objectives: Loss-of-function chymotrypsin C (CTRC) variants increase the risk for chronic pancreatitis (CP) by reducing protective pancreatic CTRC activity. Variants in the 5' upstream region that includes the promoter might affect CTRC expression but have not been investigated to date. The aim of the present study was to address this knowledge gap.

Personalized treatment approaches in hepatocellular carcinoma.

Arab J Gastroenterol

January 2025

Endemic Medicine Department, Faculty of Medicine, Helwan University, Cairo, Egypt; Liver Disease Research Center, College of Medicine, King Saud University, Riyadh 11411, Saudi Arabia.

Personalized medicine is an emerging field that provides novel approaches to the early diagnosis, prevention, treatment, and prognosis of disease based on each patient's gene expression, environmental factors, lifestyle, and diet. Hepatocellular carcinoma (HCC) remains a significant global health burden, with increasing incidence and high death rates despite advances in surveillance, diagnosis, and therapy. The majority of HCC lesions develop in patients with liver cirrhosis, who carry mortality risks associated with both the tumor burden and the cirrhosis.

Reconciliation of wheat 660 K and 90 K SNP arrays and their utilization in dough rheological properties of bread wheat.

J Adv Res

January 2025

Agronomy College / National Key Laboratory of Wheat and Maize Crop Science, Henan Agricultural University, Zhengzhou 450046, China.

Introduction: High-density Wheat 660 K and 90 K SNP arrays are powerful tools for understanding the genetic basis of wheat traits. However, their physical positions are inconsistent because different versions of the Chinese Spring genome were used during array development, which causes confusion and hinders further application.

Objective: With the rapid development of wheat genome sequencing, we aim to reconcile the Wheat 660 K and 90 K SNP arrays in a modern cultivar and reveal the genetic basis of dough rheological properties in bread wheat.

Phylogenetic analysis and homology modelling of a new Cry8A crystal protein expressed in a sporulating soil bacterium.

J Struct Biol

January 2025

Postgraduate Program in Industrial Biotechnology, Tiradentes University, Aracaju, Sergipe, Brazil; Department of Morphology, Federal University of Sergipe, São Cristóvão, Sergipe, Brazil.

Cry proteins, commonly found in gram-positive soil bacteria, are used worldwide as aerial sprays or in transgenic plants for controlling crop pest populations and insect vectors. Based on PCR analysis, a spore-producing soil isolate (BV5) was suspected to carry a Cry gene. The partial nucleotide sequence of the amplified PCR fragment showed homology with the Cry8 genes present in GenBank.

Matrix-mediated activation of murine fibroblast-like synoviocytes.

Exp Cell Res

January 2025

Translational Matrix Biology, University of Cologne, Medical Faculty, Cologne, Germany.

Fibroblast-like synoviocytes (FLS) are key cells promoting cartilage damage and bone loss in rheumatoid arthritis (RA). They are activated to assume an invasive and migratory phenotype. While mechanisms of FLS activation are unknown, evidence suggests that pre-damaged extracellular matrix (ECM) of the cartilage can trigger FLS activation.
