The bench scientist's guide to statistical analysis of RNA-Seq data.

BMC Res Notes

USDA ARS Global Change and Photosynthesis Research Unit, 1201 W. Gregory Drive, Urbana, IL 61801, USA.

Published: September 2012

Background: RNA sequencing (RNA-Seq) is emerging as a highly accurate method to quantify transcript abundance. However, analyses of the large data sets obtained by sequencing the entire transcriptome of organisms have generally been performed by bioinformatics specialists. Here we provide a step-by-step guide and outline a strategy using currently available statistical tools that results in a conservative list of differentially expressed genes. We also discuss potential sources of error in RNA-Seq analysis that could alter interpretation of global changes in gene expression.

Findings: When comparing statistical tools, the negative binomial distribution-based methods edgeR and DESeq identified 11,995 and 11,317 differentially expressed genes, respectively, from an RNA-Seq dataset generated from soybean leaf tissue grown in elevated O3. However, only 10,535 genes were common to both methods, leaving 2,242 genes that were determined to be differentially expressed by only one method. Analysis of the non-significant genes revealed several limitations of these analytic tools, including evidence for overly stringent parameters for determining statistical significance of differentially expressed genes as well as increased type II error for high abundance transcripts.
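The single-method figure follows from the three counts reported above: each tool's total minus the shared genes gives the genes unique to that tool, and the two unique counts sum to 2,242. The short Python sketch below only reproduces that arithmetic. The function name `exclusive_counts` is illustrative and does not come from the paper.

```python
def exclusive_counts(n_a, n_b, n_common):
    """Return (only in A, only in B, in exactly one of A or B)."""
    only_a = n_a - n_common
    only_b = n_b - n_common
    return only_a, only_b, only_a + only_b


# Counts reported in the abstract: edgeR, DESeq, and genes common to both.
only_edger, only_deseq, one_method = exclusive_counts(11_995, 11_317, 10_535)
print(only_edger, only_deseq, one_method)  # 1460 782 2242
```

Under these counts, edgeR contributes roughly two thirds of the genes called by a single method.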

Conclusions: Because of the high variability between methods for determining differential expression of RNA-Seq data, we suggest using several bioinformatics tools, as outlined here, to ensure that a conservative list of differentially expressed genes is obtained. We also conclude that despite these analytical limitations, RNA-Seq provides highly accurate transcript abundance quantification that is comparable to qRT-PCR.
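One way to read the recommendation above is to keep only the genes that every tool calls significant. The Python sketch below illustrates that idea under stated assumptions: the gene identifiers and adjusted p-values are made up, the 0.05 cutoff is an assumed threshold rather than one given in the abstract, and `conservative_list` is a hypothetical helper, not the authors' pipeline.

```python
def conservative_list(results_by_tool, alpha=0.05):
    """Return the genes called significant by every tool.

    results_by_tool maps a tool name to a dict of gene -> adjusted p-value.
    alpha is an assumed significance cutoff.
    """
    significant_sets = [
        {gene for gene, padj in results.items() if padj < alpha}
        for results in results_by_tool.values()
    ]
    if not significant_sets:
        return set()
    return set.intersection(*significant_sets)


# Hypothetical adjusted p-values, for illustration only.
edger = {"geneA": 0.001, "geneB": 0.03, "geneC": 0.20}
deseq = {"geneA": 0.004, "geneB": 0.08, "geneC": 0.01}

consensus = conservative_list({"edgeR": edger, "DESeq": deseq})
print(sorted(consensus))  # ['geneA']
```

Taking the intersection trades sensitivity for specificity: genes such as geneB and geneC, which only one tool supports, are left out of the final list.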

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3522531
DOI: http://dx.doi.org/10.1186/1756-0500-5-506

Similar Publications

Aim: This study aimed to identify the genes associated with the development of lung adenocarcinoma (LUAD) and potential therapeutic targets.

Methods: Differentially expressed genes (DEGs) were identified by in-house transcriptome sequencing of tumor tissues and paracancerous tissues resected during surgery, and were combined with The Cancer Genome Atlas (TCGA) data to screen for genes associated with LUAD prognosis. Expression was validated at the mRNA and protein levels, and gene knockdown was used to examine the effects on lung cancer cells and the underlying mechanisms.

The immune system has emerged as a major factor in the pathogenesis of Alzheimer's disease (AD). PANoptosis is a newly defined programmed cell death mechanism related to many inflammatory diseases. This study aimed to identify the differentially expressed (DE) PANoptosis-related genes with characteristics of immune dysregulation (PRGIDs) in AD using bioinformatics analysis of bulk RNA-seq and single-nuclei RNA sequencing (snRNA-seq) data.

As an abiotic stress factor, salinity significantly affects the physiological activities of crustaceans. In this study, transcriptome sequencing was used to evaluate the mechanism of ion transport and the physiological response of black tiger shrimp (Penaeus monodon) under low salt stress. Four hundred post larval (PL) stage P.

Comparative transcriptome and metabolome analysis of sweet potato (Ipomoea batatas (L.) Lam.) tuber development.

Front Plant Sci

January 2025

Institute of Food Crops, Hainan Academy of Agricultural Sciences/Hainan Key Laboratory of Crop Genetics and Breeding, Haikou, China.

Introduction: Sweet potato is an important food, feed and industrial raw material, and its tubers are rich in starch, carotenoids and anthocyanins.

Methods: To elucidate the gene expression regulation and metabolic characteristics during the development of sweet potato tubers, transcriptomic and metabolomic analyses were performed on the tubers of three different sweet potato varieties at three developmental stages (70, 100, and 130 days (d)).

Results: RNA-seq analysis revealed that 16,303 differentially expressed genes (DEGs) were divided into 12 clusters according to their expression patterns, and the pathways of each cluster were annotated.

Introduction: Estrogen deficiency is associated with a range of diseases, notably postmenopausal osteoporosis (PMO) and Parkinson's disease (PD). PMO and PD may share underlying molecular mechanisms that are pivotal in their development and progression. The objective of this study was to identify critical genes and potential mechanisms associated with PMO by examining co-expressed genes linked to PD.
