Deciphering gene expression patterns using large-scale transcriptomic data and its applications.

Brief Bioinform

School of Mathematics and Statistics, Henan University, Jinming Avenue, 475004, Kaifeng, China.

Published: September 2024

Gene expression varies stochastically across genders, racial groups, and health statuses. Deciphering these patterns is crucial for identifying informative genes, classifying samples, and understanding diseases such as cancer. This study analyzes the expression patterns of 19,156 genes in 11,252 bulk RNA-seq samples (10,512 cancer tissue samples and 740 normal samples) and additionally examines 4,884 single-cell RNA-seq samples. Statistical analysis using 16 probability distributions shows that normal samples display a wider range of distributions than cancer samples. Cancer samples tend to favor asymmetric distributions such as the generalized extreme value, log-normal, and Gaussian mixture distributions, whereas certain genes in normal samples exhibit symmetric distributions. Remarkably, more than 95.5% of genes exhibit non-Gaussian (non-normal) distributions, which challenges the traditional assumption of normality. Furthermore, distributions differ significantly between bulk and single-cell RNA-seq data. Many cancer driver genes exhibit distinct distribution patterns across sample types, suggesting that distribution characteristics could serve as a basis for gene selection and sample classification. A novel skewness-based metric is proposed to quantify how distributions vary across datasets; genes with significant skewness differences are shown to have biological relevance. Finally, an improved naïve Bayes method that incorporates gene-specific distributions outperforms traditional methods in simulations. This work enhances understanding of gene expression and its application in omics-based gene selection and sample classification.
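As a rough illustration of the kind of analysis the abstract describes, the sketch below fits a few candidate distributions to one gene's expression values and compares skewness between two sample groups. It is not the authors' implementation: the three SciPy candidates stand in for the 16 distributions used in the study, selecting the best fit by AIC is an assumption, and the absolute difference of sample skewness is only a hypothetical stand-in for the paper's skewness-based metric, whose exact definition is not given here. The function names `best_fit` and `skewness_difference` are likewise illustrative.

```python
import numpy as np
from scipy import stats

# Hypothetical candidate set: a small stand-in for the 16 distributions in the study.
CANDIDATES = {
    "normal": stats.norm,
    "lognormal": stats.lognorm,
    "gev": stats.genextreme,
}


def best_fit(values):
    """Fit each candidate by maximum likelihood and return (name, AIC) of the
    candidate with the lowest AIC. AIC-based selection is an assumption."""
    values = np.asarray(values, dtype=float)
    best = None
    for name, dist in CANDIDATES.items():
        params = dist.fit(values)  # MLE estimates (shape..., loc, scale)
        loglik = np.sum(dist.logpdf(values, *params))
        aic = 2 * len(params) - 2 * loglik
        # Skip fits whose support excludes some data points (log-likelihood = -inf).
        if np.isfinite(aic) and (best is None or aic < best[1]):
            best = (name, aic)
    return best


def skewness_difference(group_a, group_b):
    """Absolute difference in sample skewness between two groups.
    Illustrative only; not the metric defined in the paper."""
    return abs(stats.skew(group_a) - stats.skew(group_b))


if __name__ == "__main__":
    # Simulated expression values for one gene (not real data).
    rng = np.random.default_rng(0)
    cancer_like = rng.lognormal(mean=2.0, sigma=1.0, size=500)   # right-skewed
    normal_like = rng.normal(loc=8.0, scale=1.0, size=500)       # symmetric
    print(best_fit(cancer_like))
    print(skewness_difference(cancer_like, normal_like))
```

In a real workflow the same two functions would be applied gene by gene to an expression matrix, with the selected distribution family then available as the class-conditional density in a naïve Bayes classifier.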

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11562847
DOI: http://dx.doi.org/10.1093/bib/bbae590
