Distance-based differential analysis of gene curves.

Bioinformatics

Statistics Section, Department of Mathematics, Imperial College London, London, UK.

Published: November 2011

AI Article Synopsis

  • Time course gene expression experiments analyze changes in mRNA levels across many genes, using statistical methods from functional data analysis (FDA) to model these time-varying profiles as smooth curves.
  • A new test statistic is introduced to identify significant differences between gene expression curves, relying on a distance measure and extending traditional MANOVA tests for vectorial data.
  • The visual l₂ distance is highlighted for its ability to detect shape-related differences in curves while being robust against time shifts, and the method has been applied to study immune cell responses to bacterial infections over time.

Article Abstract

Motivation: Time course gene expression experiments are performed to study time-varying changes in mRNA levels of thousands of genes. Statistical methods from functional data analysis (FDA) have recently gained popularity for modelling and exploring such time courses. Each temporal profile is treated as the realization of a smooth function of time, or curve, and the inferred curve becomes the basic unit of statistical analysis. The task of identifying genes with differential temporal profiles then consists of detecting statistically significant differences between curves, where such differences are commonly quantified by computing the area between the curves or the l₂ distance.
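As an illustration of the two quantities named above, the following is a minimal sketch, assuming each curve has already been evaluated on a common time grid. The function names (`trapezoid`, `l2_distance`, `area_between`) and the sine-wave toy curves are hypothetical and are not taken from the article. The example also shows how a pure time shift of an otherwise identical curve produces a non-zero L₂ distance.

```python
import numpy as np

def trapezoid(y, t):
    """Trapezoidal-rule integral of samples y over the grid t."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(t)) / 2.0)

def l2_distance(f, g, t):
    """L2 distance: square root of the integrated squared difference."""
    return float(np.sqrt(trapezoid((f - g) ** 2, t)))

def area_between(f, g, t):
    """Area between two curves: integral of the absolute difference."""
    return trapezoid(np.abs(f - g), t)

# Toy curves (assumed for illustration): one profile and a time-shifted copy.
t = np.linspace(0.0, 1.0, 201)
f = np.sin(2 * np.pi * t)
g_shift = np.sin(2 * np.pi * (t - 0.1))  # same shape, shifted in time

d_same = l2_distance(f, f, t)         # identical curves
d_shift = l2_distance(f, g_shift, t)  # non-zero purely because of the shift
a_shift = area_between(f, g_shift, t)
```

In practice the curves would first be estimated from noisy, possibly irregular time points (for example by spline smoothing) before being evaluated on a shared grid.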

Results: We propose a general test statistic for detecting differences between gene curves, which only depends on a suitably chosen distance measure between them. The test makes use of a distance-based variance decomposition and generalizes traditional MANOVA tests commonly used for vectorial observations. We also introduce the visual l₂ distance, which is shown to capture shape-related differences in gene curves and is robust against time shifts, which would otherwise inflate the traditional l₂ distance. Other shape-related distances, such as the curvature, may carry biological significance. We have assessed the comparative performance of the test on realistically simulated datasets and applied it to human immune cell responses to bacterial infection over time.
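The idea of a test that depends only on pairwise distances can be sketched as follows. This is a generic pseudo-F statistic built from a distance-based decomposition of total variation into within-group and between-group parts, with a permutation p-value; it is an assumption made for illustration and not necessarily the exact statistic or null distribution used in the article. The names `pseudo_f` and `permutation_pvalue`, the simulated curves, and the plain L₂-type distance are all hypothetical; any other distance matrix between curves could be substituted.

```python
import numpy as np

def pseudo_f(D, labels):
    """Pseudo-F statistic computed from a symmetric distance matrix D."""
    D2 = np.asarray(D, dtype=float) ** 2
    labels = np.asarray(labels)
    n = len(labels)
    groups = np.unique(labels)
    # The full matrix counts every pair twice, hence the factor 2.
    ss_total = D2.sum() / (2.0 * n)
    ss_within = 0.0
    for g in groups:
        idx = np.where(labels == g)[0]
        ss_within += D2[np.ix_(idx, idx)].sum() / (2.0 * len(idx))
    ss_between = ss_total - ss_within
    return (ss_between / (len(groups) - 1)) / (ss_within / (n - len(groups)))

def permutation_pvalue(D, labels, n_perm=999, seed=0):
    """Observed pseudo-F and a p-value from random relabelling of the curves."""
    rng = np.random.default_rng(seed)
    observed = pseudo_f(D, labels)
    count = 0
    for _ in range(n_perm):
        if pseudo_f(D, rng.permutation(labels)) >= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)

# Simulated toy data (assumed): two groups of six noisy curves with different shapes.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 50)
group_a = np.sin(2 * np.pi * t) + 0.1 * rng.standard_normal((6, 50))
group_b = np.sin(4 * np.pi * t) + 0.1 * rng.standard_normal((6, 50))
curves = np.vstack([group_a, group_b])
labels = np.array([0] * 6 + [1] * 6)

# Pairwise L2-type distances via a Riemann-sum approximation on the uniform grid.
diff = curves[:, None, :] - curves[None, :, :]
D = np.sqrt((diff ** 2).sum(axis=2) * (t[1] - t[0]))

f_obs, p_value = permutation_pvalue(D, labels)
```

With Euclidean distances between ordinary vectors this statistic reduces to the classical ANOVA F-ratio, which is the sense in which a distance-based decomposition generalizes tests for vectorial observations.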

Source
http://dx.doi.org/10.1093/bioinformatics/btr528

Publication Analysis

Top Keywords

  • gene curves: 12
  • differences gene: 8
  • l₂ distance: 8
  • curves: 5
  • time: 5
  • distance-based differential: 4
  • differential analysis: 4
  • gene: 4
  • analysis gene: 4
  • curves motivation: 4

Similar Publications

Screening of obstructive sleep apnea and diabetes mellitus-related biomarkers based on integrated bioinformatics analysis and machine learning.

Sleep Breath

January 2025

Department of Respiratory and Critical Care Medicine, Medical School of Nantong University, Nantong Key Laboratory of Respiratory Medicine, Affiliated Hospital of Nantong University, Nantong, 226001, China.

Background: The pathophysiology linking obstructive sleep apnea (OSA) and diabetes mellitus (DM) is still unknown, despite clinical reports associating the two conditions. By investigating potential roles for DM-related genes in the pathophysiology of OSA, we aim to clarify the molecular significance of the condition. Machine learning is a useful approach for analyzing complex gene expression data to find biomarkers for the diagnosis of OSA.

Integrating machine learning with mendelian randomization for unveiling causal gene networks in glioblastoma multiforme.

Discov Oncol

January 2025

Department of Medical Imaging, Shenzhen Longhua District Key Laboratory of Neuroimaging, Shenzhen Longhua District Central Hospital, Shenzhen, 518110, China.

Background: Glioblastoma multiforme (GBM) is a highly aggressive brain cancer with poor prognosis and limited treatment options. Despite advances in understanding its molecular mechanisms, effective therapeutic strategies remain elusive due to the tumor's genetic complexity and heterogeneity.

Methods: This study employed a comprehensive analysis approach integrating 113 machine learning algorithms with Mendelian Randomization (MR) analysis to investigate the molecular underpinnings of GBM.

Accurate identification and quantification of 5-hydroxymethylcytosine (5hmC) can help elucidate its function in gene expression and disease pathogenesis. Current 5hmC analysis methods still present challenges, especially for clinical applications, such as having a risk of false-positive results and a lack of sufficient sensitivity. Herein, a 5hmC quantification method for fragment-specific DNA sequences with extreme specificity, high sensitivity, and clinical applicability was established using a quantitative real-time PCR (qPCR)-based workflow through the combination of enzymatic digestion and biological deamination strategy (EDD-5hmC assay).

is a significant healthcare-associated pathogen, notable for its diverse virulence and antibiotic resistance profiles. This study aimed to characterize the genotypic and phenotypic diversity of isolates and evaluate their virulence using the model. Biomass production, metabolic activity, capsule formation, and siderophore production were assessed in 27 .

Objective: To examine the role and diagnostic potential of miR-421 in prostate cancer (PCa).

Methods: Expression data and clinical information for miR-421 were obtained from the TCGA and Genotype-Tissue Expression (GTEx) databases. Experimental validation was performed at the cellular, blood, and tissue levels to confirm miR-421 expression and its association with clinicopathological features.

