Epistemological issues in omics and high-dimensional biology: give the people what they want.

Physiol Genomics

Department of Biostatistics, Section on Statistical Genetics, University of Alabama at Birmingham, Birmingham, Alabama 35294, USA.

Published: December 2006

Gene expression microarrays have been the vanguard of new analytic approaches in high-dimensional biology. Draft sequences of several genomes coupled with new technologies allow study of the influences and responses of entire genomes rather than isolated genes. This has opened a new realm of high-dimensional biology where questions involve multiplicity at unprecedented scales: thousands of genetic polymorphisms, gene expression levels, protein measurements, genetic sequences, or any combination of these and their interactions. Such situations demand creative approaches to the processes of inference, estimation, prediction, classification, and study design. Although bench scientists intuitively grasp the need for flexibility in the inferential process, the elaboration of formal supporting statistical frameworks has only just begun. Here, we will discuss some of the unique statistical challenges facing investigators studying high-dimensional biology, describe some approaches being developed by statistical scientists, and offer an epistemological framework for the validation of proffered statistical procedures. A key theme will be the challenge in providing methods that a statistician judges to be sound and a biologist finds informative. The shift from family-wise error rate control to false discovery rate estimation and to assessment of ranking and other forms of stability will be portrayed as illustrative of approaches to this challenge.
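To make the contrast between family-wise error rate control and false discovery rate approaches concrete, the following is a minimal sketch in Python. It is not taken from the article: it swaps in two standard textbook procedures, the Bonferroni correction (family-wise error rate) and the Benjamini-Hochberg step-up procedure (false discovery rate control rather than estimation). The function names and the p-values are hypothetical and chosen only for illustration.

```python
import numpy as np


def bonferroni_reject(pvals, alpha=0.05):
    """Family-wise error rate control: reject only if p <= alpha / m."""
    p = np.asarray(pvals, dtype=float)
    return p <= alpha / p.size


def benjamini_hochberg_reject(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up procedure for false discovery rate control."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)                      # indices of p-values, ascending
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds             # rank i meets alpha * i / m?
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()         # largest rank meeting its threshold
        reject[order[:k + 1]] = True           # reject that rank and all smaller p
    return reject


# Hypothetical p-values standing in for ten gene-level tests (illustration only).
pvals = [0.001, 0.008, 0.012, 0.015, 0.02, 0.3, 0.5, 0.7, 0.9, 0.95]

n_bonferroni = int(bonferroni_reject(pvals).sum())
n_bh = int(benjamini_hochberg_reject(pvals).sum())
print("Bonferroni rejections:", n_bonferroni)
print("Benjamini-Hochberg rejections:", n_bh)
```

On these made-up values the Bonferroni rule rejects fewer hypotheses than the Benjamini-Hochberg rule. That is the kind of trade-off the abstract alludes to: a criterion a statistician can defend versus a list of findings long enough for a biologist to follow up.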


Source
http://dx.doi.org/10.1152/physiolgenomics.00095.2006


Similar Publications

HIV infection implicates a spectrum of tissues in the human body starting with viral transmission in the anogenital tract and subsequently persisting in lymphoid tissues and brain. Though studies using isolated cells have contributed significantly towards our understanding of HIV infection, the tissue microenvironment is characterised by a complex interplay of a range of factors, all of which can influence the course of infection but are otherwise missed in ex vivo studies. To address this knowledge gap, it is necessary to investigate the dynamics of infection and the host immune response in situ using imaging-based approaches.


Manifold learning techniques have emerged as crucial tools for uncovering latent patterns in high-dimensional single-cell data. However, most existing dimensionality reduction methods primarily rely on 2D visualization, which can distort true data relationships and fail to extract reliable biological information. Here, we present DTNE (diffusive topology neighbor embedding), a dimensionality reduction framework that faithfully approximates manifold distance to enhance cellular relationships and dynamics.


Advances in next-generation sequencing technology have enabled the high-throughput profiling of metagenomes and accelerated microbiome studies. Recently, there has been a rise in quantitative studies that aim to decipher the microbiome co-occurrence network and its underlying community structure based on metagenomic sequence data. Uncovering the complex microbiome community structure is essential to understanding the role of the microbiome in disease progression and susceptibility.


Single-cell RNA sequencing (scRNA-seq) offers remarkable insights into cellular development and differentiation by capturing the gene expression profiles of individual cells. The role of dimensionality reduction and visualization in the interpretation of scRNA-seq data has gained wide acceptance. However, current methods face several challenges, including incomplete structure-preserving strategies and high distortion in embeddings, which fail to effectively model complex cell trajectories with multiple branches.


Background: Peripheral nerve sheath tumors (PNSTs) encompass entities with different cellular differentiation and degrees of malignancy. Spatial heterogeneity complicates diagnosis and grading of PNSTs in some cases. In malignant PNST (MPNST), for example, single-cell sequencing data have shown dissimilar differentiation states of tumor cells.

