Bayesian model accounting for within-class biological variability in Serial Analysis of Gene Expression (SAGE).

BMC Bioinformatics

Statistics Department, Instituto de Matemática e Estatística - Universidade de São Paulo, Rua do Matão 1010, 05508-090 São Paulo, Brazil.

Published: August 2004

Background: An important challenge for transcript counting methods such as Serial Analysis of Gene Expression (SAGE), "Digital Northern" or Massively Parallel Signature Sequencing (MPSS) is to carry out statistical analyses that account for within-class variability, i.e., variability due to intrinsic biological differences among sampled individuals of the same class, and not only variability due to technical sampling error.

Results: We introduce a Bayesian model that accounts for within-class variability by means of a mixture distribution. We show that the previously available approaches, aggregation in pools ("pseudo-libraries") and the Beta-Binomial model, are particular cases of the mixture model. We illustrate our method with a brain tumor vs. normal comparison using SAGE data from public databases. We show examples of tags that are regarded as differentially expressed with high significance if the within-class variability is ignored, but that are clearly less significant once it is accounted for.
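To make the relationship between pooling and the Beta-Binomial model concrete, the following is a minimal sketch in Python, not the authors' R scripts. It assumes each library's tag count is Binomial given a library-specific abundance, and that abundances within a class follow a Beta distribution parameterized by a mean and a precision. The counts, library sizes, function names and the precision value are all hypothetical, chosen only for illustration. Under these assumptions, letting the precision grow very large forces every library to share one abundance, which is the situation that pooling into a pseudo-library implicitly assumes.

```python
from math import lgamma, log


def log_choose(n, x):
    """Log of the binomial coefficient C(n, x)."""
    return lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)


def log_beta_fn(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def binomial_logpmf(x, n, p):
    """Log-probability of x tag copies in a library of n tags, abundance p."""
    return log_choose(n, x) + x * log(p) + (n - x) * log(1.0 - p)


def beta_binomial_logpmf(x, n, a, b):
    """Binomial count whose abundance is drawn from Beta(a, b)."""
    return log_choose(n, x) + log_beta_fn(x + a, n - x + b) - log_beta_fn(a, b)


def class_loglik_common_p(counts, sizes):
    """Log-likelihood when all libraries share the pooled abundance
    (no within-class variability)."""
    p = sum(counts) / sum(sizes)
    return sum(binomial_logpmf(x, n, p) for x, n in zip(counts, sizes))


def class_loglik_beta_binomial(counts, sizes, mean, precision):
    """Log-likelihood when abundances vary across libraries.
    Smaller precision means larger within-class variability."""
    a = mean * precision
    b = (1.0 - mean) * precision
    return sum(beta_binomial_logpmf(x, n, a, b) for x, n in zip(counts, sizes))


# Hypothetical counts of one tag in three libraries of the same class.
counts = [12, 3, 30]
sizes = [50000, 40000, 60000]
pooled_mean = sum(counts) / sum(sizes)

print("common abundance :", class_loglik_common_p(counts, sizes))
# Hypothetical precision value allowing substantial within-class spread.
print("beta-binomial    :", class_loglik_beta_binomial(counts, sizes, pooled_mean, 5000.0))
# Very large precision: the Beta-Binomial approaches the common-abundance case.
print("high precision   :", class_loglik_beta_binomial(counts, sizes, pooled_mean, 1e9))
```

Comparing the first and third printed values shows the limiting behaviour; the second shows how the likelihood changes once library-to-library variation is allowed.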

Conclusion: Using available information about biological replicates, one can transform a list of candidate transcripts showing differential expression into a more reliable one. Our method is freely available, under GPL/GNU copyleft, through a user-friendly web-based online tool or as R language scripts at the supplemental web site.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC517707
DOI: http://dx.doi.org/10.1186/1471-2105-5-119
