Power and sample size estimation in microarray studies.

BMC Bioinformatics

Division of Personalized Nutrition and Medicine, National Center for Toxicological Research, FDA, Jefferson, AR 72079, USA.

Published: January 2010

Background: Before conducting a microarray experiment, an investigator must determine the number of arrays required to have adequate power to identify differentially expressed genes. This paper discusses crucial issues in problem formulation, parameter specification, and the approaches commonly proposed for sample size estimation in microarray experiments. Common methods define the required sample size as the minimum needed to achieve, on average, a specified sensitivity (the proportion of truly differentially expressed genes that are detected) at a specified false discovery rate (FDR) level and a specified expected proportion (pi1) of truly differentially expressed genes on the array. Unfortunately, the probability of attaining the specified sensitivity under such a formulation can be low. We instead formulate the sample size problem as the number of arrays needed to achieve a specified sensitivity with 95% probability at the specified significance level. We propose a permutation method that uses a small pilot dataset to estimate the sample size. This method accounts for correlation and effect size heterogeneity among genes.
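The two formulations contrasted here can be written side by side. The notation below is an assumption introduced for illustration and does not come from the paper: S_n is the sensitivity obtained with n arrays per group, m_1 is the number of truly differentially expressed genes, D_g indicates that gene g is detected, and lambda is the target sensitivity.

```latex
% Sensitivity with n arrays per group (notation assumed for illustration)
S_n = \frac{1}{m_1} \sum_{g=1}^{m_1} D_g, \qquad D_g \in \{0, 1\}

% Common formulation: target sensitivity attained on average
n_{\mathrm{avg}} = \min \{\, n : \mathbb{E}[S_n] \ge \lambda \,\}

% Formulation recommended in this paper: target attained with 95% probability
n_{95} = \min \{\, n : \Pr(S_n \ge \lambda) \ge 0.95 \,\}

% By linearity of expectation, the mean depends only on per-gene detection probabilities
\mathbb{E}[S_n] = \frac{1}{m_1} \sum_{g=1}^{m_1} \Pr(D_g = 1)
```

The last line holds whatever the dependence among genes, whereas the distribution of S_n around its mean, and therefore n_95, does depend on how the genes are correlated.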

Results: A sample size estimate based on the common formulation, which aims to achieve the desired sensitivity on average, can be calculated using a univariate method without taking the correlation among genes into consideration. This formulation of the sample size problem is inadequate because the probability of attaining the specified sensitivity can be lower than 50%. In contrast, the sample size calculated by the proposed permutation method ensures that at least the desired sensitivity is attained with 95% probability. The method is shown to perform well on a real example dataset using a small pilot dataset with 4-6 samples per group.
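The gap between the two criteria can be illustrated with a small Monte Carlo sketch. This is not the paper's permutation method and uses no pilot data. It assumes a normal approximation with unit-variance genes, one common standardized effect size, equicorrelated test statistics, and a fixed per-gene significance level. All function names and parameter values (`n_de`, `delta`, `rho`, `alpha`) are hypothetical choices made for illustration.

```python
import math
import random
from statistics import NormalDist, mean


def simulate_sensitivity(n_per_group, n_de=50, delta=1.0, rho=0.5,
                         alpha=0.001, n_rep=500, seed=0):
    """Return one simulated sensitivity per replicate experiment.

    Assumptions (all hypothetical): unit-variance genes, a common
    standardized effect `delta`, and equicorrelated z statistics.
    """
    rng = random.Random(seed)
    crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    shift = delta * math.sqrt(n_per_group / 2)   # mean of z for a DE gene
    a, b = math.sqrt(rho), math.sqrt(1 - rho)
    sens = []
    for _ in range(n_rep):
        shared = rng.gauss(0, 1)                 # factor common to all genes
        hits = sum(abs(shift + a * shared + b * rng.gauss(0, 1)) > crit
                   for _ in range(n_de))
        sens.append(hits / n_de)
    return sens


def summarize(sens, target):
    """Average sensitivity and the fraction of replicates reaching `target`."""
    return mean(sens), sum(s >= target for s in sens) / len(sens)


def smallest_n(criterion, target=0.8, prob=0.95, n_max=100, **kwargs):
    """Smallest n per group meeting the 'average' or 'probability' criterion."""
    for n in range(2, n_max + 1):
        avg, p_reach = summarize(simulate_sensitivity(n, **kwargs), target)
        if criterion == "average" and avg >= target:
            return n
        if criterion == "probability" and p_reach >= prob:
            return n
    return None


if __name__ == "__main__":
    n_avg = smallest_n("average")
    n_prob = smallest_n("probability")
    avg, p_reach = summarize(simulate_sensitivity(n_avg), 0.8)
    print(f"average criterion: n = {n_avg} per group")
    print(f"  at that n: mean sensitivity {avg:.2f}, "
          f"P(sensitivity >= 0.8) = {p_reach:.2f}")
    print(f"95% probability criterion: n = {n_prob} per group")
```

Under these assumed settings, the sample size that meets the target on average leaves a substantial chance of falling short in any single experiment, and the 95% probability criterion calls for more arrays. How large the gap is depends on the assumed correlation and effect sizes, which is the information the paper's method draws from a pilot dataset.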

Conclusions: We recommend that the sample size problem be formulated to detect a specified proportion of differentially expressed genes with 95% probability. This formulation ensures that the desired proportion of true positives is found with high probability. The proposed permutation method takes the correlation structure and effect size heterogeneity into consideration and works well using only a small pilot dataset.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2837028
DOI: http://dx.doi.org/10.1186/1471-2105-11-48
