High-throughput processing and normalization of one-color microarrays for transcriptional meta-analyses.

BMC Bioinformatics

Arthritis and Clinical Immunology Research Program, Oklahoma Medical Research Foundation 825 NE 13th Street, Oklahoma City, Oklahoma 73104-5005, USA.

Published: October 2011

Background: Microarray experiments are becoming increasingly common in biomedical research, as is their deposition in publicly accessible repositories such as Gene Expression Omnibus (GEO). Accordingly, interest has surged in using these microarray data for meta-analytic approaches, whether to increase sample size for a more powerful analysis of a specific disease (e.g. lung cancer) or to re-examine experiments for purposes other than those of the original study that generated them. For the average biomedical researcher, several practical barriers stand in the way of such meta-analyses, including manually aggregating, filtering, and formatting the data. Methods that automatically process large repositories of microarray data into a standardized, directly comparable format will enable easier and more reliable access to these data for meta-analyses.

Methods: We present a straightforward, outlier-robust method for automatic quality control and pre-processing of tens of thousands of single-channel microarray data files. GEO GDS files are quality-checked by comparing parametric distributions and then quantile-normalized to enable direct comparison of expression levels in subsequent meta-analyses.
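The quantile normalization step mentioned in the Methods can be illustrated with a minimal sketch. The abstract does not specify how the authors handle ties, missing values, or log transformation, so the function below is only a generic textbook version and not the authors' implementation. The function name `quantile_normalize` and the toy matrix are assumptions made for illustration.

```python
import numpy as np


def quantile_normalize(matrix):
    """Quantile-normalize the columns (samples) of a genes x samples matrix.

    Generic sketch: assumes no missing values; ties within a column are
    broken by order of appearance rather than averaged.
    """
    matrix = np.asarray(matrix, dtype=float)
    # 0-based rank of every value within its own column.
    order = np.argsort(matrix, axis=0, kind="stable")
    ranks = np.argsort(order, axis=0, kind="stable")
    # Reference distribution: mean across samples of the sorted values at each rank.
    reference = np.sort(matrix, axis=0).mean(axis=1)
    # Replace every value by the reference value at its rank.
    return reference[ranks]


# Toy data (hypothetical): 4 genes measured in 3 samples.
expr = np.array([
    [5.0, 4.0, 3.0],
    [2.0, 1.0, 4.0],
    [3.0, 4.0, 6.0],
    [4.0, 2.0, 8.0],
])
normalized = quantile_normalize(expr)
print(normalized)
```

After this transformation every sample shares the same set of values, so expression levels can be compared directly across experiments, which is the property the Methods rely on.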

Results: We processed 13,000 human one-color experiments into a single gene expression matrix from which subsets can be extracted to conduct meta-analyses. Interestingly, in a global meta-analysis of gene-gene co-expression patterns across all 13,000 experiments to predict gene function, normalization yielded minimal improvement over the raw data.
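As a rough illustration of the two operations described in the Results, the sketch below extracts a subset of experiments from a genes x experiments matrix and computes gene-gene co-expression as Pearson correlation. The abstract does not state which co-expression measure the authors used, so Pearson correlation is an assumption here, as are the function names, the sample identifiers, and the toy values.

```python
import numpy as np


def extract_subset(matrix, sample_ids, wanted):
    """Return the columns of a genes x experiments matrix whose IDs are in `wanted`."""
    index = {sample_id: i for i, sample_id in enumerate(sample_ids)}
    columns = [index[sample_id] for sample_id in wanted]
    return matrix[:, columns]


def coexpression(matrix):
    """Pearson correlation between every pair of genes (rows) across experiments (columns)."""
    return np.corrcoef(np.asarray(matrix, dtype=float))


# Toy matrix (hypothetical): 3 genes x 4 experiments.
sample_ids = ["S1", "S2", "S3", "S4"]  # placeholder labels, not real GEO accessions
expr = np.array([
    [1.0, 2.0, 3.0, 4.0],  # gene A
    [2.0, 4.0, 6.0, 8.0],  # gene B, rises with gene A
    [4.0, 3.0, 2.0, 1.0],  # gene C, falls as gene A rises
])

corr = coexpression(expr)                             # 3 x 3 gene-gene matrix
subset = extract_subset(expr, sample_ids, ["S1", "S3"])  # 3 genes x 2 experiments
print(corr)
print(subset)
```

Running the same correlation on raw and on normalized matrices is one simple way to compare how much normalization changes the co-expression picture.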

Conclusions: Normalization of microarray data appears to be of minimal importance for analyses based on co-expression patterns when the sample size is on the order of thousands of microarray datasets. Smaller subsets, however, are more prone to aberrations and artefacts, and an effective means of automating normalization procedures not only empowers meta-analytic approaches but also aids reproducibility by providing a standard way of approaching the problem.

Data availability: A matrix containing normalized expression of 20,813 genes across 13,000 experiments is available for download at . Source code for pre-processing GDS files is available from the authors upon request.


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3236842
DOI: http://dx.doi.org/10.1186/1471-2105-12-S10-S2

Publication Analysis

Top Keywords

microarray data: 20
gene expression: 8
meta-analytic approaches: 8
sample size: 8
gds files: 8
co-expression patterns: 8
13000 experiments: 8
microarray: 7
data: 6
experiments: 5

Similar Publications

A stable combination of non-stable genes outperforms standard reference genes for RT-qPCR data normalization.

Sci Rep

December 2024

Laboratoire de Recherche en Sciences Végétales, Equipe Génomique et Biotechnologie des Fruits, UMR 5546, CNRS, UPS, Toulouse INP, Université de Toulouse, Toulouse, France.

Gene expression profiling is of key importance in all domains of the life sciences, such as medicine, environment, and plants, for both basic and applied research. Despite the emergence of microarrays and high-throughput sequencing, qPCR remains a standard method for gene expression analyses, with its data normalization step being crucial for ensuring accuracy. Currently, the most widely used normalization method is based on reference genes, which are assumed to be stably expressed across all experimental conditions.


CD47 interacts with signal regulatory protein alpha (SIRPα) on macrophages to deliver an anti-phagocytic signal, enabling tumor cells to evade immune destruction. This study explores the relationship between CD47 and SIRPα expression and key clinical prognostic factors, microvascular density (MVD), and tumor-infiltrating lymphocytes (TIL) in Diffuse Large B Cell Lymphoma (DLBCL) cases. We analyzed tissue samples from 122 DLBCL cases using tissue microarray (TMA) blocks and immunohistochemical staining for CD47, SIRPα, CD31, and CD3.


A Neutrophil Extracellular Traps-Related Signature Predicts Clinical Outcomes and Identifies Immune Landscape in Ovarian Cancer.

J Cell Mol Med

December 2024

Department of Gynecology, School of Medicine, Shanghai First Maternity and Infant Hospital, Tongji University, Shanghai, China.

Ovarian cancer (OvCa) is the most lethal gynaecological malignancy worldwide. Neutrophil extracellular traps (NETs), net-like protein structures produced by activated neutrophils and DNA-histone complexes, have a central role in tumours, though they have not been fully explored in OvCa. We obtained transcriptome data from the TCGA-OvCa database (n = 376) for training, the ICGC-OvCa database (n = 111) for validation, and the GTEx database (n = 180) as controls.


Therapeutic role of aripiprazole in cartilage defects explored through a drug repurposing approach.

Sci Rep

December 2024

Department of Orthopaedic Surgery, CHA Bundang Medical Center, CHA University School of Medicine, 335 Pangyo-ro, Bundang-gu, Seongnam-si, Gyeonggi-do, 13488, Republic of Korea.

Articular cartilage has a limited regenerative capacity, resulting in poor spontaneous healing of damaged tissue. Despite various scientific efforts to enhance cartilage repair, no single method has yielded satisfactory results. With rising drug development costs, drug repositioning has emerged as a viable alternative.


AI-driven feature selection and epigenetic pattern analysis: A screening strategy of CpGs validated by pyrosequencing for body fluid identification.

Forensic Sci Int

December 2024

Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou 510515, China.

Identification of body fluid stains at crime scenes is an important task in forensic evidence analysis. Currently, body fluid-specific CpGs detected by DNA methylation microarray screening have been widely studied for forensic body fluid identification. However, some CpGs have limited ability to distinguish certain body fluid types.

