Identification of phenotype-relevant differentially expressed genes in breast cancer demonstrates enhanced quantile discretization protocol's utility in multi-platform microarray data integration.

J Bioinform Comput Biol

South African National Bioinformatics Institute/MRC, Unit for Bioinformatics Capacity Development, University of the Western Cape, Private Bag X17, Bellville 7535, South Africa.

Published: October 2016

AI Article Synopsis

  • Microarray experiments in transcriptomics face challenges due to small sample sizes, leading to limited statistical power and potential misinterpretation of expression data.
  • Quantile discretization (QD) helps normalize and combine data from different experiments, but improper selection of bin numbers can obscure true correlations by lumping different expression levels into the same category.
  • The proposed procedure for optimizing bin numbers in dataset analysis revealed previously unnoticed tumorigenesis-related genes and cancer biomarkers in public breast cancer datasets, highlighting the importance of accurate data handling for biomedical research.

Article Abstract

Microarray experiments for transcriptomics often suffer from limited statistical power due to small sample sizes. Quantile discretization (QD) maps the expression values for a sample into a series of equally sized 'bins' that represent a discrete numerical range, e.g. [Formula: see text]4 to [Formula: see text]4, which enables normalized data from multiple experiments and/or expression platforms to be combined for re-analysis. We found, however, that informal selection of bin numbers often resulted in loss of the underlying correlation structure in the data, because genes that are in reality expressed at significantly different levels within a sample were assigned the same numerical value. Here we report a procedure for determining an optimal bin number for a dataset. Applying this to integrated public breast cancer datasets enabled statistical identification of several differentially expressed tumorigenesis-related genes that were not found when analyzing the individual datasets, and also of several cancer biomarkers not previously indicated as having utility in the disease. Notably, differential modulation of translational control and protein synthesis via multiple pathways was found to potentially have a central role in breast cancer development and progression. These findings suggest that our protocol has significant utility in making meaningful novel biomedical discoveries by leveraging the large public expression data repositories.
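The binning step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and it does not reproduce their procedure for choosing the optimal bin number, which the abstract does not specify. The function name `quantile_discretize`, the rank-based assignment, and the zero-centered integer labels are assumptions made for illustration.

```python
import numpy as np

def quantile_discretize(values, n_bins):
    """Map one sample's expression values to equally populated integer bins.

    Hypothetical helper: labels are centered on zero (an assumption), so
    n_bins=9 yields labels from -4 to 4.
    """
    values = np.asarray(values, dtype=float)
    # Rank the genes within the sample (0 = lowest expression).
    # A stable sort keeps tied values in their input order.
    order = np.argsort(values, kind="stable")
    ranks = np.argsort(order, kind="stable")
    # Scale ranks to bin indices 0 .. n_bins-1 so that every bin
    # receives approximately the same number of genes.
    bins = (ranks * n_bins) // len(values)
    # Shift the labels so that the middle bin is labeled 0.
    return bins - (n_bins - 1) // 2

sample = [5.0, 1.0, 3.0, 9.0, 7.0, 2.0, 8.0, 4.0, 6.0]
coarse = quantile_discretize(sample, 3)  # three genes share each label
fine = quantile_discretize(sample, 9)    # every gene keeps a distinct label
```

With three bins, the values 1.0, 2.0 and 3.0 all receive the same label, which is the kind of information loss the abstract attributes to a poorly chosen bin number. With nine bins, each of the nine values keeps a distinct label.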

Source
http://dx.doi.org/10.1142/S0219720016500220

