The Integrative Cluster subtypes (IntClusts) provide a framework for classifying breast cancer tumors into 10 distinct groups based on copy number and gene expression, each with unique biological drivers of disease and clinical prognoses. Gene expression data are often lacking, so accurate classification of samples into IntClusts from copy number data alone is essential. Current classification methods achieve low accuracy when gene expression data are absent, warranting the development of new approaches to IntClust classification. Copy number data from 1980 breast cancer samples from METABRIC were used to train multiclass XGBoost machine learning models (CopyClust). A piecewise constant fit was applied to the average copy number profile of each IntClust, and the unique breakpoints across the 10 profiles were identified and converted into approximately 500 genomic regions used as features for CopyClust. Two modeling approaches were used: a 10-class model, in which a single multiclass model predicts the final IntClust label, and a 6-class model with binary reclassification, in which four pairs of IntClusts were combined for the initial multiclass classification. Performance was validated on the TCGA dataset, with copy number data generated from both SNP array and WES platforms. CopyClust achieved 81% and 79% overall accuracy on the TCGA SNP and WES datasets, respectively, an improvement of nine percentage points or more in overall IntClust subtype classification accuracy. CopyClust significantly improves on current methods in classification accuracy of IntClust subtypes for samples without available gene expression data and is an easily implementable algorithm for IntClust classification of breast cancer samples with copy number data.
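The feature construction and the two-stage prediction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single chromosome with half-open coordinates, and the function names (`regions_from_breakpoints`, `region_features`, `two_stage_predict`), the zero default for uncovered regions, and the stand-in models in the usage lines are all hypothetical.

```python
def regions_from_breakpoints(profile_breakpoints, chrom_length):
    """Merge the breakpoints of several profiles into contiguous regions.

    profile_breakpoints: one list of breakpoint positions per profile.
    Returns half-open (start, end) regions covering [0, chrom_length).
    """
    # Keep each breakpoint once, ignoring positions outside the chromosome.
    cuts = sorted({bp for bps in profile_breakpoints for bp in bps
                   if 0 < bp < chrom_length})
    edges = [0] + cuts + [chrom_length]
    return list(zip(edges[:-1], edges[1:]))


def region_features(segments, regions):
    """Length-weighted mean copy number of one sample within each region.

    segments: list of (start, end, value) tuples, half-open coordinates.
    Regions not covered by any segment get 0.0 (an assumed default).
    """
    features = []
    for r_start, r_end in regions:
        covered = 0
        total = 0.0
        for s_start, s_end, value in segments:
            overlap = min(r_end, s_end) - max(r_start, s_start)
            if overlap > 0:
                covered += overlap
                total += overlap * value
        features.append(total / covered if covered else 0.0)
    return features


def two_stage_predict(features, coarse_model, pair_models):
    """Predict a coarse label, then refine it if it denotes a merged pair.

    coarse_model: callable returning a coarse label for a feature vector.
    pair_models: maps a merged-pair label to a binary callable that
    returns the final label; labels without an entry are already final.
    """
    coarse = coarse_model(features)
    binary = pair_models.get(coarse)
    return binary(features) if binary is not None else coarse


# Toy usage with made-up breakpoints, segments, and stand-in models.
regions = regions_from_breakpoints([[100, 250], [100, 400], [250]], 500)
feats = region_features([(0, 200, 0.5), (200, 500, -1.0)], regions)
coarse = lambda f: "A+B" if f[0] > 0 else "C"
pairs = {"A+B": lambda f: "A" if f[2] > 0 else "B"}
label = two_stage_predict(feats, coarse, pairs)
```

In practice the stand-in callables would be replaced by trained classifiers, and regions would be built per chromosome across the genome; which IntClust pairs are merged in the 6-class model is not stated in the abstract, so the labels above are placeholders.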

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11126405
DOI: http://dx.doi.org/10.1038/s41598-024-62724-6

Publication Analysis

Top Keywords

copy number: 24
breast cancer: 16
gene expression: 16
number data: 16
expression data: 12
classification: 9
machine learning: 8
integrative cluster: 8
classification breast: 8
intclust classification: 8

Similar Publications

Novel antigenic variant strains of the infectious bursal disease virus (IBDV) classified into genogroup A2d have been found in the western part of Japan since 2017. Novel antigenic variant IBDVs now occur in higher frequencies in poultry houses and have been detected in the eastern part of Japan, indicating the spread of IBDVs despite the usual IBDV vaccination. We isolated a novel antigenic variant IBDV, designated as the B2977CE2C3 strain.

Development of Multiplex Assays for the Identification of Zoonotic Species.

Pathogens

December 2024

Intracellular Pathogens Research Laboratory, Comparative Medicine Institute, College of Veterinary Medicine, North Carolina State University, Raleigh, NC 27606, USA.

More than one hundred species that affect animals and humans have been described, eight of which have been associated with emerging and underdiagnosed zoonoses. Most diagnostic studies in humans have used serology or molecular assays based on the 18S rRNA gene. Because the 18S rRNA gene is highly conserved, obtaining an accurate diagnosis at the species level is difficult, particularly when the amplified DNA fragment is small.

Mapping Antimalarial Drug Resistance in Mozambique: A Systematic Review of Genetic Markers Post-ACT Implementation.

Int J Mol Sci

December 2024

Global Health and Tropical Medicine (GHTM), Associate Laboratory in Translation and Innovation Towards Global Health (LA-REAL), Instituto de Higiene e Medicina Tropical (IHMT), Universidade NOVA de Lisboa (UNL), Rua da Junqueira 100, 1349-008 Lisboa, Portugal.

Malaria continues to be a significant public health burden in many tropical and subtropical regions. Mozambique ranks among the top countries affected by malaria, where it is a leading cause of morbidity and mortality, accounting for 29% of all hospital deaths in the general population and 42% of deaths amongst children under five. This review presents a comparative analysis of data on five critical genes associated with antimalarial drug resistance, along with copy number variation (CNV) in two genes.

Clinical and Genetic Characterization of Adolescent-Onset Epilepsy: A Single-Center Experience in Republic of Korea.

Biomedicines

November 2024

Department of Laboratory Medicine, College of Medicine, Jeonbuk National University, Jeonju 54907, Republic of Korea.

Objectives: This study investigated the characteristics of adolescent-onset epilepsy (AOE) and conducted genetic tests on a cohort of 76 Korean patients to identify variants and expand the spectrum of mutations associated with AOE.

Methods: Clinical exome sequencing after routine karyotyping and chromosomal microarray was performed to identify causative variants and expand the spectrum of mutations associated with AOE.

Results: In cases of AOE without neurodevelopmental delay (NDD), this study identified four likely pathogenic variants (LPVs) or variants of uncertain significance (VUS) and two copy number variations (CNVs).

Pan-Cancer Analysis Reveals the Potential of PLOD1 as a Prognostic and Immune Biomarker for Human Cancer.

Biomedicines

November 2024

Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Department of Urology, Peking University Cancer Hospital & Institute, Beijing 100089, China.

Procollagen-lysine, 2-oxoglutarate 5-dioxygenase 1 (PLOD1) is known as an enhancer of collagen fiber deposition and cross-linking stability. However, there is limited information on its function in tumors. In this study, we aimed to elucidate the function and potential mechanism of action of PLOD1 across cancers.
