Gene selection for microarray data classification via subspace learning and manifold regularization.

Med Biol Eng Comput

Department of Pharmacy, People's Hospital of Lian'shui County, Huai'an, Jiangsu, 223300, People's Republic of China.

Published: July 2018

AI Article Synopsis

  • Rapid advances in DNA microarray technology have produced large volumes of genomic data, which are hard to classify because the number of genes far exceeds the number of samples.
  • A new gene selection method is introduced to eliminate irrelevant and redundant genes, enhancing the classification process by focusing on a more relevant subset of genes.
  • The proposed technique is framed as a manifold regularized subspace learning problem, utilizing a projection matrix for dimensionality reduction while maintaining the original data's local structure, achieving superior results in classification tasks compared to existing methods.

Article Abstract

With the rapid development of DNA microarray technology, a large amount of genomic data has been generated. Classifying these microarray data is a challenging task, since gene expression data often contain thousands of genes but only a small number of samples. In this paper, an effective gene selection method is proposed that selects the best subset of genes for microarray data by removing irrelevant and redundant genes. Compared with the original data, the selected gene subset can benefit the classification task. We formulate the gene selection task as a manifold regularized subspace learning problem. In detail, a projection matrix is used to project the original high-dimensional microarray data into a lower-dimensional subspace, with the constraint that the original genes can be well represented by the selected genes. Meanwhile, the local manifold structure of the original data is preserved by a graph Laplacian regularization term on the low-dimensional data space. The projection matrix can serve as an indicator of the importance of different genes. An iterative update algorithm is developed to solve the problem. Experimental results on six publicly available microarray datasets and one clinical dataset demonstrate that the proposed method outperforms other state-of-the-art methods in microarray data classification.
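The abstract does not state the objective function or the update rules, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' algorithm. It assumes an objective of the form ||X - XWH||_F^2 + alpha * tr((XW)^T L (XW)) + beta * ||W||_F^2, where X is the samples-by-genes matrix, W is the projection matrix, H reconstructs the genes from the subspace, and L is the Laplacian of a k-nearest-neighbor graph over samples. The ridge term, the alternating least-squares and gradient updates, and the names `knn_laplacian`, `objective`, `select_genes`, `alpha`, and `beta` are all hypothetical choices made for illustration.

```python
import numpy as np


def knn_laplacian(X, n_neighbors=5):
    """Unnormalized Laplacian L = D - S of a symmetrized kNN graph over samples."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(dist, np.inf)  # a sample is never its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :n_neighbors]
    S = np.zeros((n, n))
    S[np.repeat(np.arange(n), n_neighbors), nearest.ravel()] = 1.0
    S = np.maximum(S, S.T)  # make the graph undirected
    return np.diag(S.sum(axis=1)) - S


def objective(X, W, H, L, alpha, beta):
    """Assumed objective: reconstruction + manifold term + ridge penalty."""
    Z = X @ W  # low-dimensional representation of the samples
    recon = np.linalg.norm(X - Z @ H) ** 2
    manifold = np.trace(Z.T @ L @ Z)
    return recon + alpha * manifold + beta * np.linalg.norm(W) ** 2


def select_genes(X, n_components=3, n_selected=10, alpha=0.1, beta=0.01,
                 n_iter=100, seed=0):
    """Alternate a least-squares step for H with a gradient step for W,
    then rank genes by the row norms of W (an assumed importance score)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    L = knn_laplacian(X)
    G = X.T @ X
    M = alpha * (X.T @ L @ X) + beta * np.eye(d)
    W = rng.standard_normal((d, n_components)) / np.sqrt(d)
    history = []
    for _ in range(n_iter):
        # H step: exact minimizer of ||X - (XW)H||_F^2 for the current W.
        H = np.linalg.lstsq(X @ W, X, rcond=None)[0]
        B = H @ H.T
        # W step: gradient descent with step 1/Lipschitz, which cannot
        # increase the objective for the current H.
        grad = 2.0 * (G @ W @ B - G @ H.T + M @ W)
        lipschitz = 2.0 * (np.linalg.norm(G, 2) * np.linalg.norm(B, 2)
                           + np.linalg.norm(M, 2))
        W = W - grad / lipschitz
        history.append(objective(X, W, H, L, alpha, beta))
    scores = np.linalg.norm(W, axis=1)  # one score per gene
    selected = np.argsort(-scores)[:n_selected]
    return selected, scores, history
```

Calling `select_genes(X)` on a standardized samples-by-genes array returns the indices of the top-ranked genes, the per-gene scores, and the objective value after each iteration. The sketch forms dense gene-by-gene matrices, so it is suited only to small illustrative inputs; real microarray data with thousands of genes would need a more economical formulation.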


Source
http://dx.doi.org/10.1007/s11517-017-1751-6

Publication Analysis

Top Keywords

  • microarray data: 20
  • gene selection: 12
  • data: 10
  • data classification: 8
  • subspace learning: 8
  • original data: 8
  • projection matrix: 8
  • graphical abstract: 8
  • microarray: 7
  • genes: 6

