Subspace learning using low-rank latent representation learning and perturbation theorem: Unsupervised gene selection.

Comput Biol Med

Faculty of Engineering, Computing and the Environment, Kingston University, Penrhyn Road Campus, Kingston Upon Thames, London, KT1 2EE, UK.

Published: December 2024

In recent years, gene expression data analysis has become increasingly important in machine learning and computational biology. Microarray gene datasets typically contain more features than samples, which yields an ill-posed, underdetermined system of equations. Redundant features in such high-dimensional data degrade the performance of learning algorithms and increase their computational time. Both feature extraction and feature selection can address this challenge, but feature selection is more interpretable and has therefore received more attention. In this study, we propose an unsupervised feature selection method based on pseudo-label latent representation learning and perturbation theory. In the first step, pseudo labels are constructed using latent representation learning. In the second step, the least squares problem is solved for both the original data matrix and a perturbed data matrix, and features are clustered with k-means based on the similarity between the original and perturbed data matrices. In the last step, the features in each subcluster are ranked by the information gain criterion. To demonstrate the efficacy of the proposed approach, numerical experiments were carried out on six benchmark microarray datasets and two benchmark RNA-sequencing datasets. The results indicate that the proposed method outperforms eight state-of-the-art unsupervised feature selection methods in both clustering accuracy and normalized mutual information.
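The three steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and every concrete choice in it is an assumption made for illustration: a truncated SVD stands in for the latent representation learning that produces pseudo labels, the perturbation is small Gaussian noise, each feature is described by its least squares coefficients from the original and the perturbed problems, and information gain is computed on median-binarized features against hard pseudo labels. The function names (`kmeans`, `entropy`, `information_gain`, `select_features`) and all parameters are hypothetical.

```python
import numpy as np

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row of points."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels

def entropy(labels):
    """Shannon entropy (in bits) of a discrete label vector."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum()

def information_gain(feature, labels):
    """Entropy reduction in labels after splitting the feature at its median."""
    binary = feature > np.median(feature)
    gain = entropy(labels)
    for value in (False, True):
        mask = binary == value
        if mask.any():
            gain -= mask.mean() * entropy(labels[mask])
    return gain

def select_features(X, n_pseudo=3, n_feature_clusters=5, eps=1e-2, seed=0):
    """Return {feature cluster id: feature indices ranked by information gain}."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)

    # Step 1 (assumed stand-in): pseudo labels from a truncated SVD embedding.
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    Y = U[:, :n_pseudo] * S[:n_pseudo]

    # Step 2: least squares on the original and on a perturbed data matrix.
    Xp = Xc + eps * rng.standard_normal(Xc.shape)
    W = np.linalg.lstsq(Xc, Y, rcond=None)[0]    # shape: features x n_pseudo
    Wp = np.linalg.lstsq(Xp, Y, rcond=None)[0]

    # Cluster features by their original and perturbed coefficients (assumed descriptor).
    feature_clusters = kmeans(np.hstack([W, Wp]), n_feature_clusters, seed=seed)

    # Step 3: rank features inside each cluster by information gain
    # against hard pseudo labels obtained by clustering the samples.
    hard_labels = kmeans(Y, n_pseudo, seed=seed)
    gains = np.array([information_gain(X[:, j], hard_labels)
                      for j in range(X.shape[1])])
    ranked = {}
    for c in np.unique(feature_clusters):
        idx = np.flatnonzero(feature_clusters == c)
        ranked[int(c)] = idx[np.argsort(-gains[idx])].tolist()
    return ranked

# Toy usage: 20 samples, 50 features (more features than samples, as in microarray data).
X = np.random.default_rng(1).standard_normal((20, 50))
ranked = select_features(X)
```

Taking the top-ranked feature or features from each cluster would give a subset that covers every feature group, which is one plausible way to limit redundancy among the selected genes.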

Source
http://dx.doi.org/10.1016/j.compbiomed.2024.109567

Publication Analysis

Top Keywords

feature selection: 16
data matrix: 16
latent representation: 12
representation learning: 12
learning perturbation: 8
unsupervised feature: 8
original data: 8
matrix perturbed: 8
perturbed data: 8
data: 6

