Filter feature selection techniques are widely used to mine biomedical data. Recent work has shown that the classical filter method minimal-Redundancy-Maximal-Relevance (mRMR) carries a risk: a specific part of the redundancy, called irrelevant redundancy, may enter its minimal-redundancy component. A few attempts have therefore been made to eliminate the irrelevant redundancy by attaching additional procedures to mRMR, such as Kernel Canonical Correlation Analysis based mRMR (KCCAmRMR). In the present study, a novel filter feature selection method based on the Maximal Information Coefficient (MIC) and Gram-Schmidt Orthogonalization (GSO), named Orthogonal MIC Feature Selection (OMICFS), is proposed to solve this problem. Unlike other improved approaches under the max-relevance and min-redundancy criterion, the proposed method uses the MIC to quantify the relevance between the feature variables and the target variable, uses the GSO to compute the orthogonalized variable of a candidate feature with respect to the previously selected features, and optimizes max-relevance and min-redundancy indirectly by maximizing the MIC relevance between the GSO-orthogonalized variable and the target. This orthogonalization strategy allows OMICFS to exclude the irrelevant redundancy without any additional procedures. To verify its performance, OMICFS was compared with other filter feature selection methods in terms of both classification accuracy and computational efficiency in classification experiments on two types of biomedical datasets. The results showed that OMICFS outperformed the other methods in most cases. In addition, the differences between these methods were analyzed, and the application of OMICFS to the mining of high-dimensional biomedical data was discussed. The Matlab code for the proposed method is available at https://github.com/lhqxinghun/bioinformatics/tree/master/OMICFS/.
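The selection loop described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' Matlab implementation: the function names (`orthogonalize`, `abs_pearson`, `greedy_orthogonal_select`) are hypothetical, features and target are mean-centered as an assumption, and absolute Pearson correlation stands in for the MIC so that the example runs with the Python standard library alone. A real MIC estimator could be passed through the `relevance` parameter.

```python
import math

def center(v):
    """Subtract the mean so that projections behave like covariances."""
    m = sum(v) / len(v)
    return [x - m for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def orthogonalize(v, basis, tol=1e-12):
    """Gram-Schmidt step: remove from v its projection onto each basis vector."""
    r = list(v)
    for b in basis:
        bb = dot(b, b)
        if bb > tol:
            coef = dot(r, b) / bb
            r = [ri - coef * bi for ri, bi in zip(r, b)]
    return r

def abs_pearson(x, y, tol=1e-12):
    """Stand-in relevance score (NOT the MIC): absolute Pearson correlation."""
    xc, yc = center(x), center(y)
    sxx, syy = dot(xc, xc), dot(yc, yc)
    if sxx <= tol or syy <= tol:
        return 0.0  # a (near-)zero residual carries no information
    return abs(dot(xc, yc)) / math.sqrt(sxx * syy)

def greedy_orthogonal_select(features, target, k, relevance=abs_pearson):
    """Greedily pick up to k features; each candidate is scored by the relevance
    between its orthogonalized residual and the target."""
    centered = [center(f) for f in features]
    basis, chosen = [], []
    remaining = list(range(len(features)))
    for _ in range(min(k, len(features))):
        best = None
        for j in remaining:
            resid = orthogonalize(centered[j], basis)
            score = relevance(resid, target)
            if best is None or score > best[1]:
                best = (j, score, resid)
        j, score, resid = best
        chosen.append((j, score))
        basis.append(resid)  # later candidates are orthogonalized against this
        remaining.remove(j)
    return chosen

# Toy data (hypothetical): f1 is an exact multiple of f0, so it is fully redundant.
f0 = [1, 2, 3, 4, 5, 6, 7, 8]
f1 = [2 * x for x in f0]
f2 = [1, -1, 1, -1, 1, -1, 1, -1]
y = [a + 3 * b for a, b in zip(f0, f2)]
ranking = greedy_orthogonal_select([f0, f1, f2], y, k=3)
```

In this toy example, once one of the two collinear features has been selected, the residual of the other is the zero vector and its score drops to zero, which is the sense in which orthogonalization handles redundancy without a separate redundancy term. Note that Pearson correlation only captures linear dependence, whereas the MIC is intended to capture more general associations.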

Source
http://dx.doi.org/10.1016/j.compbiomed.2017.08.021
