Identifying a robust set of predictive features is one of the most important steps in constructing clustering, classification and regression models from many thousands of features. Although there have been various attempts to select predictive feature sets from high-dimensional data sets in classification and clustering, few studies have examined the problem in regression. As semi-supervised and supervised feature selection methods tend to identify noisy features in addition to discriminative variables, unsupervised feature selection methods (USFSMs) are generally regarded as a less biased approach. Therefore, in this study, four different USFSMs, along with the entire feature set, are considered for the quantitative prediction of peptide binding affinities, one of the most challenging post-genome regression problems because of its very high dimensionality compared to the extremely small number of samples. As USFSMs are independent of any predictive method, support vector regression was then utilised to assess the quality of prediction. Across three different peptide binding affinity data sets, the results suggest that the regression performance of USFSMs generally depends on the data set. No particular method yields the best performance, in contrast to their performance in classification problems. However, closer investigation of the results appears to suggest that the spectral regression-based approach yields slightly better performance. To the best of our knowledge, this is the first study that presents a comprehensive comparison of USFSMs in such high-dimensional regression problems, particularly in the biological domain with an application to the prediction of peptide binding affinity, and it provides a number of practical suggestions for future practitioners.
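
The defining property of an unsupervised selector, namely that it is independent of the target values and of any predictive method, can be illustrated with a minimal sketch. Variance ranking is used here purely as a stand-in: it is an assumption made for illustration and is not claimed to be one of the four USFSMs compared in the study. The function names (`select_top_variance`, `project`) and the toy data are hypothetical.

```python
from statistics import pvariance


def select_top_variance(X, k):
    """Return the (sorted) indices of the k columns with the largest variance.

    Only the feature matrix X is used; target values are never consulted,
    which is what makes the selection unsupervised.
    """
    n_features = len(X[0])
    variances = [pvariance([row[j] for row in X]) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:k])


def project(X, columns):
    """Keep only the selected columns of X."""
    return [[row[j] for j in columns] for row in X]


# Toy data: 4 samples x 5 features; the values are made up for illustration.
X = [
    [0.1, 5.0, 1.0, 0.0, 2.0],
    [0.1, 1.0, 1.1, 0.0, 8.0],
    [0.1, 9.0, 0.9, 0.0, 4.0],
    [0.1, 3.0, 1.0, 0.0, 6.0],
]

selected = select_top_variance(X, k=2)
X_reduced = project(X, selected)
```

Because the selection never sees the binding affinities, the reduced matrix could then be passed to any regressor (support vector regression in the study) and scored under the same protocol as the full feature set.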

Source
http://dx.doi.org/10.1109/EMBC.2015.7320291
