Multilabel learning is a challenging task demanding scalable methods for large-scale data. Feature selection has been shown to improve multilabel accuracy while defying the curse of dimensionality of high-dimensional scattered data. However, the increasing complexity of multilabel feature selection, especially on continuous features, requires new approaches to manage data effectively and efficiently in distributed computing environments. This article proposes a distributed model for mutual information (MI) adaptation on continuous features and multiple labels on Apache Spark. Two approaches are presented, based on MI maximization (MIM) and on minimum redundancy and maximum relevance (mRMR). The former selects the subset of features that maximizes the MI between the features and the labels, whereas the latter additionally minimizes the redundancy among the selected features. Experiments compare the distributed multilabel feature selection methods on 10 data sets and 12 metrics. Results validated through statistical analysis indicate that our methods outperform reference methods for distributed feature selection for multilabel data, while MIM also reduces the runtime by orders of magnitude.
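The difference between the two selection criteria can be illustrated with a minimal single-machine sketch. This is not the article's distributed Spark implementation, and it does not reproduce the article's MI adaptation for continuous features: as an assumption made only for illustration, continuous features are discretized by equal-frequency binning, relevance is taken as the mean MI between a feature and each label, and redundancy as the mean MI between a candidate and the already selected features. All function names (`discretize`, `mutual_info`, `select_mim`, `select_mrmr`) and the toy data are hypothetical.

```python
import numpy as np

def discretize(X, bins=4):
    """Equal-frequency binning per column (an illustrative assumption)."""
    out = np.empty(X.shape, dtype=int)
    qs = np.linspace(0.0, 1.0, bins + 1)[1:-1]
    for j in range(X.shape[1]):
        out[:, j] = np.digitize(X[:, j], np.quantile(X[:, j], qs))
    return out

def mutual_info(x, y):
    """Plug-in MI estimate (in nats) between two discrete vectors."""
    _, xi = np.unique(x, return_inverse=True)
    _, yi = np.unique(y, return_inverse=True)
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1)          # contingency table of counts
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)  # marginal of x
    py = joint.sum(axis=0, keepdims=True)  # marginal of y
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())

def relevance(Xd, Y):
    """Mean MI between each feature and every label column."""
    return np.array([
        np.mean([mutual_info(Xd[:, j], Y[:, l]) for l in range(Y.shape[1])])
        for j in range(Xd.shape[1])
    ])

def select_mim(Xd, Y, k):
    """MIM: rank features by relevance alone and keep the top k."""
    rel = relevance(Xd, Y)
    return [int(j) for j in np.argsort(-rel, kind="stable")[:k]]

def select_mrmr(Xd, Y, k):
    """mRMR: greedily add the feature maximizing relevance minus redundancy."""
    rel = relevance(Xd, Y)
    selected = [int(np.argmax(rel))]
    while len(selected) < k:
        best, best_score = None, -np.inf
        for j in range(Xd.shape[1]):
            if j in selected:
                continue
            red = np.mean([mutual_info(Xd[:, j], Xd[:, s]) for s in selected])
            if rel[j] - red > best_score:
                best, best_score = j, rel[j] - red
        selected.append(best)
    return selected

# Toy data: feature 1 duplicates feature 0, feature 2 informs the second
# label only, feature 3 is pure noise.
rng = np.random.default_rng(0)
n = 2000
a, b = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack([a, a, b, rng.normal(size=n)])
Y = np.column_stack([a > 0, (b + rng.normal(size=n)) > 0]).astype(int)

Xd = discretize(X)
mim = select_mim(Xd, Y, k=2)
mrmr = select_mrmr(Xd, Y, k=2)
print("MIM :", mim)
print("mRMR:", mrmr)
```

In this toy setup the duplicated feature is as relevant as the original, so a relevance-only ranking keeps both copies, while the redundancy term penalizes the copy. The mRMR loop recomputes pairwise feature MI at every step, which is consistent with the abstract's observation that MIM is the cheaper of the two.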

Source: http://dx.doi.org/10.1109/TNNLS.2019.2944298

