Wavelength selection method for near-infrared spectroscopy based on the combination of mutual information and genetic algorithm.

Talanta

Daqing Oilfield Shale Oil Exploration and Development Headquarters, Daqing, 163455, China.

Published: January 2025

Near-infrared (NIR) spectroscopy has become a widely used analytical tool in many fields because of its convenience and efficiency. However, as instrument precision has improved, spectra can now span hundreds of dimensions. This high dimensionality makes modeling time-consuming and degrades model performance, so it is crucial to choose representative features before constructing models. This paper addresses a limitation of filter algorithms: they can only rank features and cannot directly determine the best feature subset. We propose a hybrid method that combines the Max-Relevance Min-Redundancy (mRMR) algorithm with the Genetic Algorithm (GA), and thereby filter and wrapper feature selection, to select appropriate features automatically. During each iteration of the GA, the hybrid algorithm retains the features in each individual that mRMR judges to have strong correlation and low redundancy, and deletes the features it judges to have little correlation or high redundancy. Through iteration, the feature subset is continuously optimized. We apply the hybrid method to feature selection on two datasets and build several models to verify it. The experimental results indicate that the approach, which combines mRMR with the GA, retains the advantages of both feature selection methods and selects features that show good predictive performance. Compared with other common feature selection methods, such as Uninformative Variable Elimination (UVE), Competitive Adaptive Reweighted Sampling (CARS), the Successive Projections Algorithm (SPA), Iteratively Retains Informative Variables (IRIV), and GA, the hybrid algorithm selects a larger number of feature variables that are both representative and informative, and it significantly enhances the predictive performance of the model.
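The pruning step described in the abstract, in which mRMR decides which features of a GA individual to keep, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the equal-width binning, the plug-in mutual information estimate, the "relevance minus mean redundancy" score, and the fixed `n_keep` cut-off. The GA itself (fitness evaluation with a calibration model, selection, crossover, mutation) is omitted.

```python
import math
import random
from collections import Counter

def discretize(values, bins=8):
    """Equal-width binning of a continuous variable into integer labels."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # constant column: avoid division by zero
    return [min(int((v - lo) / width), bins - 1) for v in values]

def mutual_information(a, b):
    """Plug-in mutual information estimate (nats) of two label sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * math.log(c * n / (pa[x] * pb[y]))
               for (x, y), c in pab.items())

def mrmr_scores(columns, target, selected):
    """Score each selected feature as relevance minus mean redundancy."""
    scores = {}
    for j in selected:
        relevance = mutual_information(columns[j], target)
        others = [k for k in selected if k != j]
        redundancy = 0.0
        if others:
            redundancy = sum(mutual_information(columns[j], columns[k])
                             for k in others) / len(others)
        scores[j] = relevance - redundancy
    return scores

def prune_individual(mask, columns, target, n_keep):
    """Keep only the n_keep best-scoring features of a 0/1 GA individual.

    Assumed rule for illustration; the paper's exact criterion may differ.
    """
    selected = [j for j, bit in enumerate(mask) if bit]
    if not selected:
        return list(mask)
    scores = mrmr_scores(columns, target, selected)
    kept = set(sorted(selected, key=scores.get, reverse=True)[:n_keep])
    return [1 if j in kept else 0 for j in range(len(mask))]

# Synthetic demo: feature 0 drives the target, features 1 and 2 are noise.
random.seed(0)
x0 = [random.gauss(0, 1) for _ in range(200)]
y = [v + random.gauss(0, 0.1) for v in x0]
noise = [[random.gauss(0, 1) for _ in range(200)] for _ in range(2)]
columns = [discretize(col) for col in [x0] + noise]
print(prune_individual([1, 1, 1], columns, discretize(y), n_keep=1))
```

In a full hybrid loop, a step like `prune_individual` would be applied to every individual in each generation before the wrapper part of the GA evaluates fitness, so the filter criterion narrows the candidates and the model-based fitness decides among them.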

Source
http://dx.doi.org/10.1016/j.talanta.2025.127573
