A novel importance scores based variable selection approach and validation using a MIR and NIR dataset.

Spectrochim Acta A Mol Biomol Spectrosc

School of Pharmacy and Food Engineering, Wuyi University, Jiangmen 529020, PR China.

Published: January 2025

Variable selection is important in spectral analysis for improving interpretability and accuracy. This study introduces a novel variable selection process, named "VMHBSC", which consists of six steps, with each letter representing one step. To demonstrate the process and its advantages, two datasets were employed: a mid-infrared (MIR) spectral dataset (234 × 7468, samples × variables) of Chenpi samples (a traditional Chinese medicinal material derived from the dried peel of mature tangerines) and a near-infrared (NIR) spectral dataset (16000 × 256) used for a modeling competition. In the MIR dataset, VMHBSC selected 3 important variables from all 7468 variables, and models built with Decision Trees (DT), Gradient Boosting Decision Tree (GBDT), and Extreme Gradient Boosting (XGBoost) achieved higher accuracy than models built with other variable selection methods. For the NIR dataset, VMHBSC selected 24 important variables from all 256 variables. Based on these 24 common variables, three hybrid models (VMHBSC-DT, VMHBSC-GBDT and VMHBSC-XGBoost) were also established and showed stable performance. These findings indicate the effectiveness of the VMHBSC process in enhancing model performance and robustness.
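The abstract does not describe the six VMHBSC steps, so the following is only a minimal sketch of the general idea of importance-score-based variable selection, not the authors' method. It assumes a simple score (the largest Gini impurity decrease that a single threshold split on one variable can achieve, as in a one-level decision tree) and a small synthetic dataset. All function names, the toy data, and the choice of score are hypothetical and introduced purely for illustration.

```python
import random

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def stump_importance(column, labels):
    """Largest impurity decrease from one threshold split on this variable."""
    n = len(labels)
    parent = gini(labels)
    pairs = sorted(zip(column, labels))
    best = 0.0
    for i in range(1, n):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold fits between two equal values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best

def select_top_k(X, y, k):
    """Score every variable and return the indices of the k highest scores."""
    n_vars = len(X[0])
    scores = [stump_importance([row[j] for row in X], y) for j in range(n_vars)]
    ranked = sorted(range(n_vars), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k]), scores

def make_toy_spectra(n_samples=60, n_vars=20, informative=(3, 11), seed=0):
    """Hypothetical data: Gaussian noise, with two variables shifted by class."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_vars)]
        for j in informative:
            row[j] += 6.0 * label  # strong class-dependent offset
        X.append(row)
        y.append(label)
    return X, y

X, y = make_toy_spectra()
selected, scores = select_top_k(X, y, k=2)
print("selected variable indices:", selected)
```

In this toy setting only variables 3 and 11 carry class information, so they receive far higher scores than the noise variables. A real spectral workflow would typically take importance scores from fitted DT, GBDT, or XGBoost models and validate the selected subset on held-out data.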

Source
http://dx.doi.org/10.1016/j.saa.2025.125701

Publication Analysis

Top Keywords

variable selection (16)
nir dataset (12)
mir dataset (8)
dataset vmhbsc (8)
vmhbsc selected (8)
selected variables (8)
gradient boosting (8)
dataset (5)
variables (5)
novel scores (4)

Similar Publications

Background: Salbutamol, a short-acting β-agonist used in asthma treatment, is available in multiple formulations, including inhalers, nebulizers, oral tablets, and intravenous, intramuscular, and subcutaneous routes. Each formulation exhibits distinct pharmacokinetic (PK) and pharmacodynamic (PD) profiles, influencing therapeutic outcomes and adverse effects. Although asthma management predominantly relies on inhaled salbutamol, understanding how these formulations interact with patient-specific characteristics could improve personalized medicine approaches, potentially uncovering the therapeutic benefits of alternative formulations for an individual patient.

In the Mediterranean basin, urban forests are widely recognized as essential landscape components, playing a key role in nature-based solutions by enhancing environmental quality and providing a range of ecosystem services. The selection of woody plant species for afforestation and reforestation should prioritize native species that align with the biogeographical and ecological characteristics of the planting sites. Among these, L.

- Essential Oil: Chemical Composition, Phytotoxic Activity and Environmental Safety.

Plants (Basel)

January 2025

Department of Biology, Faculty of Humanities and Natural Sciences, University of Prešov, 17 Novembra 1, 08001 Prešov, Slovakia.

Weeds cause a decrease in the quantity and quality of agricultural production and economic damage to producers. The prolonged use of synthetic pesticides causes problems of environmental pollution, the possible alteration of agricultural products and problems for human health. For this reason, the scientific community's search for products of natural origin, which are biodegradable, safe for human health and can act as valid alternatives to traditional herbicides, is growing.

Exploring the changes in plant functional traits and their relationship with the environment in karst climax communities across different latitudes can enhance our understanding of how these communities respond to environmental gradients. In this study, we focus on karst climax plant communities in Guizhou Province, China. We selected three sample sites located at varying latitudes and analyzed the variations in functional traits of the plant communities at these latitudes.

Structure of Plant Populations in Constructed Wetlands and Their Ability for Water Purification.

Plants (Basel)

January 2025

Core Botanical Gardens/Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074, China.

In constructed wetlands (CWs) with multiple plant communities, population structure may change over time, and these variations may ultimately influence water quality. However, it is still unclear how the structure changes and how those changes affect water quality. Here, we established a CW featuring multiple plant species within a polder to investigate the variation in plant population structure and the wastewater treatment effect for drainage water over the course of one year.
