Calibration set reduction by the selection of a subset containing the best fitting samples showing optimally predictive ability.

Talanta

Department of Analytical Chemistry, Applied Chemometrics and Molecular Modelling, Vrije Universiteit Brussel-VUB, Laarbeeklaan 103, B-1090, Brussels, Belgium.

Published: January 2024

Near-infrared (NIR) spectroscopy is a rapid, non-invasive and cost-effective technique that often requires no sample pre-treatment. It is applied for both qualitative and quantitative analyses in various application fields. Large calibration sets are often used, from which informative subsets can be selected without a loss of meaningful information. In this study, a new approach for sample subset selection is proposed and evaluated. The global PLS model, obtained with the original large global calibration set after FCAM-SIG variable selection, is used to select the best fitting subset of calibration samples with optimally predictive ability. This subset is called the optimally predictive calibration subset (OPCS). The global calibration samples are first ranked by increasing residual, and progressively larger fractions of the ranked calibration set are then selected. For each fraction, the optimal predictive ability and the corresponding optimal PLS complexity are determined by cross model validation (CMV). After CMV has been performed with all fractions, the fraction with the best fitting samples and optimally predictive ability, i.e. the OPCS, is determined. Using the best fitting samples from the global PLS model results in an OPCS-based model that is similar to the global PLS model and has a similar predictive ability. Because the best fitting samples do not need to be representative of the global calibration set, but only need to support the OPCS-based model, the number of samples in the OPCS model is mostly smaller than that selected by a traditional representative sample subset selection method. The new OPCS approach is tested on three real-life NIR data sets with twelve X-y combinations to model.
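The ranking and fraction steps described above can be sketched roughly as follows. This is an illustration only, not the authors' implementation: the function names, the use of absolute y-residuals from the global model, and the example fraction values are all assumptions, and the PLS fitting and CMV steps are left out.

```python
def rank_by_residual(y_obs, y_fit):
    """Return sample indices ordered from best to worst fit.

    Assumption: "residual" means the absolute difference between the
    reference value and the value fitted by the global PLS model.
    """
    residuals = [abs(o - f) for o, f in zip(y_obs, y_fit)]
    return sorted(range(len(residuals)), key=lambda i: residuals[i])


def enlarging_fractions(ranked, fractions=(0.25, 0.5, 0.75, 1.0)):
    """Return (fraction, subset) pairs of progressively larger subsets.

    Each subset holds the best fitting share of the ranked samples.
    The fraction values are hypothetical placeholders.
    """
    subsets = []
    for frac in fractions:
        # Round to the nearest sample count, keeping at least one sample.
        n = max(1, int(frac * len(ranked) + 0.5))
        subsets.append((frac, ranked[:n]))
    return subsets


# Toy values standing in for reference y and global-model fitted y.
y_obs = [1.0, 2.0, 3.0, 4.0]
y_fit = [1.5, 2.1, 2.0, 4.0]
ranked = rank_by_residual(y_obs, y_fit)
candidates = enlarging_fractions(ranked, fractions=(0.5, 1.0))
```

In the workflow described in the abstract, each candidate subset would then be evaluated by CMV, and the fraction giving the best predictive ability would be kept as the OPCS.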
The results show that the number of samples selected by the OPCS approach is statistically significantly lower than (i) that of the most suitable and widely used representative sample selection method of Kennard and Stone, and (ii) that suggested by the guideline that the optimal sample size N for reduced calibration sets should surpass the PLS model complexity A by a factor of 12. An additional advantage of the OPCS approach is that no outliers are included in the subset, because only the best fitting calibration samples are selected. Two further innovations are built into the new OPCS approach: (i) CMV is applied for sample selection for the first time, and (ii) within CMV, the "one standard error rule", adopted from "Repeated Double Cross Validation", is used for the first time to determine the optimal PLS complexity of the OPCS-based models.
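As a rough illustration of the two rules mentioned above, the sketch below picks a PLS complexity with a generic one-standard-error rule and computes the sample size implied by the N = 12 × A guideline. The function names are hypothetical, and computing the standard error across repetitions at the error minimum is an assumption; the paper may compute it differently.

```python
from math import sqrt
from statistics import mean, stdev


def one_standard_error_rule(cv_errors):
    """Pick the most parsimonious acceptable number of PLS components.

    cv_errors maps a number of components A to a list of validation
    errors (one per repetition). The smallest A whose mean error lies
    within one standard error of the minimum mean error is returned.
    """
    means = {a: mean(errs) for a, errs in cv_errors.items()}
    best = min(means, key=means.get)
    best_errs = cv_errors[best]
    std_err = stdev(best_errs) / sqrt(len(best_errs))
    threshold = means[best] + std_err
    return min(a for a, m in means.items() if m <= threshold)


def guideline_sample_size(n_components, factor=12):
    """Sample size suggested by the 'N equals factor times A' guideline."""
    return factor * n_components


# Toy validation errors for models with 1 to 4 components.
errors = {
    1: [1.00, 1.20, 1.10],
    2: [0.50, 0.54, 0.52],
    3: [0.45, 0.55, 0.50],
    4: [0.60, 0.60, 0.60],
}
chosen = one_standard_error_rule(errors)
reference_n = guideline_sample_size(chosen)
```

With these toy numbers the three-component model has the lowest mean error, but the two-component model falls within one standard error of it and is therefore preferred as the simpler model.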


Source
http://dx.doi.org/10.1016/j.talanta.2023.124943

