Anal Chim Acta
September 2022
Concepts from data science, machine learning, deep learning and artificial neural networks are spreading in many disciplines. The general idea is to exploit the power of statistical tools to interpret complex and, in many cases, non-linear data. Specifically in analytical chemistry, many chemometrics tools are being developed.
The use of machine learning for multivariate spectroscopic data analysis in process monitoring applications has become very popular, since non-linearities in the relationship between signal and predicted variables are commonly observed. In this regard, the use of artificial neural networks (ANN) to develop calibration models has been shown to be more appropriate and flexible than classical multivariate linear methods. The most frequently reported type of ANN is the so-called multilayer perceptron (MLP).
A new multivariate regression model, named Error Covariance Penalized Regression (ECPR), is presented. Following a penalized regression strategy, the proposed model incorporates information about the measurement error structure of the system, using the error covariance matrix (ECM) as a penalization term. Results are reported from both simulations and experimental data based on replicate mid- and near-infrared (MIR and NIR) spectral measurements.
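The idea of penalizing with an error covariance matrix can be sketched as a generalized ridge regression. This is only an illustration: the closed form b = (XᵀX + λΣ)⁻¹Xᵀy, the function names and the `lam` parameter are assumptions made here, not the published ECPR equations.

```python
import numpy as np

def ecm_from_replicates(replicates):
    """Estimate the error covariance matrix from replicate spectra.

    `replicates` holds repeated measurements of the same sample,
    one spectrum per row (hypothetical layout)."""
    return np.cov(replicates, rowvar=False)

def ecm_penalized_regression(X, y, ecm, lam=1.0):
    """Minimize ||y - X b||^2 + lam * b' ecm b (assumed penalty form).

    Directions with large measurement error variance are shrunk more
    strongly than quiet ones."""
    A = X.T @ X + lam * ecm
    return np.linalg.solve(A, X.T @ y)
```

With `ecm` set to the identity matrix this reduces to ordinary ridge regression, and with `lam=0` (and a full-rank `X`) to ordinary least squares.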
Multivariate curve resolution-alternating least-squares (MCR-ALS) is the model of choice when dealing with some non-trilinear arrays, specifically when the data are of chromatographic origin. To drive the iterative procedure to chemically interpretable solutions, the use of constraints becomes essential. In this work, both simulated and experimental data have been analyzed by MCR-ALS, applying chemically reasonable constraints, and investigating the relationship between selectivity, analytical sensitivity (γ) and root mean square error of prediction (RMSEP).
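A minimal sketch of the alternating least-squares loop with a non-negativity constraint is given below. It assumes the bilinear model D ≈ C Sᵀ and enforces non-negativity by simple clipping; dedicated implementations typically use non-negative least squares and further constraints such as unimodality or closure. The function name and arguments are illustrative.

```python
import numpy as np

def mcr_als(D, S_init, n_iter=100):
    """Alternate least-squares updates of C and S for D ~ C @ S.T.

    D      : (n_times, n_channels) data matrix
    S_init : (n_channels, n_components) initial spectral estimates
    Negative values are clipped to zero after every update."""
    S = np.clip(S_init, 0, None)
    for _ in range(n_iter):
        # Concentration profiles given the current spectra
        C = np.clip(D @ np.linalg.pinv(S.T), 0, None)
        # Spectra given the current concentration profiles
        S = np.clip((np.linalg.pinv(C) @ D).T, 0, None)
    return C, S
```

The quality of the resolved profiles depends strongly on the initial estimates and on the constraints applied, which is why the choice of constraints matters for the resulting figures of merit.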
Generalized analytical sensitivity (γ) is proposed as a new figure of merit, which can be estimated from a multivariate calibration data set. It can be confidently applied to compare different calibration methodologies, and helps to solve literature inconsistencies on the relationship between classical sensitivity and prediction error. In contrast to the classical plain sensitivity, γ incorporates the noise properties in its definition, and its inverse is well correlated with root mean square errors of prediction in the presence of general noise structures.
With the proliferation of multivariate calibration methods based on artificial neural networks, expressions for the estimation of figures of merit such as sensitivity, prediction uncertainty, and detection limit are urgently needed. This would bring nonlinear multivariate calibration methodologies to the same status as their linear counterparts in terms of comparability. Currently, only the average prediction error or the ratio of performance to deviation for a test sample set is employed to characterize and promote neural network calibrations.
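The two test-set statistics mentioned here can be computed as follows. This sketch assumes the common definitions: RMSEP as the root mean square of the prediction residuals, and the ratio of performance to deviation (RPD) as the standard deviation of the reference values divided by RMSEP. The function names are illustrative.

```python
import numpy as np

def rmsep(y_true, y_pred):
    """Root mean square error of prediction over a test set."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def rpd(y_true, y_pred):
    """Ratio of performance to deviation: SD of references / RMSEP."""
    sd = np.std(np.asarray(y_true, dtype=float), ddof=1)
    return float(sd / rmsep(y_true, y_pred))
```

Both numbers summarize average behaviour over the test set; neither gives a sample-specific uncertainty or a detection limit.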
Most of the current expressions used to calculate figures of merit in multivariate calibration have been derived assuming independent and identically distributed (iid) measurement errors. However, it is well known that this condition is not always valid for real data sets, where the existence of many external factors can lead to correlated and/or heteroscedastic noise structures. In this report, the influence of the deviations from the classical iid paradigm is analyzed in the context of error propagation theory.
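The effect of leaving the iid assumption can be illustrated with first-order error propagation for a linear predictor ŷ = xᵀb. The sketch below considers only the contribution of noise in the test signal x and ignores calibration uncertainty; the function names are illustrative. For a general error covariance matrix Σ the variance is bᵀΣb, which reduces to σ²‖b‖² when Σ = σ²I.

```python
import numpy as np

def prediction_variance(b, ecm):
    """Variance of x @ b caused by signal noise with covariance ecm."""
    b = np.asarray(b, dtype=float)
    return float(b @ np.asarray(ecm, dtype=float) @ b)

def iid_prediction_variance(b, sigma):
    """Same quantity under iid noise with standard deviation sigma."""
    b = np.asarray(b, dtype=float)
    return float(sigma ** 2 * (b @ b))

b = np.array([1.0, -1.0])
iid_ecm = 0.04 * np.eye(2)               # uncorrelated, sigma = 0.2
correlated_ecm = 0.04 * np.ones((2, 2))  # perfectly correlated noise
```

For this difference-type regression vector, the iid case gives a variance of 0.08, whereas perfectly correlated noise cancels out and gives zero, so an iid-based expression can badly misjudge the true prediction uncertainty.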
Recent advances in fluorescence bioimaging with single-molecule sensitivity have relied on the analysis and visualization of single-molecule data obtained on smart fluorophores. We describe an alternative method to enhance the information content of densely labeled fluorescence images. Visualization is improved by representing pixels as the dissimilarities of the fluctuations of the fluorescence signals, with the dissimilarity being taken to the mean of the signals over all the pixels.
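One way to picture such a dissimilarity image is sketched below. The abstract does not state which dissimilarity measure is used, so this example assumes one minus the Pearson correlation between each pixel's time trace and the mean trace over all pixels; the function name and the (frames, height, width) stack layout are also assumptions.

```python
import numpy as np

def dissimilarity_map(stack):
    """Map each pixel to 1 - corr(pixel trace, mean trace).

    stack : (n_frames, height, width) image time series.
    Pixels with no fluctuation get a correlation of 0 by convention."""
    n_frames, height, width = stack.shape
    traces = stack.reshape(n_frames, -1).astype(float)
    mean_trace = traces.mean(axis=1)
    # Center each trace in time before correlating
    tc = traces - traces.mean(axis=0)
    mc = mean_trace - mean_trace.mean()
    denom = np.linalg.norm(tc, axis=0) * np.linalg.norm(mc)
    corr = np.divide(mc @ tc, denom,
                     out=np.zeros_like(denom), where=denom > 0)
    return (1.0 - corr).reshape(height, width)
```

Pixels whose fluctuations follow the global average appear dark (values near 0), while pixels with distinct temporal behaviour stand out.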
There is currently no well-defined procedure for providing the limit of detection (LOD) in multivariate calibration. Defining an estimator for the LOD in this scenario has proven to be more complex than intuitively extending the traditional univariate definition. For these reasons, although many attempts have been made to arrive at a reasonable convention, additional effort is required to achieve full agreement between the univariate and multivariate LOD definitions.
A new optimization strategy for multivariate partial-least-squares (PLS) regression analysis is described. It was achieved by integrating three efficient strategies to improve PLS calibration models: (1) variable selection based on ant colony optimization, (2) mathematical pre-processing selection by a genetic algorithm, and (3) sample selection through a distance-based procedure. Outlier detection has also been included as part of the model optimization.
A new expression is developed which allows estimating the sensitivity for the whole family of multivariate calibration algorithms based on partial least-squares regression combined with residual multilinearization. The sensitivity can be employed to compute other relevant figures of merit such as analytical sensitivity, limit of detection, limit of quantitation, and uncertainty in predicted concentration. The results are substantiated by extensive Monte Carlo noise addition simulations for a variety of systems with a different number of analytes and interfering agents, different degrees of overlapping in component profiles, and different numbers of instrumental data modes per sample, all requiring the achievement of the second-order advantage.
A new variable selection algorithm is described, based on ant colony optimization (ACO). The aim of the algorithm is to choose, from a large number of available spectral wavelengths, those relevant to the estimation of analyte concentrations or sample properties when spectroscopic analysis is combined with multivariate calibration techniques such as partial least-squares (PLS) regression. The new algorithm employs the concept of cooperative pheromone accumulation, which is typical of ACO selection methods, and optimizes PLS models using a pre-defined number of variables, employing a Monte Carlo approach to discard irrelevant sensors.