Hybrid variable selection strategy coupled with random forest (RF) for quantitative analysis of methanol in methanol-gasoline via Raman spectroscopy.

Spectrochim Acta A Mol Biomol Spectrosc

Key Laboratory of Synthetic and Natural Functional Molecule of the Ministry of Education, College of Chemistry & Materials Science, Northwest University, Xi'an, 710127, China; College of Chemistry and Chemical Engineering, Xi'an Shiyou University, Xi'an, 710065, China.

Published: April 2021

As Raman spectrometers become more portable and miniaturized, in-situ analysis requires more advanced analytical methods that deliver faster and more accurate results. In this work, a hybrid variable selection method based on V-WSP and variable importance measurement (VIM), coupled with random forest (RF), was used to improve the performance of portable laser Raman instruments for quantitative analysis of methanol content in methanol-gasoline. First, five preprocessing methods were each applied to reduce the interference information in the raw spectra. Based on the spectral data processed by multivariate scattering correction (MSC), V-WSP was employed to filter interfering or redundant information from the Raman spectra, and 579 variables were retained at a correlation threshold of 0.9600. The variables were then further reduced by VIM, leaving 43 variables selected by the V-WSP-VIM method. During data processing, out-of-bag (OOB) error estimation and 10-fold cross-validation (CV) were applied to optimize the parameters of the preprocessing methods, V-WSP, VIM and the RF model. Compared with the RF model based on raw spectra, the RF model based on the V-WSP-VIM method achieved better prediction performance for the quantitative analysis of methanol content in methanol-gasoline: the coefficient of determination of cross-validation (R²CV) improved from 0.9100 to 0.9662, the root mean square error of cross-validation (RMSECV) decreased from 0.0572 to 0.0365%, the coefficient of determination of the prediction set (R²P) improved from 0.9214 to 0.9407, the root mean square error of the prediction set (RMSEP) decreased from 0.0420 to 0.0382%, the number of variables fell from 1044 to 43, and the modeling time fell from 72.94 to 6.41 s. The results indicate that V-WSP-VIM coupled with RF is an effective method for improving the performance of portable laser Raman spectrometers in the quantitative analysis of methanol content in methanol-gasoline.
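
The preprocessing, correlation-based filtering, and evaluation steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (msc, correlation_filter, rmse, r_squared) are hypothetical, MSC is implemented here in its common form as a least-squares correction against the mean spectrum, and the filter is a simplified greedy stand-in that only shares the idea of a correlation threshold with V-WSP, not its actual selection procedure. The RF model, its OOB error, and the VIM ranking are omitted and would normally come from a machine-learning library.

```python
import math


def msc(spectra):
    """Scatter correction: regress each spectrum on the mean spectrum,
    then remove the fitted offset and scale (assumed common form of MSC)."""
    n_vars = len(spectra[0])
    ref = [sum(s[j] for s in spectra) / len(spectra) for j in range(n_vars)]
    ref_mean = sum(ref) / n_vars
    ref_ss = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = sum(s) / n_vars
        # Least-squares fit: s ~= intercept + slope * ref
        slope = sum((r - ref_mean) * (x - s_mean) for r, x in zip(ref, s)) / ref_ss
        intercept = s_mean - slope * ref_mean
        corrected.append([(x - intercept) / slope for x in s])
    return corrected


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def correlation_filter(spectra, threshold=0.96):
    """Simplified greedy filter (NOT the full V-WSP algorithm): keep a
    variable only if |r| with every already-kept variable is below threshold."""
    columns = list(zip(*spectra))
    kept = []
    for j, col in enumerate(columns):
        if all(abs(pearson(col, columns[k])) < threshold for k in kept):
            kept.append(j)
    return kept


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

In a workflow like the one in the abstract, the indices returned by the filter would select columns of the MSC-corrected spectra before fitting the RF model, and rmse and r_squared would be computed on cross-validation and prediction-set outputs. The default threshold of 0.96 simply mirrors the value reported above.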

Source
http://dx.doi.org/10.1016/j.saa.2021.119430
