Publications by authors named "Shungeng Min"

Problems: Raman spectroscopy has emerged as an effective technique for noninvasive breast cancer analysis. However, current Raman prediction models fail to cover all the molecular sub-types of breast cancer and lack model visualization.

Aims: To use Raman spectroscopy combined with a convolutional neural network (CNN) to construct a prediction model for the known molecular sub-types of breast cancer, and to select critical peaks through visualization strategies in order to mine specific biomarker information.

View Article and Find Full Text PDF

Quick identification of paper types is crucial for customs. Although a variety of studies focus on the discrimination of paper, these techniques require either complex preprocessing or large-scale instruments, which are not suitable for customs environments. In this study, we predicted the type of customs paper using a Micro-NIR spectrometer and compared the results with Attenuated Total Reflection-Fourier Transform Infrared Spectroscopy (ATR-FTIR).

Rapid and reliable animal fur identification has remained a challenge for customs inspection. Accurately distinguishing between fur types is important for applying the correct tariff policy. A variety of analytical methods have been applied to distinguish animal fur types, including microscopy, molecular testing, mass spectrometry, Fourier transform infrared spectroscopy (FTIR), and Raman spectroscopy.

Data-driven deep learning analysis, especially with convolutional neural networks (CNNs), has been developed and successfully applied in many domains. A CNN is regarded as a black box, and its main drawback is a lack of interpretability. In this study, an interpretable CNN model was presented for infrared data analysis.

In this article, a non-destructive method based on Fourier transform infrared spectroscopy (FT-IR) was proposed to estimate the date of paper from different years. The paper samples, dated from 1940 to 1980, were naturally aged and conserved in a library. Partial least squares-discriminant analysis (PLS-DA), logistic regression, and a convolutional neural network (CNN) were employed to evaluate the date of paper, with the accuracy 60.

Spectral data fusion strategies combined with the extreme learning machine (ELM) algorithm were applied to determine the active ingredient in a deltamethrin formulation. Ultraviolet-visible spectroscopy (UV-vis) is a rapid and sensitive detection method for specific components that are sensitive to ultraviolet irradiation. Alternatively, near-infrared spectroscopy (NIR) technology can be applied over a broader range.

Near-infrared (NIR) spectroscopy is a widespread detection method with a high signal-to-noise ratio (SNR), but its models are hard to interpret because of overlapping features. Alternatively, mid-infrared (MIR) spectroscopy shows more chemical features and gives a better explanation of the model. Yet it suffers from a low SNR.

Feature selection can greatly enhance the performance of a learning algorithm when dealing with a high-dimensional data set. The filter method and the wrapper method are the two most common approaches. However, these approaches have limitations.
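
As a rough illustration of the filter approach mentioned above, the sketch below ranks features by the absolute Pearson correlation with the response and keeps the top k. It is only a generic example: the function names `pearson` and `filter_select` and the toy data are assumptions made here, not the method of the paper.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no ranking information
    return sxy / math.sqrt(sxx * syy)

def filter_select(X, y, k):
    """Return the indices of the k features with the largest |correlation| with y."""
    n_features = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])

# Toy data (hypothetical): rows are samples, columns are features.
X = [
    [1.0, 5.0, 0.2],
    [2.0, 3.0, 0.1],
    [3.0, 6.0, 0.4],
    [4.0, 2.0, 0.3],
]
y = [1.1, 2.0, 2.9, 4.2]
selected = filter_select(X, y, 1)
```

A filter scores each feature independently of any model, which is fast but ignores interactions between features. A wrapper would instead refit a model for each candidate subset.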

Qualitative spectroscopic analysis depends in one way or another on comparing spectra of the specimens to be identified with spectra of "known" or "standard" samples. The k-nearest neighbor (k-NN) method is one of the oldest and simplest techniques for performing such comparisons. In this study, we present a new k-NN algorithm for qualitative spectroscopic analysis, which we refer to as the bootstrapping search margin-based nearest neighbor (BSMNN) method.
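
For orientation, a plain k-NN classifier of the kind referred to above can be sketched as follows. This is the textbook method with Euclidean distance and majority voting, not the BSMNN algorithm, whose details are not given in this excerpt. The name `knn_predict` and the toy spectra are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_spectra, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest reference spectra."""
    # Euclidean distance from the query to each reference spectrum.
    dists = sorted(
        (math.dist(s, query), label) for s, label in zip(train_spectra, train_labels)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy "spectra" (hypothetical): three intensity values per sample.
train = [
    [0.1, 0.9, 0.2],
    [0.2, 0.8, 0.1],
    [0.9, 0.1, 0.7],
    [0.8, 0.2, 0.9],
]
labels = ["A", "A", "B", "B"]
predicted = knn_predict(train, labels, [0.15, 0.85, 0.2], k=3)
```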

A novel method named the modeling indicator determined (MID) method, based on two model evaluation parameters, i.e., root mean square error of prediction (RMSEP) and ratio of performance to deviation (RPD), is proposed to employ high-level fusion for quantitative analysis.
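
The two evaluation parameters named above can be computed as in the sketch below. It assumes the common definitions: RMSEP as the root of the mean squared prediction error, and RPD as the sample standard deviation of the reference values divided by RMSEP. The excerpt does not state which variant the paper uses, and how MID combines the two for fusion is not shown here.

```python
import math
import statistics

def rmsep(y_true, y_pred):
    """Root mean square error of prediction."""
    n = len(y_true)
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n)

def rpd(y_true, y_pred):
    """Ratio of performance to deviation: SD of reference values over RMSEP (assumed definition)."""
    return statistics.stdev(y_true) / rmsep(y_true, y_pred)

# Toy reference and predicted values (hypothetical).
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.5, 2.5, 3.5, 4.5, 5.5]
error = rmsep(y_true, y_pred)
ratio = rpd(y_true, y_pred)
```

A lower RMSEP and a higher RPD both indicate a better model.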

The identification of paper relics is an unresolved issue in the field of cultural heritage. As is well known, heritage paper is of significant importance in archaeological research. Currently, a variety of research methodologies focus on the analysis of inks for dating documents.

A novel spectral variable selection method named interval combination optimization (ICO) was proposed in our previous study. In the present study, ICO coupled with near infrared (NIR) spectroscopy was applied to the rapid determination of four primary constituents, including total sugar, reducing sugar, total nitrogen and nicotine, in plant. Partial least squares regression was performed after the ICO algorithm.

Wavelength selection is a critical factor for pattern recognition of vibrational spectroscopic data. Not only does it alleviate the effect of dimensionality on an algorithm's generalization performance, but it also enhances the understanding and interpretability of multivariate classification models. In this study, a novel partial least squares discriminant analysis (PLSDA)-based wavelength selection algorithm, termed ensemble of bootstrapping space shrinkage (EBSS), has been devised for vibrational spectroscopic data analysis.

A novel chemometric method named MWS-ECARS, which applies moving window smoothing (MWS) to an ensemble of competitive adaptive reweighted sampling (CARS) runs, is proposed as a spectral variable selection approach for multivariate calibration in this study. To eliminate uninformative variables, an ensemble of CARS is carried out first, and MWS is then performed to search for effective variables around the high-frequency variables. The variable subset with the lowest standard error of cross-validation (SECV) is treated as the optimal threshold and the corresponding moving window width is regarded as the optimal window width.
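
One possible reading of the smoothing step is sketched below: a centered moving average is applied to the per-variable selection counts from an ensemble of runs, so that variables next to frequently selected ones are also raised above the threshold. This is an assumption made for illustration. The excerpt does not give the exact MWS-ECARS procedure, the CARS runs themselves are not reproduced, and the counts, window width and threshold are hypothetical.

```python
def moving_window_smooth(freq, width):
    """Centered moving average; near the edges only the available part of the window is used."""
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd integer")
    half = width // 2
    smoothed = []
    for i in range(len(freq)):
        lo, hi = max(0, i - half), min(len(freq), i + half + 1)
        window = freq[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed

def select_above(values, threshold):
    """Indices of variables whose smoothed frequency reaches the threshold."""
    return [i for i, v in enumerate(values) if v >= threshold]

# Hypothetical counts: how often each of 10 variables was selected in 10 runs.
freq = [0, 1, 9, 2, 0, 0, 8, 9, 7, 0]
smoothed = moving_window_smooth(freq, 3)
selected = select_above(smoothed, 3.6)
```

In this toy case variable 3 is kept although it was selected only twice, because it sits beside the frequently selected variable 2. In practice the threshold and window width would be chosen by cross-validation error, as the abstract describes with SECV.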

Background: Acetamiprid, a low-toxicity pesticide, has been used extensively to increase plant production and quality. Although fipronil has been prohibited, it is often illicitly added to acetamiprid because of its particular insecticidal action and effect, so a rapid and effective method to detect its concentration is highly desirable. Mid-infrared spectroscopy (MIR) combined with two variable selection methods, interval combination optimization (ICO) and interval partial least squares (iPLS), was used to determine the prohibited addition of fipronil.

In this study, we proposed a new computational method for variable selection, the stabilized bootstrapping soft shrinkage approach (SBOSS), based on the bootstrapping soft shrinkage approach (BOSS), which can enhance the analysis of chemical interest from the massive variables among the overlapped absorption bands. In SBOSS, variables are selected by an index of the stability of the regression coefficients instead of the absolute value of the regression coefficients. In each loop, weighted bootstrap sampling (WBS) is applied to generate sub-models according to weights updated by conducting model population analysis (MPA) on the stability of the regression coefficients (RC) of these sub-models.
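
To make the contrast between stability and absolute value concrete, the sketch below uses one commonly used stability index, the absolute mean of a coefficient across sub-models divided by its standard deviation. Whether SBOSS uses exactly this index is not stated in the excerpt, so this is an assumption. The coefficient matrix is hypothetical, and fitting the sub-models is not shown.

```python
import statistics

def coefficient_stability(coef_matrix):
    """coef_matrix[m][j] is the coefficient of variable j in sub-model m."""
    n_vars = len(coef_matrix[0])
    stability = []
    for j in range(n_vars):
        column = [row[j] for row in coef_matrix]
        mean = statistics.fmean(column)
        sd = statistics.stdev(column)
        # A coefficient that never changes is treated as perfectly stable.
        stability.append(abs(mean) / sd if sd > 0 else float("inf"))
    return stability

def normalize(weights):
    """Scale weights so they sum to 1, ready for weighted sampling."""
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical coefficients from three sub-models for three variables.
coefs = [
    [1.0, 3.0, 0.2],
    [1.1, -3.0, 0.1],
    [0.9, 0.5, 0.3],
]
stability = coefficient_stability(coefs)
weights = normalize(stability)
```

Variable 1 has the largest coefficients in absolute value but flips sign between sub-models, so its stability and sampling weight are the lowest of the three.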

Iodine value (IV) is a significant parameter for describing the quality of edible oil. In this study, three portable spectroscopy devices were employed to determine IV in a mixed edible oil system: a new Micro-Electro-Mechanical-System (MEMS) Fourier Transform Infrared Spectrometer (MEMS-FTIR), a MicroNIR1700 and an i-Raman Plus-785S. Quantitative models were built by partial least squares (PLS) regression, and four variable selection methods were applied before the PLS model: Monte Carlo uninformative variables elimination (MCUVE), competitive adaptive reweighted sampling (CARS), bootstrapping soft shrinkage approach (BOSS) and variable combination population analysis (VCPA).

Rapid and nondestructive near infrared spectroscopy (NIR) methods have been developed for simultaneous qualitative and quantitative analysis of methamphetamine, ketamine, heroin, and cocaine in seized samples. This is the first systematic report regarding a qualitative and quantitative procedure of applying NIR for drug analysis. A total of 282 calibration samples and 836 prediction samples were used for the building and validating of qualitative and quantitative models.

PARAFAC2 is a powerful decomposition method that is ideally suited for modeling gas chromatography-mass spectrometry (GC-MS) data. However, the most widely used fitting algorithm, alternating least squares (ALS), is very slow, which hinders use of the model. In this paper, an iterative method called geometric search is proposed to fit the PARAFAC2 model.

A novel method based on mid-infrared (MIR) spectroscopy, which enables the determination of Chlorantraniliprole in Abamectin within minutes, is proposed. We further evaluate the prediction ability of four wavelength selection methods: bootstrapping soft shrinkage approach (BOSS), Monte Carlo uninformative variable elimination (MCUVE), genetic algorithm partial least squares (GA-PLS) and competitive adaptive reweighted sampling (CARS). The results showed that the BOSS method obtained the lowest root mean squared error of cross validation (RMSECV) (0.

In this study, a new wavelength interval selection algorithm named interval combination optimization (ICO) was proposed under the framework of model population analysis (MPA). In this method, the full spectra are first divided into a fixed number of equal-width intervals. Then the optimal interval combination is searched iteratively under the guidance of MPA in a soft shrinkage manner, in which weighted bootstrap sampling (WBS) is employed as the random sampling method.
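
The first two ingredients described above, splitting the spectrum into intervals and drawing intervals by weighted bootstrap sampling, can be sketched as follows. This is a minimal illustration under assumptions: the function names, the uniform starting weights and the seed are hypothetical. The model fitting and MPA weight update that would complete one ICO iteration are omitted, since the excerpt does not detail them.

```python
import random

def split_intervals(n_variables, n_intervals):
    """Split indices 0..n_variables-1 into contiguous blocks of (near-)equal width."""
    base, extra = divmod(n_variables, n_intervals)
    intervals, start = [], 0
    for i in range(n_intervals):
        size = base + (1 if i < extra else 0)  # spread any remainder over the first blocks
        intervals.append(list(range(start, start + size)))
        start += size
    return intervals

def weighted_bootstrap(weights, n_draws, rng):
    """Draw interval indices with replacement according to weights; keep the unique ones."""
    drawn = rng.choices(range(len(weights)), weights=weights, k=n_draws)
    return sorted(set(drawn))

# 100 spectral variables in 10 intervals; all intervals start with equal weight.
intervals = split_intervals(100, 10)
rng = random.Random(0)  # fixed seed so the sketch is repeatable
weights = [1.0] * len(intervals)
chosen = weighted_bootstrap(weights, len(intervals), rng)
variables = [v for i in chosen for v in intervals[i]]
```

Because sampling is with replacement, each draw usually yields fewer unique intervals than draws. Intervals whose weights shrink toward zero over the iterations are sampled less and less often, which is the soft shrinkage behaviour the abstract refers to.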

A mixture of four substances (benzaldehyde, iso-octane, butyl acetate and acetophenone) was quantitatively analyzed by mass spectrometry combined with chemometrics. The mass chromatogram data of the mixture were processed with two methods for quantitative analysis. One is feature selection--Multiple Linear Regression (MLR) and the other is full spectrum--Partial Least Squares (PLS).

A simple and rapid method was developed for evaluating 16 phthalic acid esters (PAEs) at the μg/kg level in a complex milk matrix using directly suspended droplet microextraction-gas chromatography mass spectrometry (DSDME-GC-MS). The different parameters for extraction and for the DSDME experiment were optimized, including 10 g/L trichloroacetic acid concentration, 100 μL cyclohexane micro-droplet organic solvent, 1100 rpm stirring speed, 10 min extraction time and no added salt.

Article Synopsis
  • The paper explores a method for testing calcium and magnesium levels in tobacco using Near-Infrared (NIR) spectroscopy combined with Least Squares-Support Vector Machine (LS-SVM) modeling, analyzing 500 tobacco samples from China.
  • Monte-Carlo cross validation was employed to fine-tune spectral preprocessing methods, and the final models yielded high determination coefficients for predicting calcium (Rc² = 0.9755) and magnesium (Rc² = 0.9961), outperforming traditional Partial Least Squares (PLS) models.
  • The entire process of NIR scanning and data analysis for individual samples was quick, taking only seconds, demonstrating the potential of this method for efficient and accurate tobacco

This paper presents a rapid calculation method for the imaging process in the identification and quantification of prohibited additives in milk. Data abstraction methods such as principal component analysis (PCA), classical least squares regression (CLS), and alternating least squares regression (ALS) were used. Different multivariate calculations made it possible to quantify the near-infrared (NIR) spectral data cube obtained from the surface of the complex mixture.
