Partial least squares (PLS) regression is a multivariate method whose model parameters are estimated using either the SIMPLS or the NIPALS algorithm. PLS regression has been used extensively in applied research because it is effective at analyzing relationships between an outcome and one or several components. Notably, the NIPALS algorithm can provide parameter estimates on incomplete data. Selecting the number of components used to build a representative model is a central issue in PLS regression, and several approaches have been proposed in the literature, including the Q2 criterion and the AIC and BIC criteria. However, how to deal with missing data when using PLS regression remains a matter of debate. Here we study the behavior of the NIPALS algorithm when used to fit a PLS regression for various proportions of missing data and different types of missingness. We compare criteria for selecting the number of components of a PLS regression on incomplete data sets and on data sets imputed using three methods: multiple imputation by chained equations, k-nearest neighbour imputation, and singular value decomposition imputation. We tested the criteria with proportions of missing data ranging from 5% to 50% under different missingness assumptions. Component selection based on Q2 with leave-one-out cross-validation gave more reliable results than selection based on AIC or BIC.
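To make concrete how NIPALS can produce estimates on incomplete data, here is a minimal sketch of PLS with a single response (PLS1). It follows the commonly described variant in which every regression slope inside the algorithm is computed over the observed cells only. The function name `nipals_pls1`, its return values, and the use of NaN to mark missing entries are assumptions made for illustration; this is not the authors' implementation, and the response `y` is assumed to be complete.

```python
import numpy as np

def nipals_pls1(X, y, n_components):
    """Illustrative PLS1 via NIPALS; NaN cells in X are skipped, y must be complete."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    obs = (~np.isnan(X)).astype(float)        # 1.0 where X is observed, 0.0 where missing
    x_mean = np.nanmean(X, axis=0)            # column means over observed cells
    y_mean = y.mean()
    E = np.where(obs > 0, X - x_mean, 0.0)    # centred X with zeros at missing cells
    f = y - y_mean                            # centred response
    n, p = X.shape
    T = np.zeros((n, n_components))           # scores
    W = np.zeros((p, n_components))           # weights
    P = np.zeros((p, n_components))           # loadings
    c = np.zeros(n_components)                # regression of y on each score
    for h in range(n_components):
        # Weight j: slope of column j on f, using only rows where x_ij is observed.
        w = (E.T @ f) / (obs.T @ f**2)
        w /= np.linalg.norm(w)
        # Score i: slope of row i on w, using only the columns observed in row i.
        t = (E @ w) / (obs @ w**2)
        # Loading j: slope of column j on t, again over observed rows only.
        p_h = (E.T @ t) / (obs.T @ t**2)
        c[h] = (f @ t) / (t @ t)
        # Deflate X and y; missing cells are kept at zero so they never contribute.
        E = (E - np.outer(t, p_h)) * obs
        f = f - c[h] * t
        T[:, h], W[:, h], P[:, h] = t, w, p_h
    return T, W, P, c, x_mean, y_mean
```

The fitted values on the training rows are `y_mean + T @ c`. On complete data the masks are all ones and the loop reduces to ordinary NIPALS PLS1. A leave-one-out Q2 criterion could be built on top of this by refitting with each row held out, which is left out here to keep the sketch short.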

Source: http://dx.doi.org/10.1515/sagmb-2018-0059

Similar Publications

Additively manufactured drug products, typically produced in small-scale, on-demand batch mode, require rapid and non-destructive quantification methods. A tunable modular design (TMD) approach combining porous polymeric freeze-dried modules and an additive manufacturing method, inkjet printing, was proposed in an earlier study to fabricate accurate and patient-tailored doses of the antidepressant citalopram hydrobromide. This approach addresses the unmet medical needs associated with antidepressant tapering.

Whilst livestock management technologies may help to improve productivity, economic performance, and animal welfare on farms, there has been low uptake of technologies across farming systems and countries. This study aimed to understand dairy farmers' intention to use calf management technologies by combining partial least squares structural equation modelling (PLS-SEM) with qualitative comparative analysis (QCA). We evaluated the hypotheses that dairy farmers will intend to use calf technologies if they have sufficient competencies, sufficient materials, and positive meanings (e.

Unravelling the complex influence of dissolved organic matter on microbial diversity in a salinized lake.

Environ Res

January 2025

Institute of Surface-Earth System Science, School of Earth System Science, Tianjin University, Tianjin, 300072, China.

Ecosystems in cold and arid regions, such as Dai Lake - a typical inland, salinized lake in the semi-arid region of northern China - face severe environmental challenges, including salinization and biodiversity loss. This study investigates the chemical composition of dissolved organic matter (DOM) and the structure of microbial communities in lake water and sediments, offering novel insights into the ecosystem's dynamics. In winter, DOM in the lake water is primarily derived from decaying plant and animal matter, while sediment DOM is predominantly associated with microbial activity.

Background: Burn-hemorrhagic shock combined injury, a severe condition causing complex stress responses and metabolic disturbances that significantly affect clinical outcomes in both military and civilian settings, was modeled in swine to investigate the associated metabolomic and proteomic changes and identify potential biomarkers for disease prognosis.

Methods: Eight clean-grade adult male Landrace pigs (4-5 months, average weight 60-70 kg) were used to model burn-hemorrhagic shock combined injury. Serum samples collected at 0 h and 2 h post-injury were analyzed using metabolomic and proteomic measurements.

The microbial community of a soy sauce is one of the most important factors in determining its sensory characteristics. In this study, the microbial communities and sensory characteristics of twenty samples of Korean soy sauce were investigated using shotgun metagenome sequencing and descriptive sensory analysis, and their correlations were explored by partial least squares (PLS) regression analysis. The metagenome analysis identified 1332 species of bacteria, yeasts, molds, and viruses across 278 genera, of which , , and accounted for more than 80% of the total community.
