Most genomic prediction models are linear regression models that assume continuous, normally distributed phenotypes, but responses to diseases such as stripe rust (caused by Puccinia striiformis f. sp. tritici) are commonly recorded on ordinal scales and as percentages. Disease severity (SEV) and infection type (IT) data from germplasm screening nurseries generally do not meet these assumptions. In this regard, researchers may ignore the lack of normality, transform the phenotypes, use generalized linear models, or use supervised learning algorithms and classification models that place no restriction on the distribution of the response variable and are less sensitive when modeling ordinal scores. The goal of this research was to compare classification and regression genomic selection models for skewed phenotypes using stripe rust SEV and IT in winter wheat. We extensively compared regression and classification prediction models using two training populations: breeding lines phenotyped in 4 years (2016-2018 and 2020) and a diversity panel phenotyped in 4 years (2013-2016). The prediction models used 19,861 genotyping-by-sequencing single-nucleotide polymorphism markers. Overall, square-root-transformed phenotypes modeled with ridge regression best linear unbiased prediction and support vector machine regression displayed the highest combination of accuracy and relative efficiency across the regression and classification models. Furthermore, a classification system based on support vector machine and ordinal Bayesian models with a 2-Class scale for SEV reached the highest class accuracy of 0.99. This study showed that breeders can use linear and non-parametric regression models within their own breeding lines over combined years to accurately predict skewed phenotypes.
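The regression-versus-classification comparison described in the abstract can be illustrated with a small sketch. This is not the authors' pipeline: the marker matrix and severity scores are simulated, the ridge penalty `lam` is a fixed hypothetical value (in a mixed-model fit it would be estimated from variance components), and the two-class cutoff is an assumed median split rather than the paper's 2-Class scale. It shows a square root transformation of a skewed severity score, a ridge regression fit on marker data, prediction accuracy as a Pearson correlation, and class accuracy after thresholding.

```python
import numpy as np

def ridge_marker_effects(Z, y, lam):
    """Solve (Z'Z + lam*I) u = Z'(y - mean(y)) for marker effects u."""
    mu = y.mean()
    p = Z.shape[1]
    u = np.linalg.solve(Z.T @ Z + lam * np.eye(p), Z.T @ (y - mu))
    return mu, u

# Simulated data (assumption): 200 lines, 500 markers coded -1/0/1.
rng = np.random.default_rng(0)
n_lines, n_markers = 200, 500
Z = rng.choice([-1.0, 0.0, 1.0], size=(n_lines, n_markers))
true_effects = rng.normal(0.0, 0.3, n_markers)
liability = Z @ true_effects + rng.normal(0.0, 3.0, n_lines)

# Right-skewed severity in percent, clipped to the 0-100 range.
severity = np.clip(np.exp(liability / liability.std()) * 5.0, 0.0, 100.0)

# Square root transformation to reduce skew before fitting.
y_sqrt = np.sqrt(severity)

# Simple split: first 160 lines for training, last 40 for testing.
train, test = slice(0, 160), slice(160, n_lines)
lam = 100.0  # hypothetical penalty, not estimated from the data
mu, u = ridge_marker_effects(Z[train], y_sqrt[train], lam)

# Regression accuracy: correlation of predicted and observed values.
pred_sqrt = mu + Z[test] @ u
accuracy = np.corrcoef(pred_sqrt, y_sqrt[test])[0, 1]

# Two-class view: back-transform, then threshold at an assumed cutoff.
cutoff = np.median(severity[train])
pred_class = np.clip(pred_sqrt, 0.0, None) ** 2 > cutoff
obs_class = severity[test] > cutoff
class_accuracy = np.mean(pred_class == obs_class)

print(f"prediction accuracy (r): {accuracy:.2f}")
print(f"2-class accuracy: {class_accuracy:.2f}")
```

A support vector machine or an ordinal model would replace the ridge step in the same train-and-score structure; only the fitting function and the accuracy measure change.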


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8904966
DOI: http://dx.doi.org/10.3389/fgene.2022.835781


