Machine learning approaches are an attractive option for analyzing large-scale data to detect genetic variants that contribute to variation in a quantitative trait, because they do not require specific distributional assumptions. We evaluate two machine learning methods, random forests and logic regression, and compare them to standard univariate linear regression, using the Genetic Analysis Workshop 17 mini-exome data. We also apply these methods after collapsing multiple rare variants within genes and within gene pathways. Linear regression and the random forest method performed better when rare variants were collapsed based on genes or gene pathways than when each variant was analyzed separately. Logic regression performed better when rare variants were collapsed based on genes rather than on pathways.
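
The abstract does not say how the rare variants were collapsed, so the following is only an illustrative sketch of one common choice: a per-gene burden count that sums minor-allele counts over variants whose minor allele frequency falls below a threshold. The function names, the 0/1/2 genotype coding, the threshold value, and the gene labels are all assumptions made for this example, not details taken from the paper. The second function computes the slope of a simple univariate least-squares fit, which is the kind of per-feature test the collapsed scores could be fed into.

```python
import numpy as np


def collapse_rare_variants(genotypes, variant_to_gene, maf_threshold=0.01):
    """Sum minor-allele counts of rare variants within each gene.

    genotypes: (n_samples, n_variants) array of 0/1/2 minor-allele counts
        (assumed coding, not specified in the abstract).
    variant_to_gene: one gene label per variant (hypothetical mapping).
    Returns (gene_names, burden), where burden has shape (n_samples, n_genes).
    """
    genotypes = np.asarray(genotypes, dtype=float)
    labels = np.asarray(variant_to_gene)
    # Minor allele frequency per variant: mean allele count divided by 2.
    maf = genotypes.mean(axis=0) / 2.0
    rare = maf < maf_threshold
    # Only genes that contain at least one rare variant get a burden column.
    gene_names = sorted(set(labels[rare].tolist()))
    burden = np.zeros((genotypes.shape[0], len(gene_names)))
    for j, gene in enumerate(gene_names):
        cols = rare & (labels == gene)
        burden[:, j] = genotypes[:, cols].sum(axis=1)
    return gene_names, burden


def univariate_slope(x, y):
    """Least-squares slope of y on a single predictor x (with intercept)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    denom = (xc ** 2).sum()
    if denom == 0:
        return float("nan")  # predictor is constant, slope is undefined
    return float((xc * (y - y.mean())).sum() / denom)


if __name__ == "__main__":
    # Toy data: 6 samples, 4 variants in two made-up genes.
    toy_genotypes = [
        [0, 1, 0, 1],
        [0, 0, 0, 2],
        [1, 0, 0, 1],
        [0, 0, 0, 0],
        [0, 0, 1, 2],
        [0, 0, 0, 1],
    ]
    toy_genes = ["GENE_A", "GENE_A", "GENE_B", "GENE_B"]
    toy_trait = [1.2, 0.1, 0.9, 0.0, 0.4, 0.2]
    names, burden = collapse_rare_variants(
        toy_genotypes, toy_genes, maf_threshold=0.2
    )
    for j, gene in enumerate(names):
        print(gene, univariate_slope(burden[:, j], toy_trait))
```

Collapsing by pathway would follow the same pattern with a variant-to-pathway mapping in place of the gene labels. A presence/absence indicator instead of a count is another plausible collapsing rule.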


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3287827
DOI: http://dx.doi.org/10.1186/1753-6561-5-S9-S104


