Random forests (RFs) are effective at predicting gene expression from genotype data. However, RF regressors and classifiers, together with their feature selection and encoding choices, have not been systematically compared in the context of gene expression prediction. Specifically, we examine the role of ordinal versus one-hot encoding and of data balancing via oversampling in the prediction of obesity-associated gene expression. Our work shows that RFs are competitive with PrediXcan in predicting obesity-associated gene expression in subcutaneous adipose tissue, a tissue highly relevant to obesity. Additionally, RFs generate predictions for obesity-associated genes for which PrediXcan fails to do so.
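
The two encodings and the balancing step named above can be illustrated with a minimal sketch. It assumes genotypes are coded as minor-allele counts (0, 1, 2) and uses simple random oversampling; neither detail is stated in the abstract, and the helper names (`ordinal_encode`, `one_hot_encode`, `random_oversample`) and toy data are hypothetical, not the authors' pipeline.

```python
import random
from collections import Counter

# Assumption: each SNP genotype is coded as a minor-allele count (0, 1, or 2).
GENOTYPE_LEVELS = (0, 1, 2)

def ordinal_encode(genotypes):
    """Keep each SNP as one numeric feature: the allele count itself."""
    return [list(row) for row in genotypes]

def one_hot_encode(genotypes):
    """Expand each SNP into three indicator features, one per genotype level."""
    encoded = []
    for row in genotypes:
        features = []
        for g in row:
            features.extend(1 if g == level else 0 for level in GENOTYPE_LEVELS)
        encoded.append(features)
    return encoded

def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen minority-class samples until classes are balanced.

    Assumption: plain random oversampling; the abstract does not say which
    oversampling method was used.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        indices = [i for i, lab in enumerate(y) if lab == label]
        for _ in range(target - count):
            i = rng.choice(indices)
            X_out.append(X[i])
            y_out.append(label)
    return X_out, y_out

# Toy data: 4 samples x 2 SNPs, with discretized expression labels for a classifier.
genotypes = [[0, 2], [1, 1], [2, 0], [0, 0]]
labels = ["high", "low", "low", "low"]

X_ordinal = ordinal_encode(genotypes)   # 2 features per sample
X_one_hot = one_hot_encode(genotypes)   # 6 features per sample
X_bal, y_bal = random_oversample(X_one_hot, labels)
```

Ordinal encoding treats the genotype as an ordered dosage, whereas one-hot encoding lets a tree split on each genotype level independently at the cost of three times as many features. Oversampling applies only to the classifier setting, where expression has been discretized into classes.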

Source: http://dx.doi.org/10.1109/EMBC48229.2022.9871234
