CrepHAN: cross-species prediction of enhancers by using hierarchical attention networks.

Bioinformatics

Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China.

Published: October 2021

Motivation: Enhancers are important functional elements in genome sequences. The identification of enhancers is a very challenging task due to the great diversity of enhancer sequences and the flexible localization on genomes. Till now, the interactions between enhancers and genes have not been fully understood yet. To speed up the studies of the regulatory roles of enhancers, computational tools for the prediction of enhancers have emerged in recent years. Especially, thanks to the ENCODE project and the advances of high-throughput experimental techniques, a large amount of experimentally verified enhancers have been annotated on the human genome, which allows large-scale predictions of unknown enhancers using data-driven methods. However, except for human and some model organisms, the validated enhancer annotations are scarce for most species, leading to more difficulties in the computational identification of enhancers for their genomes.

Results: In this study, we propose a deep learning-based predictor for enhancers, named CrepHAN, which is featured by a hierarchical attention neural network and word embedding-based representations for DNA sequences. We use the experimentally supported data of the human genome to train the model, and perform experiments on human and other mammals, including mouse, cow and dog. The experimental results show that CrepHAN has more advantages on cross-species predictions, and outperforms the existing models by a large margin. Especially, for human-mouse cross-predictions, the area under the receiver operating characteristic (ROC) curve (AUC) score of ROC curve is increased by 0.033∼0.145 on the combined tissue dataset and 0.032∼0.109 on tissue-specific datasets.

Availability And Implementation: bcmi.sjtu.edu.cn/∼yangyang/CrepHAN.html.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Download full-text PDF

Source
http://dx.doi.org/10.1093/bioinformatics/btab349DOI Listing

Publication Analysis

Top Keywords

enhancers
10
prediction enhancers
8
hierarchical attention
8
identification enhancers
8
human genome
8
roc curve
8
crephan cross-species
4
cross-species prediction
4
enhancers hierarchical
4
attention networks
4

Similar Publications

Want AI Summaries of new PubMed Abstracts delivered to your In-box?

Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!