LegNet: a best-in-class deep learning model for short DNA regulatory regions.

Dmitry Penzar, Daria Nogina, Elizaveta Noskova, Arsenii Zinkevich, Georgy Meshcheryakov, Andrey Lando, Abdul Muntakim Rafi, Carl de Boer, Ivan V Kulakovskiy

Bioinformatics

Vavilov Institute of General Genetics, Moscow 119991, Russia.

Published: August 2023

Motivation: The increasing volume of data from high-throughput experiments including parallel reporter assays facilitates the development of complex deep-learning approaches for modeling DNA regulatory grammar.

Results: Here, we introduce LegNet, an EfficientNetV2-inspired convolutional network for modeling short gene regulatory regions. By approaching the sequence-to-expression regression problem as a soft classification task, LegNet secured first place for the autosome.org team in the DREAM 2022 challenge of predicting gene expression from gigantic parallel reporter assays. Using published data, we demonstrate that LegNet outperforms existing models and accurately predicts both gene expression itself and the effects of single-nucleotide variants. Furthermore, we show how LegNet can be used in a diffusion network manner for the rational design of promoter sequences yielding the desired expression level.
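The soft-classification framing mentioned above can be illustrated with a minimal sketch. The abstract gives no details, so everything specific here is an assumption made for illustration: the 18 integer expression bins, the Gaussian spread of 0.5, and the function names are hypothetical and not taken from the LegNet code. The idea is that a scalar expression value becomes a probability vector over bins (the training target for a classifier), and a scalar prediction is recovered as the probability-weighted average of bin indices.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def soft_labels(expression, n_bins=18, sd=0.5):
    """Convert a scalar expression value into soft class probabilities.

    Assumption: bin k covers [k - 0.5, k + 0.5), and the first and last
    bins absorb the tails so that the probabilities sum to 1.
    """
    probs = []
    for k in range(n_bins):
        lo = -math.inf if k == 0 else k - 0.5
        hi = math.inf if k == n_bins - 1 else k + 0.5
        probs.append(normal_cdf(hi, expression, sd) - normal_cdf(lo, expression, sd))
    return probs


def expected_expression(probs):
    """Map class probabilities back to a scalar: the expected bin index."""
    return sum(k * p for k, p in enumerate(probs))


target = soft_labels(7.3)
recovered = expected_expression(target)
```

In this sketch a network would be trained to output a distribution close to `target` (for example with a cross-entropy or KL-divergence loss), and `expected_expression` would be applied to its softmax output at prediction time. The round trip from 7.3 to a distribution and back returns a value close to 7.3, which is what makes the classification view usable for regression.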

Availability and implementation: https://github.com/autosome-ru/LegNet. The GitHub repository includes Jupyter Notebook tutorials and Python scripts under the MIT license to reproduce the results presented in the study.

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10400376
DOI: http://dx.doi.org/10.1093/bioinformatics/btad457
