A multi-modal transformer for cell type-agnostic regulatory predictions.

Nauman Javed, Thomas Weingarten, Arijit Sehanobish, Adam Roberts, Avinava Dubey, Krzysztof Choromanski, Bradley E Bernstein

Cell Genom

The Gene Regulation Observatory, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Cell Biology and Pathology, Harvard Medical School, Boston, MA 02215, USA; The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.

Published: January 2025

Sequence-based deep learning models have emerged as powerful tools for deciphering the cis-regulatory grammar of the human genome but cannot generalize to unobserved cellular contexts. Here, we present EpiBERT, a multi-modal transformer that learns generalizable representations of genomic sequence and cell type-specific chromatin accessibility through a masked accessibility-based pre-training objective. Following pre-training, EpiBERT can be fine-tuned for gene expression prediction, achieving accuracy comparable to the sequence-only Enformer model, while also being able to generalize to unobserved cell states. The learned representations are interpretable and useful for predicting chromatin accessibility quantitative trait loci (caQTLs), regulatory motifs, and enhancer-gene links. Our work represents a step toward improving the generalization of sequence-based deep neural networks in regulatory genomics.
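The masked accessibility-based pre-training objective mentioned in the abstract can be illustrated with a generic masked-reconstruction loss. The sketch below is not EpiBERT's implementation: the function names (`mask_track`, `masked_mse`), the zero-fill masking, the 25% mask fraction, the squared-error loss, and the mean-of-visible-bins "model" are all assumptions made for illustration, since the abstract does not specify any of them.

```python
import random

def mask_track(track, mask_fraction, rng):
    """Hide a random subset of bins; return (masked_track, boolean_mask)."""
    n_masked = max(1, round(mask_fraction * len(track)))
    masked_idx = set(rng.sample(range(len(track)), n_masked))
    mask = [i in masked_idx for i in range(len(track))]
    # Assumption: masked bins are zero-filled before being shown to the model.
    masked = [0.0 if m else v for v, m in zip(track, mask)]
    return masked, mask

def masked_mse(pred, target, mask):
    """Mean squared error computed only over the masked bins."""
    errs = [(p - t) ** 2 for p, t, m in zip(pred, target, mask) if m]
    return sum(errs) / len(errs)

rng = random.Random(0)
# Toy accessibility signal over 32 genomic bins (hypothetical data).
track = [float(rng.randint(0, 9)) for _ in range(32)]
masked, mask = mask_track(track, 0.25, rng)

# Placeholder "model": predict the mean of the visible bins everywhere.
visible = [v for v, m in zip(track, mask) if not m]
baseline = sum(visible) / len(visible)
pred = [baseline] * len(track)

loss = masked_mse(pred, track, mask)
```

In a real setting the placeholder prediction would come from a network that also receives the DNA sequence, and the loss on the hidden bins would drive training; the point here is only that the loss is evaluated on masked positions.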

DOI: http://dx.doi.org/10.1016/j.xgen.2025.100762
