Prediction of the transcription factor binding sites with meta-learning.

Methods

NCMIS, CEMS, RCSDS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China; School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China; Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650223, China.

Published: July 2022

With the accumulation of ChIP-seq data, convolutional neural network (CNN)-based methods have been proposed for predicting transcription factor binding sites (TFBSs). However, biological experimental data are noisy, yet they are often treated as ground truth for both training and testing. In particular, existing classification methods ignore the false positives and false negatives caused by errors in the peak-calling stage, and can therefore easily overfit to biased training data. This leads to inaccurate identification and an inability to reveal the rules governing protein-DNA binding. To address this issue, we propose a meta-learning-based CNN method (TFBS_MLCNN, or MLCNN for short) that suppresses the influence of noisy labels and accurately recognizes TFBSs from ChIP-seq data. Guided by a small amount of unbiased meta-data, MLCNN adaptively learns an explicit weighting function from ChIP-seq data while simultaneously updating the parameters of the classifier. The weighting function counteracts the influence of biased training data on the classifier by assigning a weight to each sample according to its training loss. Experimental results on 424 ChIP-seq datasets show that MLCNN not only outperforms existing state-of-the-art CNN methods but also detects noisy samples, which are assigned small weights to suppress them. This ability to suppress noisy samples can be seen by visualizing the sample weights. Several case studies demonstrate that MLCNN outperforms other methods.
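
The idea of weighting each sample by its training loss can be illustrated with a minimal sketch. This is not the MLCNN implementation: in the method described above the weighting function is learned from meta-data, whereas here it is a hand-set sigmoid whose parameters `a` and `b` are hypothetical, chosen only so that larger losses map to smaller weights. The predicted probabilities and labels are made-up toy values.

```python
import math

def bce_loss(p, y):
    """Binary cross-entropy for one sample; p is the predicted binding probability."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def sample_weight(loss, a=-2.0, b=2.0):
    """Hypothetical weighting function: sigmoid(a * loss + b).

    With a < 0, a larger loss yields a smaller weight. In a meta-learning
    setup, a and b would be learned rather than fixed by hand.
    """
    return 1.0 / (1.0 + math.exp(-(a * loss + b)))

def weighted_loss(probs, labels, a=-2.0, b=2.0):
    """Return the weighted mean loss and the per-sample weights."""
    losses = [bce_loss(p, y) for p, y in zip(probs, labels)]
    weights = [sample_weight(l, a, b) for l in losses]
    total = sum(w * l for w, l in zip(weights, losses))
    return total / len(losses), weights

# Toy batch: the third sample is labelled as bound (1) but the classifier
# strongly predicts unbound, as might happen with a false-positive peak.
probs = [0.9, 0.8, 0.05, 0.1]
labels = [1, 1, 1, 0]
mean_loss, weights = weighted_loss(probs, labels)
```

In this toy batch the suspicious third sample receives a much smaller weight than the well-fitted ones, so it contributes little to the weighted loss that would drive the classifier update.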

Source
http://dx.doi.org/10.1016/j.ymeth.2022.04.010
