We describe an effort ("Codebook") to determine the sequence specificity of 332 putative and largely uncharacterized human transcription factors (TFs), as well as 61 control TFs. Nearly 5,000 independent experiments across multiple and assays produced motifs for just over half of the putative TFs analyzed (177, or 53%), of which most are unique to a single TF. The data highlight the extensive contribution of transposable elements to TF evolution, both in and , and identify tens of thousands of conserved, base-level binding sites in the human genome. The use of multiple assays provides an unprecedented opportunity to benchmark and analyze TF sequence specificity, function, and evolution, as further explored in accompanying manuscripts. 1,421 human TFs are now associated with a DNA binding motif. Extrapolation from the Codebook benchmarking, however, suggests that many of the currently known binding motifs for well-studied TFs may inaccurately describe the TF's true sequence preferences.

Download full-text PDF

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11601247PMC
http://dx.doi.org/10.1101/2024.11.11.622097DOI Listing

Publication Analysis

Top Keywords

sequence specificity
12
uncharacterized human
8
human transcription
8
transcription factors
8
multiple assays
8
tfs
5
perspectives codebook
4
sequence
4
codebook sequence
4
specificity uncharacterized
4

Similar Publications

Want AI Summaries of new PubMed Abstracts delivered to your In-box?

Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!