The rapid proliferation of genomic DNA sequences has created a significant need for software that can both focus on relatively small regions (such as those within genes or promoters) and provide wide-zoom views of patterns across entire genomes. We present our DNA Motif Lexicon, which enables users to perform genome-wide searches for motifs of interest and to create customizable results pages that differ in the degree and extent of annotation. Searching for a particular motif is akin to a word search in a natural language; our motif lexicon anticipates a time when we will increasingly rely upon DNA dictionaries that offer rich types of annotation. Indeed, the concept of "lexomics", introduced in this paper, may suit the kinds of meta-analyses needed to decipher regulatory information. Currently supporting five genomes, our web-based lexicon allows users to look up motifs of interest and build user-defined result pages that include the following: (1) all base-pair locations where a motif is found, with links to further search the "neighborhoods" near each location, and whether each location is genic (within a gene), intergenic, or bridging (overlapping a gene boundary); (2) NCBI hot-links to the nearest upstream and downstream genes for each location; (3) statistical information about the query; (4) whether the motif is a certain type of repeat; (5) links for the reverse, complement, and reverse-complement of the motif of interest; and (6) hot-links to PubMed abstracts that mention the motif of interest. A software framework facilitates the continual development of new annotation modules. The tool is located at: http://genomics.wheatoncollege.edu/cgi-bin/lexicon.exe.
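
The kinds of lookups listed in the abstract can be illustrated with a minimal Python sketch. It is not the lexicon's actual implementation, which the abstract does not describe; the function names, the 0-based half-open coordinates, and the toy sequence and gene intervals are all assumptions made for illustration. It locates every occurrence of a motif on one strand, labels each hit as genic, bridging, or intergenic, and derives the reverse, complement, and reverse-complement forms of the motif.

```python
from typing import List, Tuple

# Translation table for complementing DNA bases (assumes only A, C, G, T).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse(motif: str) -> str:
    """Return the motif read backwards."""
    return motif[::-1]


def complement(motif: str) -> str:
    """Return the base-by-base complement of the motif."""
    return motif.translate(_COMPLEMENT)


def reverse_complement(motif: str) -> str:
    """Return the motif as it would read on the opposite strand."""
    return complement(motif)[::-1]


def find_motif(sequence: str, motif: str) -> List[int]:
    """Return 0-based start positions of all occurrences, overlaps included."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        start = sequence.find(motif, start + 1)
    return positions


def classify(start: int, length: int, genes: List[Tuple[int, int]]) -> str:
    """Label a hit relative to hypothetical half-open gene intervals [begin, end)."""
    end = start + length
    # Genic: the hit lies entirely inside at least one gene.
    if any(g_begin <= start and end <= g_end for g_begin, g_end in genes):
        return "genic"
    # Bridging: the hit overlaps a gene but crosses its boundary.
    if any(start < g_end and g_begin < end for g_begin, g_end in genes):
        return "bridging"
    return "intergenic"


if __name__ == "__main__":
    sequence = "TTGACGTCAAGACGTC"   # toy sequence, not real genomic data
    genes = [(0, 6), (9, 16)]       # toy gene coordinates (assumed)
    motif = "GACGTC"
    for pos in find_motif(sequence, motif):
        print(pos, classify(pos, len(motif), genes))
    print(reverse(motif), complement(motif), reverse_complement(motif))
```

A genome-scale tool would presumably rely on an index rather than a linear scan of each sequence, and would search both strands; the sketch only makes the categories in item (1) and the motif forms in item (5) concrete.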

