Publications by authors named "Jakob H Havgaard"

Non-coding RNAs are key regulatory players in bacteria. Many computationally predicted non-coding RNAs, however, lack functional associations. An example is the Bacillaceae-1 RNA motif, whose Rfam model consists of two hairpin loops.

Reprogramming of adipocyte function in obesity is implicated in metabolic disorders like type 2 diabetes. Here, we used the pig, an animal model sharing many physiological and pathophysiological similarities with humans, to perform in-depth epigenomic and transcriptomic characterization of pure adipocyte fractions. Using a combined DNA methylation capture sequencing and reduced representation bisulfite sequencing (RRBS) strategy in 11 lean and 12 obese pigs, we identified 3529 differentially methylated regions (DMRs) located in close proximity to, or within, genes in the adipocytes.

Background: Recent experimental studies of CRISPR-Cas9 systems have shown that off-target binding and cleavage are a concern and that both depend strongly on the selected guide RNA (gRNA) design. Computational prediction of off-targets has been proposed as an attractive and more feasible alternative to tedious experimental efforts. However, accurate scoring of the large number of putative off-targets is key to the success of computational off-target assessment.

Background: Structured non-coding RNAs play many different roles in cells, but the annotation of these RNAs is incomplete even within the human genome. The currently available computational tools are either too computationally heavy for use in full genomic screens or rely on pre-aligned sequences.

Methods: Here we present a fast and efficient method, DotcodeR, for detecting structurally similar RNAs in genomic sequences by comparing their corresponding coarse-grained secondary structure dot plots at string level.

Motivation: Clustering RNA sequences with common secondary structure is an essential step towards studying RNA function. Whereas structural RNA alignment strategies typically identify common structure for orthologous structured RNAs, clustering seeks to group paralogous RNAs based on structural similarities. However, existing approaches for clustering paralogous RNAs do not take into account the compensatory base pair changes obtained from structure conservation in orthologous sequences.

t(8;21) acute myeloid leukemia (AML) is characterized by a translocation between chromosomes 8 and 21 and formation of a distinctive RUNX1-RUNX1T1 fusion transcript. This translocation places RUNX1T1 under control of the RUNX1 promoter, leading to a pronounced upregulation of RUNX1T1 transcripts in t(8;21) AML compared to normal hematopoietic cells. We investigated the role of highly upregulated RUNX1T1 under the hypothesis that it acts as a competing endogenous RNA (ceRNA), titrating microRNAs (miRNAs) away from their target transcripts and thus contributing to AML formation.

Motivation: Structured RNAs can be hard to search for, as they are often not well conserved in their primary structure and are local in their genomic or transcriptomic context. Thus, the need for tools that can make local structural alignments of RNAs is only increasing.

Results: To meet the demand for both large-scale screens and hands-on analysis through web servers, we present a new multithreaded version of Foldalign.

A key aspect of RNA secondary structure prediction is the identification of novel functional elements. This is a challenging task because these elements typically are embedded in longer transcripts where the borders between the element and flanking regions have to be defined. The flanking sequences impact the folding of the functional elements both at the level of computational analyses and when the element is extracted as a transcript for experimental analysis.

Background: Annotating mammalian genomes for noncoding RNAs (ncRNAs) is nontrivial since far from all ncRNAs are known and the computational models are resource demanding. Currently, the human genome holds the best mammalian ncRNA annotation, a result of numerous efforts by several groups. However, a more direct strategy is desired for the increasing number of sequenced mammalian genomes of which some, such as the pig, are relevant as disease models and production animals.

Simultaneous alignment and secondary structure prediction of RNA sequences is often referred to as "RNA structural alignment." A class of the methods for structural alignment is based on the principles proposed by Sankoff more than 25 years ago. The Sankoff algorithm simultaneously folds and aligns two or more sequences.

Growing recognition of the numerous, diverse and important roles played by non-coding RNA in all organisms motivates better elucidation of these cellular components. Comparative genomics is a powerful tool for this task and is arguably preferable to any high-throughput experimental technology currently available, because evolutionary conservation highlights functionally important regions. Conserved secondary structure, rather than primary sequence, is the hallmark of many functionally important RNAs, because compensatory substitutions in base-paired regions preserve structure.
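The idea of compensatory substitutions can be illustrated with a minimal sketch. It is not the method of the article above: the function names, the two sequences, and the dot-bracket structure are hypothetical, and the sequences are assumed to be already aligned without gaps. The sketch counts base pairs where both bases differ between the two sequences while each sequence still forms a canonical pair.

```python
# Canonical RNA base pairs, including the G-U wobble pair.
CANONICAL_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def paired_positions(dot_bracket):
    """Return (i, j) index pairs for matching parentheses in a dot-bracket string."""
    stack, pairs = [], []
    for index, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(index)
        elif symbol == ")":
            pairs.append((stack.pop(), index))
    return pairs


def count_compensatory_pairs(seq_a, seq_b, dot_bracket):
    """Count base pairs where both bases changed but pairing is preserved."""
    count = 0
    for i, j in paired_positions(dot_bracket):
        pair_a = (seq_a[i], seq_a[j])
        pair_b = (seq_b[i], seq_b[j])
        both_pair = pair_a in CANONICAL_PAIRS and pair_b in CANONICAL_PAIRS
        both_changed = seq_a[i] != seq_b[i] and seq_a[j] != seq_b[j]
        if both_pair and both_changed:
            count += 1
    return count


# Hypothetical gap-free aligned sequences sharing one hairpin (assumed data).
structure = "((((....))))"
seq_a = "GGCAUUCGUGCC"
seq_b = "GAUAAAAAUAUC"

print(count_compensatory_pairs(seq_a, seq_b, structure))
```

In this toy pair the loop and two stem positions differ in sequence, yet every stem position still pairs, which is the kind of signal that structure-aware comparison can use and that plain sequence comparison misses.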

MicroRNAs (miRNAs) are a group of small, approximately 21 nt long, riboregulators inhibiting gene expression at a post-transcriptional level. Their most distinctive structural feature is the foldback hairpin of their precursor pre-miRNAs. Even though each pre-miRNA deposited in miRBase has its secondary structure already predicted, little is known about the patterns of structural conservation among pre-miRNAs.

Article Synopsis
  • Noncoding RNAs (ncRNAs) are vital in cellular functions, and there may be many undiscovered ncRNAs in mammalian genomes.
  • Current approaches to identify these ncRNAs involve computational methods that struggle with complexity, often relying on heuristics like pre-folding and pre-aligning, which have limitations.
  • The improved FOLDALIGN algorithm introduces pruning in dynamic programming, reducing memory and time usage while maintaining effectiveness, and provides researchers with a more efficient tool to discover new ncRNAs.

We have developed a semiautomated RNA sequence editor (SARSE) that integrates tools for analyzing RNA alignments. The editor highlights different properties of the alignment by color, and its integrated analysis tools prevent the introduction of errors when doing alignment editing. SARSE readily connects to external tools to provide a flexible semiautomatic editing environment.

Motivation: Single nucleotide polymorphism (SNP) analysis is an important means to study genetic variation. A fast and cost-efficient approach to identify large numbers of novel candidates is the SNP mining of large-scale sequencing projects. The increasing availability of sequence trace data in public repositories makes it feasible to evaluate SNP predictions on the DNA chromatogram level.

Background: Knowledge of the structure of gene expression is essential for mammalian transcriptomics research. We analyzed a collection of more than one million porcine expressed sequence tags (ESTs), of which two-thirds were generated in the Sino-Danish Pig Genome Project and one-third are from public databases. The Sino-Danish ESTs were generated from one normalized and 97 non-normalized cDNA libraries representing 35 different tissues and three developmental stages.

Motivation: An apparent paradox in computational RNA structure prediction is that many methods require, in advance, a multiple alignment of a set of related sequences when searching for a common structure between them. However, such a multiple alignment is hard to obtain, even for a few sequences with low sequence similarity, without simultaneously folding and aligning them. Furthermore, it is of interest to conduct a multiple alignment of RNA sequence candidates found from searching as few as two genomic sequences.

Human and mouse genome sequences contain roughly 100,000 regions that are unalignable in primary sequence and neighbor corresponding alignable regions between the two organisms. These pairs are generally assumed to be nonconserved, although the level of structural conservation between them has never been investigated. Owing to the limitations in computational methods, comparative genomics has been lacking the ability to compare such nonconserved sequence regions for conserved structural RNA elements.

Foldalign is a Sankoff-based algorithm for making structural alignments of RNA sequences. Here, we present a web server for making pairwise alignments between two RNA sequences, using the recently updated version of foldalign. The server can be used to scan two sequences for a common structural RNA motif of limited size, or the entire sequences can be aligned locally or globally.

Motivation: Searching for non-coding RNA (ncRNA) genes and structural RNA elements (eleRNA) is a major challenge in gene finding today, as these are often conserved in structure rather than in sequence. Even though the number of available methods is growing, it is still of interest to detect pairs of genes with low sequence similarity, where the genes are part of a larger genomic region.

Results: Here we present such an approach for pairwise local alignment which is based on foldalign and the Sankoff algorithm for simultaneous structural alignment of multiple sequences.
