Publications by authors named "Pesole G"

GenoMiner is a software tool that searches for regions of similarity between user-submitted genome or transcript sequences and user-specified whole genome assemblies. The program then identifies conserved sequence tags (CSTs) in these homologous regions and provides a prediction of their coding or non-coding nature. The analysis is carried out through three steps: (1) definition of sequence regions homologous to the query sequence in the selected target genomes by a fast BLAT alignment; (2) identification of CSTs by a more sensitive BLAST-like alignment between the query and the homologous regions in the target genomes and (3) assessment of the coding or non-coding nature of detected CSTs through the computation of a suitable coding potential score.


Background: Currently available methods to predict splice sites are mainly based on the independent and progressive alignment of transcript data (mostly ESTs) to the genomic sequence. Apart from often being computationally expensive, this approach is vulnerable to several problems, hence the need to develop novel strategies.

Results: We propose a method, based on a novel multiple genome-EST alignment algorithm, for the detection of splice sites.


This study describes comprehensive polling of transcription start and termination sites and analysis of previously unidentified full-length complementary DNAs derived from the mouse genome. We identify the 5' and 3' boundaries of 181,047 transcripts with extensive variation in transcripts arising from alternative promoter usage, splicing, and polyadenylation. There are 16,247 new mouse protein-coding transcripts, including 5154 encoding previously unidentified proteins.


All metazoan nuclear genomes have undergone a continuous process of both complete and partial gain and loss of genetic material. The forces modulating these events are also subject to the strict interaction between the nuclear and mitochondrial (mt) genomes. In this context we investigate the evolution of nuclear genes encoding proteins that target the mitochondrion, with particular attention to genes involved in oxidative phosphorylation (OXPHOS), one of the most ancient and conserved functions.


In this work we describe the process that, starting with the production of human full-length-enriched cDNA libraries using the CAP-Trapper method, led us to the discovery of 342 putative new human genes. Twenty-three thousand full-length-enriched clones, obtained from various cell lines and tissues in different developmental stages, were 5'-end sequenced, allowing the identification of a pool of 5300 unique cDNAs. By comparing these sequences to various human and vertebrate nucleotide databases we found that about 40% of our clones extended previously annotated 5' ends, 662 clones were likely to represent splice variants of known genes, and finally 342 clones remained unknown, with no or poor functional annotation.


Comparative analysis of the ribosomal protein S3 gene (rps3) in the mitochondrial genome of Cycas with newly sequenced counterparts from Magnolia and Helianthus and available sequences from higher plants revealed that the positional clustering with the genes for ribosomal protein S19 (rps19) and L16 (rpl16) is preserved in gymnosperms. However, in contrast to the other land plant species, the rps3 gene in Cycas mitochondria is unique in possessing a second intron: rps3i2. Reverse transcription-polymerase chain reaction (RT-PCR) analysis of the transcripts generated from the rps19-rps3-rpl16 cluster in Cycas mitochondria demonstrated that the genes are cotranscribed and extensively modified by RNA editing and that both introns are efficiently spliced.


The control of translation is a fundamental mechanism in the regulation of gene expression. Among the cis-acting elements that play a role in translation regulation are upstream open reading frames (uORFs) and upstream AUGs (uAUGs) located in the 5'UTR of mRNAs. We present here a genome-wide analysis of uAUGs and uORFs in a curated set of human and rodent mRNAs.
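To make the two element types concrete, here is a minimal Python sketch of how uAUGs and uORFs might be located in a 5'UTR. It is not the method used in the study: the function names are hypothetical, and the sketch assumes a uORF is a uAUG followed by an in-frame stop codon that lies entirely within the 5'UTR, which is only one of several possible definitions.

```python
# Illustrative sketch only; names and the uORF definition are assumptions.
STOP_CODONS = {"UAA", "UAG", "UGA"}


def normalize(seq: str) -> str:
    """Upper-case the sequence and convert DNA (T) to RNA (U)."""
    return seq.upper().replace("T", "U")


def find_uaugs(utr5: str) -> list[int]:
    """Return the 0-based start position of every AUG in the 5'UTR."""
    seq = normalize(utr5)
    return [i for i in range(len(seq) - 2) if seq[i:i + 3] == "AUG"]


def find_uorfs(utr5: str) -> list[tuple[int, int]]:
    """Return (start, end) pairs for uAUGs with an in-frame stop in the 5'UTR.

    `end` is exclusive and includes the stop codon. uAUGs without an
    in-frame stop inside the 5'UTR are not reported (assumed definition).
    """
    seq = normalize(utr5)
    uorfs = []
    for start in find_uaugs(seq):
        # Walk codon by codon downstream of the AUG.
        for j in range(start + 3, len(seq) - 2, 3):
            if seq[j:j + 3] in STOP_CODONS:
                uorfs.append((start, j + 3))
                break
    return uorfs
```

For example, in the toy 5'UTR `GGAUGAAAUAGCCAUGCC` both AUGs count as uAUGs, but only the first has an in-frame stop codon within the UTR and so forms a uORF under this definition.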


In metazoan organisms, energy production is the only example of a process that is under dual genetic control: nuclear and mitochondrial. We used a genomic approach to examine how energy genes of both the nuclear and mitochondrial genomes are coordinated, and discovered a novel genetic regulatory circuit in Drosophila melanogaster that is surprisingly simple and parsimonious. This circuit is based on a single DNA regulatory element and can explain both intra- and inter-genomic coordinated expression of genes involved in energy production, including the full complement of mitochondrial and nuclear oxidative phosphorylation genes, and the genes involved in the Krebs cycle.


The prediction of regulatory elements is a problem where computational methods offer great hope. Over the past few years, numerous tools have become available for this task. The purpose of the current assessment is twofold: to provide some guidance to users regarding the accuracy of currently available tools in various settings, and to provide a benchmark of data sets for assessing future tools.


The identification and study of evolutionarily conserved genomic sequences that surround disease-related genes is a valuable tool to gain insight into the functional role of these genes and to better elucidate the pathogenetic mechanisms of disease. We created the DG-CST (Disease Gene Conserved Sequence Tags) database for the identification and detailed annotation of human-mouse conserved genomic sequences that are localized within or in the vicinity of human disease-related genes. CSTs are defined as sequences that show at least 70% identity between human and mouse over a length of at least 100 bp.
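The CST thresholds stated above (at least 70% identity over at least 100 bp) can be illustrated with a short Python sketch. This is not the DG-CST implementation: the function names are hypothetical, and the sketch assumes that identity is computed over all columns of a gapped pairwise alignment and that length means the number of alignment columns.

```python
# Illustrative sketch only; how identity and length are measured is assumed.
def percent_identity(aln_a: str, aln_b: str) -> float:
    """Percent of alignment columns where both rows carry the same base.

    Both strings are rows of one pairwise alignment, so they must have
    equal length; '-' marks a gap and never counts as a match.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    if not aln_a:
        return 0.0
    matches = sum(
        1 for a, b in zip(aln_a.upper(), aln_b.upper()) if a == b and a != "-"
    )
    return 100.0 * matches / len(aln_a)


def is_cst(aln_a: str, aln_b: str,
           min_identity: float = 70.0, min_length: int = 100) -> bool:
    """Apply the two thresholds from the text to one aligned region."""
    return (len(aln_a) >= min_length
            and percent_identity(aln_a, aln_b) >= min_identity)
```

With these assumptions, a 120-column human-mouse alignment with 90 identical columns (75% identity) qualifies as a CST, whereas a perfectly identical 80-column alignment does not because it is too short.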


The 5' and 3' untranslated regions of eukaryotic mRNAs play crucial roles in the post-transcriptional regulation of gene expression through the modulation of nucleo-cytoplasmic mRNA transport, translation efficiency, subcellular localization and message stability. UTRdb is a curated database of 5' and 3' untranslated sequences of eukaryotic mRNAs, derived from several sources of primary data. Experimentally validated functional motifs are annotated (and also collated as the UTRsite database) and cross-links to genomic and protein data are provided.


Understanding the complex mechanisms governing basic biological processes requires the characterisation of regulatory motifs modulating gene expression at transcriptional and post-transcriptional level. In particular, extent, chronology and cell-specificity of transcription are modulated by the interaction of transcription factors with their corresponding binding sites, mostly located near (or sometimes quite far away from) the transcription start site of the gene. The constantly growing amount of genomic data, complemented by other sources of information such as expression data derived from microarray experiments, has opened new opportunities to researchers in this field.


Summary: DNAfan (DNA Feature ANalyzer) is a tool combining sequence-filtering and pattern searching. DNAfan automatically extracts user-defined sets of sequence fragments from large sequence sets. Fragments are defined by annotated gene feature keys and co- or non-occurring patterns within the feature or close to it.


The identification and characterization of genome tracts that are highly conserved across species during evolution may contribute significantly to the functional annotation of whole-genome sequences. Indeed, such sequences are likely to correspond to known or unknown coding exons or regulatory motifs. Here, we present a web server implementing a previously developed algorithm that, by comparing user-submitted genome sequences, is able to identify statistically significant conserved blocks and assess their coding or noncoding nature through the measure of a coding potential score.


One of the greatest challenges that modern molecular biology is facing is the understanding of the complex mechanisms regulating gene expression. A fundamental step in this process requires the characterization of regulatory motifs playing key roles in the regulation of gene expression at transcriptional and post-transcriptional levels. In particular, transcription is modulated by the interaction of transcription factors with their corresponding binding sites.


The recent interest sparked by the discovery of a variety of functions for non-coding RNA molecules has highlighted the need for suitable tools for the analysis and comparison of RNA sequences. Many trans-acting non-coding RNA genes and cis-acting RNA regulatory elements contain motifs, conserved in both structure and sequence, that can hardly be detected by primary sequence analysis alone. We present an algorithm that takes as input a set of unaligned RNA sequences expected to share a common motif, and outputs the regions that are most conserved throughout the sequences, according to a similarity measure that takes into account both the sequence of the regions and the secondary structure they can form according to base-pairing and thermodynamic rules.


Phylogenetic analysis of molecular sequence data plays an increasingly important role in clinical medicine, both in the emerging field of molecular epidemiology and in the rational design of new therapeutic agents. The aims of this review are to introduce some of the methods used to construct phylogenetic trees, to illustrate some of the pitfalls that can introduce artifactual results and to speculate on the long-term importance of this area of computational biology in clinical medicine.


Many eukaryotic transcripts have been reported to contain sequences similar or complementary to ribosomal RNAs. Past computational and experimental analyses have suggested a possible functional role of such rRNA-like sequences in the regulation of translation. Here, we analysed the occurrence and abundance of rRNA-like sequences in human transcripts.


The complete mitochondrial genome (mtDNA) of the model organism Ciona intestinalis (Urochordata, Ascidiacea) has been amplified by long-PCR using specific primers designed on putative mitochondrial transcripts identified from publicly available mitochondrial-like expressed sequence tags. The C. intestinalis mtDNA encodes 39 genes: 2 rRNAs, 13 subunits of the respiratory complexes, including ATPase subunit 8 (atp8), and 24 tRNAs, including 2 tRNA-Met with anticodons 5'-UAU-3' and 5'-CAU-3', respectively.


Molecular phylogenies based on the molecular clock require the comparison of orthologous genes. Orthologous and paralogous genes usually have very different evolutionary fates. In general, orthologs keep the same functions in species, whereas, particularly over a long time span, paralogs diverge functionally and may become pseudogenes or get lost.


GeneSyn is a software tool that allows automatic detection of conserved gene order from annotated genomes.

Availability: Available free of charge for Unix/Linux/Cygwin platforms at ftp://159.149.


WebVar is an online resource that provides estimates of relative site variability from multiple alignments of homologous protein or nucleic acid sequences. WebVar provides a variety of graphic and textual representations of estimates, designed to assist in phylogenetic analysis.

Availability: The WebVar server is located at http://www.


Analyses of mitochondrial and nuclear gene sequences have often produced different mammalian tree topologies, undermining confidence in the merit of molecular approaches with respect to "traditional" morphological classification. The recent sequencing of the complete mitochondrial genomes of two additional rodents (Spalax judaei and Jaculus jaculus) and one lagomorph (Ochotona princeps) has prompted us to reinvestigate the issue. Using Bayesian phylogenetics, we found phylogenetic relationships between mammalian species highly congruent with previous results based on nuclear genes.


Upon cold shock Escherichia coli transiently stops growing and adapts to the new temperature (acclimatization phase). The major physiological effects of cold temperature are a decrease in membrane fluidity and the stabilization of secondary structures of RNA and DNA, which may affect the efficiencies of translation, transcription, and replication. Specific proteins are transiently induced in the acclimatization phase.


Mitochondrial transcripts of two ascidian species were reconstructed through sequence assembly of publicly available ESTs resembling mitochondrial DNA sequences (mt-ESTs). This strategy allowed us to analyze processing and mapping of the mitochondrial transcripts and to investigate the gene organization of a previously uncharacterized mitochondrial genome (mtDNA). This new strategy would greatly facilitate the sequencing and annotation of mtDNAs.
