Publications by authors named "Flicek P"

The molecular events that contribute to, and result from, the in vivo binding of transcription factors to their cognate DNA sequence motifs in mammalian genomes are poorly understood. We demonstrate that variations within the DNA sequence motifs that bind the transcriptional repressor REST (NRSF) encode in vivo DNA binding affinity hierarchies that contribute to regulatory function during lineage-specific and developmental programs in fundamental ways. First, canonical sequence motifs for REST facilitate strong REST binding and control functional classes of REST targets that are common to all cell types, whilst atypical motifs participate in weak interactions and control those targets that are cell- or tissue-specific.

The need for speed.

Genome Biol

July 2009

DNA sequence data are being produced at an ever-increasing rate. The Bowtie sequence-alignment algorithm uses advanced data structures to help data analysis keep pace with data generation.

The remarkable progress in characterizing the human genome sequence, exemplified by the Human Genome Project and the HapMap Consortium, has led to the perception that knowledge and the tools (e.g., microarrays) are sufficient for many if not most biomedical research efforts.

Natural antisense transcripts (NATs) are important regulators of gene expression. Recently, a link between antisense transcription and the formation of endo-siRNAs has emerged. We investigated the bi-directionally transcribed Na/phosphate cotransporter gene (Slc34a1) with respect to endo-siRNA processing.

Using chromatin immunoprecipitation combined with genomic microarrays, we have identified targets of No tail (Ntl), a zebrafish Brachyury ortholog that plays a central role in mesoderm formation. We show that Ntl regulates a downstream network of other transcription factors and identify an in vivo Ntl binding site that resembles the consensus T-box binding site (TBS) previously identified by in vitro studies. We show that the notochord-expressed gene floating head (flh) is a direct transcriptional target of Ntl and that a combination of TBSs in the flh upstream region is required for Ntl-directed expression.

Background: While the C. elegans genome is extensively annotated, relatively little information is available for other Caenorhabditis species. The nematode genome annotation assessment project (nGASP) was launched to objectively assess the accuracy of protein-coding gene prediction software in C.

The Ensembl project (http://www.ensembl.org) is a comprehensive genome information system featuring an integrated set of genome annotation, databases, and other information for chordate, selected model organism and disease vector genomes.

An ambitious plan to collect, curate, and make accessible information on genetic variations affecting human health is beginning to be realized.

Recently attention has been turned to the problem of reconstructing complete ancestral sequences from large multiple alignments. Successful generation of these genome-wide reconstructions will facilitate a greater knowledge of the events that have driven evolution. We present a new evolutionary alignment modeler, called "Ortheus," for inferring the evolutionary history of a multiple alignment, in terms of both substitutions and, importantly, insertions and deletions.

DNA methylation is an indispensable epigenetic modification required for regulating the expression of mammalian genomes. Immunoprecipitation-based methods for DNA methylome analysis are rapidly shifting the bottleneck in this field from data generation to data analysis, necessitating the development of better analytical tools. In particular, an inability to estimate absolute methylation levels remains a major analytical difficulty associated with immunoprecipitation-based DNA methylation profiling.

We report a novel resource (methylation profiles of DNA, or mPod) for human genome-wide tissue-specific DNA methylation profiles. mPod consists of three fully integrated parts: genome-wide DNA methylation reference profiles of 13 normal somatic tissues, placenta, sperm, and an immortalized cell line; a visualization tool that has been integrated with the Ensembl genome browser; and a new algorithm for the analysis of immunoprecipitation-based DNA methylation profiles. We demonstrate the utility of our resource by identifying the first comprehensive genome-wide set of tissue-specific differentially methylated regions (tDMRs) that may play a role in cellular identity and the regulation of tissue-specific genome function.

Article Synopsis
  • The draft genome sequence of the platypus reveals its unique blend of reptilian and mammalian traits, such as fur adapted for swimming and females that lay eggs yet also lactate.
  • The study shows how specific venom proteins in platypuses have evolved independently of those in reptiles, along with conserved milk protein genes and notable expansions in immune-related gene families.
  • This genome sequencing serves as an important resource for understanding mammalian evolution, monotreme biology, and conservation efforts.

The laboratory rat is one of the most extensively studied model organisms. Inbred laboratory rat strains originated from limited Rattus norvegicus founder populations, and the inherited genetic variation provides an excellent resource for the correlation of genotype to phenotype. Here, we report a survey of genetic variation based on almost 3 million newly identified SNPs.

The most widely used method for detecting genome-wide protein-DNA interactions is chromatin immunoprecipitation on tiling microarrays, commonly known as ChIP-chip. Here, we conducted the first objective analysis of tiling array platforms, amplification procedures, and signal detection algorithms in a simulated ChIP-chip experiment. Mixtures of human genomic DNA and "spike-ins" comprised of nearly 100 human sequences at various concentrations were hybridized to four tiling array platforms by eight independent groups.

CONTRAST, a new gene-prediction algorithm that uses sophisticated machine-learning techniques, has pushed de novo prediction accuracy to new heights, and has significantly closed the gap between de novo and evidence-based methods for human genome annotation.

The Ensembl project (http://www.ensembl.org) is a comprehensive genome information system featuring an integrated set of genome annotation, databases and other information for chordate and selected model organism and disease vector genomes.

Genetic variation influences gene expression, and this variation in gene expression can be efficiently mapped to specific genomic regions and variants. Here we have used gene expression profiling of Epstein-Barr virus-transformed lymphoblastoid cell lines of all 270 individuals genotyped in the HapMap Consortium to elucidate the detailed features of genetic variation underlying gene expression variation. We find that gene expression is heritable and that differentiation between populations is in agreement with earlier small-scale studies.

Motivation: Recent advances in microarray technologies have made it feasible to interrogate whole genomes with tiling arrays and this technique is rapidly becoming one of the most important high-throughput functional genomics assays. For large mammalian genomes, analyzing oligonucleotide tiling array data is complicated by the presence of non-unique sequences on the array, which increases the overall noise in the data and may lead to false positive results due to cross-hybridization. The ability to create custom microarrays using maskless array synthesis has led us to consider ways to optimize array design characteristics for improving data quality and analysis.

Background: The function and significance of the widespread expression of natural antisense transcripts (NATs) is largely unknown. The ability to quantitatively assess changes in NAT expression for many different transcripts in multiple samples would facilitate our understanding of this relatively new class of RNA molecules.

Results: Here, we demonstrate that standard expression analysis Affymetrix MOE430 and HG-U133 GeneChips contain hundreds of probe sets that detect NATs.

Article Synopsis
  • The study reports on experiments analyzing a targeted 1% of the human genome during the ENCODE Project's pilot phase, providing crucial insights into human genome function.
  • Findings reveal that the human genome is largely transcribed, with evidence showing that most genomic bases contribute to various types of transcripts, including those that do not code for proteins.
  • Enhanced understanding of transcription regulation, chromatin structure, and evolutionary insights from comparisons between species help define the functional landscape of the human genome, guiding future research in genome characterization.

We generated high-resolution maps of histone H3 lysine 9/14 acetylation (H3ac), histone H4 lysine 5/8/12/16 acetylation (H4ac), and histone H3 at lysine 4 mono-, di-, and trimethylation (H3K4me1, H3K4me2, H3K4me3, respectively) across the ENCODE regions. Studying each modification in five human cell lines including the ENCODE Consortium common cell lines GM06990 (lymphoblastoid) and HeLa-S3, as well as K562, HFL-1, and MOLT4, we identified clear patterns of histone modification profiles with respect to genomic features. H3K4me3, H3K4me2, and H3ac modifications are tightly associated with the transcriptional start sites (TSSs) of genes, while H3K4me1 and H4ac have more widespread distributions.

Lists of variations in genomic DNA and their effects have been kept for some time and have been used in diagnostics and research. Although these lists have been carefully gathered and curated, there has been little standardization and coordination, complicating their use. Given the myriad possible variations in the estimated 24,000 genes in the human genome, it would be useful to have standard criteria for databases of variation.

The Ensembl (http://www.ensembl.org/) project provides a comprehensive and integrated source of annotation of chordate genome sequences.

Background: As part of the ENCODE Genome Annotation Assessment Project (EGASP), we developed the MARS extension to the Twinscan algorithm. MARS is designed to find human alternatively spliced transcripts that are conserved in only one or a limited number of extant species. MARS is able to use an arbitrary number of informant sequences and predicts a number of alternative transcripts at each gene locus.

Background: We present the results of EGASP, a community experiment to assess the state-of-the-art in genome annotation within the ENCODE regions, which span 1% of the human genome sequence. The experiment had two major goals: the assessment of the accuracy of computational methods to predict protein coding genes; and the overall assessment of the completeness of the current human genome annotations as represented in the ENCODE regions. For the computational prediction assessment, eighteen groups contributed gene predictions.
