Despite ever-increasing accumulation of genomic data, the fundamental question of how individual genes are switched on during development, lineage-specification and differentiation is not fully answered. It is widely accepted that this involves the interaction between at least three fundamental regulatory elements: enhancers, promoters and insulators. Enhancers contain transcription factor binding sites which are bound by transcription factors (TFs) and co-factors expressed during cell fate decisions and maintain imposed patterns of activation, at least in part, via their epigenetic modification.
In higher eukaryotes, many genes are regulated by enhancers that are 10^4-10^6 base pairs (bp) away from the promoter. Enhancers contain transcription-factor-binding sites (which are typically around 7-22 bp), and physical contact between the promoters and enhancers is thought to be required to modulate gene expression. Although chromatin architecture has been mapped extensively at resolutions of 1 kilobase and above, it has not been possible to define physical contacts at the scale of the proteins that determine gene expression.
β-Thalassaemia is one of the most common monogenic diseases with no effective cure in the majority of patients. Unbalanced production of α-globin in the presence of defective synthesis of β-globin is the primary mechanism for anaemia in β-thalassaemia. Clinical genetic data accumulated over three decades have clearly demonstrated that direct suppression of α-globin and induction of γ-globin are effective in reducing the globin chain imbalance in erythroid cells hence improving the clinical outcome of patients with β-thalassaemia.
In the era of genome-wide association studies (GWAS) and personalized medicine, predicting the impact of single nucleotide polymorphisms (SNPs) in regulatory elements is an important goal. Current approaches to determine the potential of regulatory SNPs depend on inadequate knowledge of cell-specific DNA binding motifs. Here, we present Sasquatch, a new computational approach that uses DNase footprint data to estimate and visualize the effects of noncoding variants on transcription factor binding.
Accurate diagnosis of rare inherited anaemias is challenging, requiring a series of complex and expensive laboratory tests. Targeted next-generation-sequencing (NGS) has been used to investigate these disorders, but the selection of genes on individual panels has been narrow and the validation strategies used have fallen short of the standards required for clinical use. Clinical-grade validation of negative results requires the test to distinguish between lack of adequate sequencing reads at the locations of known mutations and a real absence of mutations.
To assess factors influencing the success of whole-genome sequencing for mainstream clinical diagnosis, we sequenced 217 individuals from 156 independent cases or families across a broad spectrum of disorders in whom previous screening had identified no pathogenic variants. We quantified the number of candidate variants identified using different strategies for variant calling, filtering, annotation and prioritization. We found that jointly calling variants across samples, filtering against both local and external databases, deploying multiple annotation tools and using familial transmission above biological plausibility contributed to accuracy.
Our understanding of biological processes in humans is often based on examination of analogous processes in other organisms. The nematode worm Caenorhabditis elegans has been a particularly valuable model, leading to Nobel prize winning discoveries in development and genetics. Until recently, however, the worm has not been widely used as a model to study transcription due to the lack of a comprehensive catalogue of its RNA transcripts.
Background: Mammalian transcriptomes contain thousands of long noncoding RNAs (lncRNAs). Some lncRNAs originate from intragenic enhancers which, when active, behave as alternative promoters producing transcripts that are processed using the canonical signals of their host gene. We have followed up this observation by analyzing intergenic lncRNAs to determine the extent to which they might also originate from intergenic enhancers.
ATRX syndrome is characterized by X-linked mental retardation associated with alpha-thalassemia. The gene mutated in this disease, ATRX, encodes a plant homeodomain-like finger and a SWI2/SNF2-like ATPase motif, both of which are often found in chromatin-remodeling enzymes, but ATRX has not been characterized biochemically. By immunoprecipitation from HeLa extract, we found that ATRX is in a complex with transcription cofactor Daxx.
Hemoglobin switching is a complex process by which distinct globin chains are produced during stages of development. In an effort to characterize the process of hemoglobin switching in the zebrafish model system, we have isolated and characterized several embryonic globin genes. The embryonic and adult globin genes are found in clusters in a head-to-head configuration.