Background: JBrowse is a fast and full-featured genome browser built with JavaScript and HTML5. It is easily embedded into websites or apps but can also be served as a standalone web page.
Results: Overall improvements to speed and scalability are accompanied by specific enhancements that support complex interactive queries on large track sets.
Web Apollo is the first instantaneous, collaborative genomic annotation editor available on the web. One of the natural consequences following from current advances in sequencing technology is that there are more and more researchers sequencing new genomes. These researchers require tools to describe the functional features of their newly sequenced genomes.
Background: Visualization software can expose previously undiscovered patterns in genomic data and advance biological science.
Results: The Genoviz Software Development Kit (SDK) is an open source, Java-based framework designed for rapid assembly of visualization software applications for genomics. The Genoviz SDK framework provides a mechanism for incorporating adaptive, dynamic zooming into applications, a desirable feature of genome viewers.
Experimental techniques that survey an entire genome demand flexible, highly interactive visualization tools that can display new data alongside foundation datasets, such as reference gene annotations. The Integrated Genome Browser (IGB) aims to meet this need. IGB is an open source, desktop graphical display tool implemented in Java that supports real-time zooming and panning through a genome; layout of genomic features and datasets in moveable, adjustable tiers; incremental or genome-scale data loading from remote web servers or local files; and dynamic manipulation of quantitative data via genome graphs.
Significant fractions of eukaryotic genomes give rise to RNA, much of which is unannotated and has reduced protein-coding potential. The genomic origins and the associations of human nuclear and cytosolic polyadenylated RNAs longer than 200 nucleotides (nt) and whole-cell RNAs less than 200 nt were investigated in this genome-wide study. Subcellular addresses for nucleotides present in detected RNAs were assigned, and their potential processing into short RNAs was investigated.
Recently, we mapped the sites of transcription across approximately 30% of the human genome and elucidated the structures of several hundred novel transcripts. In this report, we describe a novel combination of techniques including the rapid amplification of cDNA ends (RACE) and tiling array technologies that was used to further characterize transcripts in the human transcriptome. This technical approach allows for several important pieces of information to be gathered about each array-detected transcribed region, including strand of origin, start and termination positions, and the exonic structures of spliced and unspliced coding and noncoding RNAs.
In order to take full advantage of the newly available public human genome sequence data and associated annotations, biologists require visualization tools that can accommodate the high frequency of alternative splicing in human genes and other complexities. In this article, we describe techniques for presenting human genomic sequence data and annotations in an interactive, graphical format, with the aim of providing developers with a guide to what features are most likely to meet biologists' needs. These techniques include: one-dimensional semantic zooming to show sequence data alongside gene structures; moveable, adjustable tiers; visual encoding of translation frame to show how alternative transcript structure affects encoded proteins; and display of protein domains in the context of genomic sequence to show how alternative splicing impacts protein structure and function.
Proc IEEE Comput Soc Bioinform Conf
May 2005
Understanding the functional significance of alternative splicing and other mechanisms that generate RNA transcript diversity is an important challenge facing modern-day molecular biology. Using homology-based, protein sequence analysis methods, it should be possible to investigate how transcript diversity impacts protein structure and function. To test this, a data mining technique ("DiffHit") was developed to identify and catalog genes producing protein isoforms which exhibit distinct profiles of conserved protein motifs.
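The comparison described above can be illustrated with a minimal sketch. This is not the DiffHit implementation, which the abstract does not detail; it assumes motif hits per isoform have already been computed by some homology search, and the function name, input layout, and motif labels are all hypothetical.

```python
from collections import defaultdict


def genes_with_distinct_motif_profiles(isoform_motifs):
    """Find genes whose protein isoforms differ in conserved motif content.

    isoform_motifs: dict mapping (gene_id, isoform_id) -> iterable of motif IDs
                    (hypothetical input layout, assumed precomputed).
    Returns a dict mapping gene_id -> sorted list of motifs that occur in
    some, but not all, isoforms of that gene.
    """
    # Group the motif set of every isoform under its gene.
    profiles_by_gene = defaultdict(list)
    for (gene_id, _isoform_id), motifs in isoform_motifs.items():
        profiles_by_gene[gene_id].append(frozenset(motifs))

    variable_motifs = {}
    for gene_id, profiles in profiles_by_gene.items():
        if len(profiles) < 2:
            continue  # a single isoform cannot differ from itself
        shared = frozenset.intersection(*profiles)
        seen = frozenset.union(*profiles)
        differing = seen - shared
        if differing:
            variable_motifs[gene_id] = sorted(differing)
    return variable_motifs


# Hypothetical example data: gene G1 has one isoform lacking motifB.
example = {
    ("G1", "iso1"): ["motifA", "motifB"],
    ("G1", "iso2"): ["motifA"],
    ("G2", "iso1"): ["motifC"],
    ("G2", "iso2"): ["motifC"],
    ("G3", "iso1"): ["motifD"],
}
print(genes_with_distinct_motif_profiles(example))
```

Comparing motif sets rather than ordered hit lists keeps the sketch simple; a real analysis would likely also consider motif position and hit significance.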
Sites of transcription of polyadenylated and nonpolyadenylated RNAs for 10 human chromosomes were mapped at 5-base pair resolution in eight cell lines. Unannotated, nonpolyadenylated transcripts comprise the major proportion of the transcriptional output of the human genome. Of all transcribed sequences, 19.
Understanding how alternative splicing affects gene function is an important challenge facing modern-day molecular biology. Using homology-based, protein sequence analysis methods, it should be possible to investigate how transcript diversity impacts protein function. To test this, high-quality exon-intron structures were deduced for over 8000 human genes, including over 1300 (17 percent) that produce multiple transcript variants.
In this report, we have achieved a richer view of the transcriptome for Chromosomes 21 and 22 by using high-density oligonucleotide arrays on cytosolic poly(A)+ RNA. Conservatively, only 31.4% of the observed transcribed nucleotides correspond to well-annotated genes, whereas an additional 4.
Using high-density oligonucleotide arrays representing essentially all nonrepetitive sequences on human chromosomes 21 and 22, we map the binding sites in vivo for three DNA binding transcription factors, Sp1, cMyc, and p53, in an unbiased manner. This mapping reveals an unexpectedly large number of transcription factor binding site (TFBS) regions, with a minimal estimate of 12,000 for Sp1, 25,000 for cMyc, and 1600 for p53 when extrapolated to the full genome. Only 22% of these TFBS regions are located at the 5' termini of protein-coding genes while 36% lie within or immediately 3' to well-characterized genes and are significantly correlated with noncoding RNAs.
BMC Bioinformatics
July 2002
Background: In order to take full advantage of the newly available public human genome sequence data and associated annotations, biologists require visualization tools ("genome browsers") that can accommodate the high frequency of alternative splicing in human genes and other complexities.
Results: In this article, we describe visualization techniques for presenting human genomic sequence data and annotations in an interactive, graphical format. These techniques include: one-dimensional, semantic zooming to show sequence data alongside gene structures; color-coding exons to indicate frame of translation; adjustable, moveable tiers to permit easier inspection of a genomic scene; and display of protein annotations alongside gene structures to show how alternative splicing impacts protein structure and function.
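As an illustration of color-coding exons by frame of translation, the following is a minimal sketch under stated assumptions, not the method used in the article: exons are forward-strand, fully coding, given as half-open (start, end) genomic intervals in transcript order, and the three-color palette is arbitrary. The function name and data layout are hypothetical.

```python
# Arbitrary palette (an assumption): one color per genomic reading frame.
FRAME_COLORS = {0: "red", 1: "green", 2: "blue"}


def exon_frames(exons):
    """Assign a reading frame and color to each coding exon.

    exons: list of (start, end) half-open genomic intervals, forward strand,
           in transcript order, assumed to be entirely coding.
    Returns a list of (start, end, frame, color), where frame is the genomic
    position (mod 3) of the first complete codon that starts in the exon.
    """
    annotated = []
    coding_bases_so_far = 0
    for start, end in exons:
        # Bases at the start of this exon that finish a codon begun upstream.
        phase = (-coding_bases_so_far) % 3
        frame = (start + phase) % 3
        annotated.append((start, end, frame, FRAME_COLORS[frame]))
        coding_bases_so_far += end - start
    return annotated


# Two hypothetical transcripts sharing the second exon (20-30) but differing
# in the length of the first exon, so the shared exon is read in another frame.
print(exon_frames([(0, 10), (20, 30)]))
print(exon_frames([(0, 9), (20, 30)]))
```

Because the frame is computed in genomic coordinates, an exon shared by two transcripts gets the same color only when both transcripts translate it in the same frame, which makes frame-shifting splice variants visible at a glance.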