RNA polymerase (Pol) III synthesizes the tRNAs, the 5S ribosomal RNA and a small number of untranslated RNAs. In vitro, it also transcribes short interspersed nuclear elements (SINEs). We investigated the distribution of Pol III and its associated transcription factors on the genome of mouse embryonic stem cells using a highly specific tandem ChIP-Seq method.
We have studied a genome-wide set of single-nucleotide polymorphism (SNP) allele frequency measures for African-American, East Asian, and European-American samples. For this analysis we derived a simple, closed mathematical formulation for the spectrum of expected allele frequencies when the sampled populations have experienced nonstationary demographic histories. The direct calculation generates the spectrum orders of magnitude faster than coalescent simulations do and allows us to generate spectra for a large number of alternative histories on a multidimensional parameter grid.
Proc Natl Acad Sci U S A
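For orientation, the stationary baseline that such spectra are usually compared against can be computed directly. Under the standard neutral model with a constant population size, the expected number of sites at which the derived allele appears i times in a sample of n chromosomes is proportional to 1/i. The sketch below returns the normalized proportions. It is the textbook constant-size result, not the closed formulation for nonstationary histories that the abstract describes, and the function name is illustrative only.

```python
from fractions import Fraction


def neutral_sfs_proportions(n: int) -> list[Fraction]:
    """Expected share of segregating sites in each derived-allele count class.

    Assumes the standard neutral model with constant population size, where
    the expected count for class i (i = 1 .. n-1) is proportional to 1/i.
    This is a baseline only, not the nonstationary formulation of the paper.
    """
    if n < 2:
        raise ValueError("need at least two sampled chromosomes")
    weights = [Fraction(1, i) for i in range(1, n)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Sample of 4 chromosomes: classes i = 1, 2, 3
    for i, share in enumerate(neutral_sfs_proportions(4), start=1):
        print(i, share)
```

An observed spectrum with a deficit or excess of rare alleles relative to these proportions is the kind of signal that a nonstationary demographic model is meant to explain.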
January 2003
Single-nucleotide polymorphisms (SNPs) constitute the great majority of variations in the human genome, and as heritable variable landmarks they are useful markers for disease mapping and resolving population structure. Redundant coverage in overlaps of large-insert genomic clones, sequenced as part of the Human Genome Project, comprises a quarter of the genome, and it is representative in terms of base compositional and functional sequence features. We mined these regions to produce 500,000 high-confidence SNP candidates as a uniform resource for describing nucleotide diversity and its regional variation within the genome.
A computational method was developed for delineating connected gene neighborhoods in bacterial and archaeal genomes. These gene neighborhoods are not typically present, in their entirety, in any single genome, but are held together by overlapping, partially conserved gene arrays. The procedure was applied to comparing the orders of orthologous genes, which were extracted from the database of Clusters of Orthologous Groups of proteins (COGs), in 31 prokaryotic genomes and resulted in the identification of 188 clusters of gene arrays, which included 1001 of 2890 COGs.
An artificial neural network (ANN) solution is described for the recognition of domains in protein sequences. A query sequence is first compared to a reference database of domain sequences, and the output data, encoded in the form of six parameters, are forwarded to feed-forward artificial neural networks with six input and six hidden units and a sigmoidal transfer function. The recognition is based on the distribution of scores precomputed for the known domain groups in a database versus database comparison.
Bioinformatics
December 2000
A simple heuristic scoring method is described for assigning sequences to known domain types based on BLAST search outputs. The scoring is based on the score distribution of the known domain groups determined from a database versus database comparison and is directly applicable to BLAST output processing.
Nucleic Acids Res
January 2001
SBASE 8.0 is the eighth release of the SBASE library of protein domain sequences that contains 294 898 annotated structural, functional, ligand-binding and topogenic segments of proteins, cross-referenced to most major sequence databases and sequence pattern collections. The entries are clustered into over 2005 statistically validated domain groups (SBASE-A) and 595 non-validated groups (SBASE-B), provided with several WWW-based search and browsing facilities for online use.
SBASE 7.0 is the seventh release of the SBASE library of protein domain sequences that contains 237 937 annotated structural, functional, ligand-binding and topogenic segments of proteins, cross-referenced to all major sequence databases and sequence pattern collections. The entries are clustered into over 1811 groups and are provided with two WWW-based search facilities for online use.
Results: A WWW server for protein domain homology prediction, based on BLAST search and a simple data-mining algorithm (Hegyi,H. and Pongor,S. (1993) Comput.
The sixth release of the SBASE library of protein domain sequences contains 130 703 annotated and cross-referenced entries corresponding to structural, functional, ligand-binding and topogenic segments of proteins. The entries were grouped based on standard names (2312 groups) and further classified on the basis of BLAST similarity (2463 clusters). Automated searching with BLAST and a new sequence-plot representation of local domain similarities are available at the WWW-server http://www.
SBASE 5.0 is the fifth release of SBASE, a collection of annotated protein domain sequences that represent various structural, functional, ligand-binding and topogenic segments of proteins. SBASE was designed to facilitate the detection of functional homologies and can be searched with standard database-search programs.
SBASE 4.0 is the fourth release of SBASE, a collection of annotated protein domain sequences that represent various structural, functional, ligand-binding and topogenic segments of proteins. SBASE was designed to facilitate the detection of functional homologies and can be searched with standard database search tools, such as FASTA and BLAST3.
Nucleic Acids Res
September 1994
SBASE 3.0 is the third release of SBASE, a collection of annotated protein domain sequences. SBASE entries represent various structural, functional, ligand-binding and topogenic segments of proteins as defined by their publishing authors.
Recent studies indicate that alphaherpesviruses express latency associated transcripts (LATs) from the antisense strand of immediate early (IE) genes of the viral genome. It has been discussed that LATs containing extended open reading frames (ORFs) might be translated into protein products. We found that a salient feature of some herpesvirus DNAs is a high GC preference at the third codon positions.
Recent evidence indicates that alphaherpesviruses express latency associated transcripts (LATs) from the antisense strand of immediate-early (IE) genes of the viral genome. It has been suggested that LATs containing extended open reading frames (ORFs) might be translated into one or more protein products. We found that a salient feature of some herpesvirus DNAs is a high GC preference at the third codon positions.
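The GC preference at third codon positions mentioned in the two abstracts above is commonly summarized as a GC3 fraction: the share of codons in a reading frame whose third base is G or C. The sketch below is a minimal illustration of that measure, assuming an in-frame DNA coding sequence given as a plain string. The function name is hypothetical, and this is not necessarily the procedure the authors used.

```python
def gc3_fraction(cds: str) -> float:
    """Fraction of complete codons whose third base is G or C.

    Assumes `cds` is an in-frame DNA sequence; trailing bases that do not
    form a complete codon are ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3      # drop any incomplete trailing codon
    third_bases = seq[2:usable:3]         # bases at codon positions 3, 6, 9, ...
    if not third_bases:
        raise ValueError("sequence contains no complete codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)


if __name__ == "__main__":
    # Codons ATG, GCC, AAC end in G, C, C
    print(gc3_fraction("ATGGCCAAC"))
```

Computing the same fraction for each of the three frames, on either strand, is one simple way to see whether a strong third-position GC bias is confined to the annotated frame.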
The amino acid sequence of the 27 kDa protein responsible for the haemolytic activity of Bacillus thuringiensis subsp. israelensis toxin has been analysed by secondary structure prediction, helical wheel/net diagrams and molecular mechanics calculations. We found that segment 116-126 presumably forms a strongly amphiphilic alpha-helix.