Nucleic Acids Res
January 2001
High throughput genome (HTG) and expressed sequence tag (EST) sequences are currently the most abundant nucleotide sequence classes in the public database. The large volume, high degree of fragmentation and lack of gene structure annotations prevent efficient and effective searches of HTG and EST data for protein sequence homologies by standard search methods. Here, we briefly describe three newly developed resources that should make discovery of interesting genes in these sequence classes easier in the future, especially for biologists who do not have access to a powerful local bioinformatics environment.
Signature databases are vital tools for identifying distant relationships in novel sequences and hence for inferring protein function. InterPro is an integrated documentation resource for protein families, domains and functional sites, which amalgamates the efforts of the PROSITE, PRINTS, Pfam and ProDom database projects. Each InterPro entry includes a functional description, annotation, literature references and links back to the relevant member database(s).
The development of T cell effector and memory responses against foreign antigens (Ags) involves the activation, differentiation and proliferation of naive T cells expressing distinct Ag-specific TCRs. Understanding the complexity of Ag-selected TCR repertoires in individual responders in terms of the sequences selected and their relative frequencies may provide indications about how a repertoire is established and suggest ways to influence the outcome of an immune response. Most methods of repertoire analysis are unsuitable for calculating the relative in vivo frequencies of Ag-specific clones (expressing distinct TCRs) selected during an immune response, whereas sequence data obtained by single-cell PCR analysis directly reflect cell frequencies if a sufficiently large number of cells is sampled.
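The idea that single-cell sequence data directly reflect clone frequencies can be illustrated with a minimal sketch. It assumes one TCR sequence per sampled cell; the function name `clone_frequencies` and the clone labels are hypothetical placeholders, not data or code from the study.

```python
from collections import Counter


def clone_frequencies(cell_sequences):
    """Return {sequence: fraction of sampled cells}, most frequent clone first."""
    counts = Counter(cell_sequences)
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() orders clones by descending count; dicts keep that order.
    return {seq: n / total for seq, n in counts.most_common()}


# Hypothetical input: one TCR label per single sorted cell (illustrative only).
cells = ["TCR_A", "TCR_A", "TCR_B", "TCR_A"]
freqs = clone_frequencies(cells)
```

Such estimates are only as good as the sample size, which is why the abstract stresses that a sufficiently large number of cells must be analysed.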
The Eukaryotic Promoter Database (EPD) is an annotated non-redundant collection of eukaryotic POL II promoters for which the transcription start site has been determined experimentally. Access to promoter sequences is provided by pointers to positions in nucleotide sequence entries. The annotation part of an entry includes a description of the initiation site mapping data, exhaustive cross-references to the EMBL nucleotide sequence database, SWISS-PROT, TRANSFAC and other databases, as well as bibliographic references.
There has been steady progress in the computational analysis of transcription control regions, but current methods of predicting the gene regulatory features of noncoding sequences are still not accurate enough to be useful in automatic genome annotation. Therefore, detailed information on the expression patterns of newly sequenced genes is more likely to come from microarray-based high-throughput mRNA quantitation technologies, which have made revolutionary progress over the past few years and are now ready for genome-wide application. Future solutions to the regulatory element prediction problem may be found by the combined analysis of genome sequence and expression data.
Excess secretion of growth hormone is a rare diagnosis in children or adolescents with tall stature. An oral glucose tolerance test (OGT) with determination of growth hormone is generally recommended to exclude this disorder. In order to test the validity of this approach in pediatric subjects, OGT tests were performed in 126 tall subjects (age: 12.
Trends Biochem Sci
November 1998
The Eukaryotic Promoter Database (EPD) is an annotated non-redundant collection of eukaryotic POL II promoters for which the transcription start site has been determined experimentally. Access to promoter sequences is provided by pointers to positions in nucleotide sequence entries. The annotation part of an entry includes a description of the initiation site mapping data, cross-references to other databases, and bibliographic references.
The PROSITE database (http://www.expasy.ch/sprot/prosite.
Mammalian Cdc25 phosphatase is responsible for the dephosphorylation of Cdc2 and other cyclin-dependent kinases at Thr14 and Tyr15, thus activating the kinase and allowing cell cycle progression. The catalytic domain of this dual-specificity phosphatase has recently been mapped to the 180 most C-terminal amino acids. Apart from a CX3R motif, which is present at the active site of all known tyrosine phosphatases, Cdc25 does not share any obvious sequence similarity with any of those enzymes.
Members of the discoidin (DS) domain family, which includes the C1 and C2 repeats of blood coagulation factors V and VIII, occur in a great variety of eukaryotic proteins, most of which have been implicated in cell-adhesion or developmental processes. So far, no three-dimensional structure of a known example of this extracellular module has been determined, limiting the usefulness of identifying a new sequence as a member of this family. Here, we present results of a recent search of the protein sequence database for new DS domains using generalized profiles, a sensitive multiple alignment-based search technique.
Aphasia, an acquired neurogenic disorder of speech, is not only a speech disorder but also implies a restriction in communicative independence. In line with the WHO's principle of attending to the consequences of disease as well, speech rehabilitation has to deal not only with aphasia in terms of speech disorder (i.e.
The Eukaryotic Promoter Database (EPD) is an annotated non-redundant collection of experimentally characterised eukaryotic POL II promoters. The underlying definition of a promoter is that of a transcription initiation site. All information presented in EPD results from an independent evaluation of primary experimental data shown in the biological literature.
IL-2 stimulates expression of the alpha subunit of the high affinity IL-2 receptor (IL-2R alpha) in antigen-activated T lymphocytes, by increasing IL-2R alpha gene transcription. This response is mediated by a 52 nt IL-2 responsive enhancer (IL-2rE) that is conserved between mouse and man. The mouse enhancer is 1.
Important progress has been made in the past two years in the identification of Pol II promoters. For most other regulatory elements, however, current biological knowledge is still insufficient to allow the development of prediction tools. The phylogenetic-footprinting strategy, which is based on the comparative analysis of homologous sequences, is a very efficient approach to identify new unknown regulatory elements.
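The comparative step behind phylogenetic footprinting can be sketched in a few lines. This is a deliberately simplified illustration, not the method used in the article: it assumes the homologous sequences have already been aligned to equal length (gaps written as `-`) and it reports only runs of perfectly conserved columns. The function name `conserved_blocks`, the `min_len` threshold and the toy sequences are all hypothetical.

```python
def conserved_blocks(aligned, min_len=6):
    """Return (start, end, motif) for runs of identical, gap-free alignment columns.

    `aligned` is a list of pre-aligned sequences of equal length; `end` is exclusive.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be pre-aligned to equal length")
    blocks, start = [], None
    # Iterate one position past the end so a run touching the end is closed.
    for i in range(length + 1):
        conserved = (
            i < length
            and aligned[0][i] != "-"
            and all(s[i] == aligned[0][i] for s in aligned)
        )
        if conserved and start is None:
            start = i
        elif not conserved and start is not None:
            if i - start >= min_len:
                blocks.append((start, i, aligned[0][start:i]))
            start = None
    return blocks


# Toy, made-up aligned sequences from two hypothetical species.
seq_a = "ACGTTATAAAGGCTCA"
seq_b = "TCGATATAAAGCCTGA"
blocks = conserved_blocks([seq_a, seq_b], min_len=6)
```

Real analyses would typically tolerate mismatches and weigh conservation against the evolutionary distance between the species compared; exact-match runs are used here only to keep the sketch short.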
Proc Natl Acad Sci U S A
April 1997
We have analyzed conserved domains in t-SNAREs [soluble N-ethylmaleimide-sensitive factor (NSF) attachment protein (SNAP) receptors in the target membrane], proteins that are believed to be involved in the fusion of transport vesicles with their target membrane. By using a sensitive computer method, the generalized profile method, we were able to identify a new homology domain that is common to the two protein families previously identified to act as t-SNAREs, the syntaxin and SNAP-25 (synaptosome-associated protein of 25 kDa) families, which therefore constitute a new superfamily. This homology domain of approximately 60 amino acids is predicted to form a coiled-coil structure.
Computer analysis of a conserved domain, BRCT, first described at the carboxyl terminus of the breast cancer protein BRCA1, a p53 binding protein (53BP1), and the yeast cell cycle checkpoint protein RAD9 revealed a large superfamily of domains that occur predominantly in proteins involved in cell cycle checkpoint functions responsive to DNA damage. The BRCT domain consists of approximately 95 amino acid residues and occurs as a tandem repeat at the carboxyl terminus of numerous proteins, but has been observed also as a tandem repeat at the amino terminus or as a single copy. The BRCT superfamily presently includes approximately 40 nonorthologous proteins, namely, BRCA1, 53BP1, and RAD9; a protein family that consists of the fission yeast replication checkpoint protein Rad4, the oncoprotein ECT2, the DNA repair protein XRCC1, and yeast DNA polymerase subunit DPB11; DNA binding enzymes such as terminal deoxynucleotidyltransferases, deoxycytidyl transferase involved in DNA repair, and DNA-ligases III and IV; yeast multifunctional transcription factor RAP1; and several uncharacterized gene products.
The PROSITE database consists of biologically significant patterns and profiles formulated in such a way that, with appropriate computational tools, it can help to determine to which known family of proteins (if any) a new sequence belongs, or which known domain(s) it contains.
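As an illustration of what an "appropriate computational tool" for the pattern part might look like, the sketch below translates a PROSITE-style pattern into a Python regular expression. It covers only the common elements of the pattern notation (`x` for any residue, `[...]` for allowed residues, `{...}` for excluded residues, `(n)` or `(n,m)` repeats, and `<` / `>` terminus anchors); profiles are not handled. The function name `prosite_to_regex`, the example pattern and the test sequence are invented for illustration and are not PROSITE entries.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern string into a Python regular expression."""
    pattern = pattern.strip().rstrip(".")  # patterns conventionally end with a period
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):  # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):  # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count such as (3) or (2,4).
        repeat = ""
        m = re.fullmatch(r"(.+?)\((\d+(?:,\d+)?)\)", element)
        if m:
            element, repeat = m.group(1), "{" + m.group(2) + "}"
        if element == "x":  # any residue
            core = "."
        elif element.startswith("[") and element.endswith("]"):  # allowed residues
            core = element
        elif element.startswith("{") and element.endswith("}"):  # excluded residues
            core = "[^" + element[1:-1] + "]"
        elif len(element) == 1 and element.isalpha() and element.isupper():
            core = element
        else:
            raise ValueError(f"unsupported pattern element: {element!r}")
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


# Invented pattern and sequence, for illustration only.
regex = prosite_to_regex("C-x(2,4)-C-{P}-[LIVM]-x(3)-H.")
hit = re.search(regex, "AACAACGLAAAHK")
```

Here `hit.group()` is the matched segment and `hit.span()` gives its position in the sequence, which is the kind of information needed to report that a new sequence contains a known signature.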
The interleukin 2 receptor alpha-chain (IL-2R alpha) gene is a key regulator of lymphocyte proliferation. IL-2R alpha is rapidly and potently induced in T cells in response to mitogenic stimuli. Interleukin 2 (IL-2) stimulates IL-2R alpha.
The RFX DNA binding domain is a novel motif that has been conserved in a growing number of dimeric DNA-binding proteins, having diverse regulatory functions, in eukaryotic organisms ranging from yeasts to humans. To characterize this novel motif, we have performed a detailed dissection of the site-specific DNA binding activity of RFX1, a prototypical member of the RFX family. First, we have performed a site selection procedure to define the consensus binding site of RFX1.