trEST, trGEN and Hits: access to databases of predicted protein sequences.

Nucleic Acids Res

Swiss Institute of Bioinformatics, Ludwig Institute for Cancer Research, Chemin des Boveresses 155, CH-1066, Epalinges s/Lausanne, Switzerland.

Published: January 2001

High throughput genome (HTG) and expressed sequence tag (EST) sequences are currently the most abundant nucleotide sequence classes in the public database. The large volume, high degree of fragmentation and lack of gene structure annotations prevent efficient and effective searches of HTG and EST data for protein sequence homologies by standard search methods. Here, we briefly describe three newly developed resources that should make discovery of interesting genes in these sequence classes easier in the future, especially to biologists not having access to a powerful local bioinformatics environment. trEST and trGEN are regularly regenerated databases of hypothetical protein sequences predicted from EST and HTG sequences, respectively. Hits is a web-based data retrieval and analysis system providing access to precomputed matches between protein sequences (including sequences from trEST and trGEN) and patterns and profiles from Prosite and Pfam. The three resources can be accessed via the Hits home page (http://hits.isb-sib.ch).

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC29852
DOI: http://dx.doi.org/10.1093/nar/29.1.148

Similar Publications

trome, trEST and trGEN: databases of predicted protein sequences.

Nucleic Acids Res

January 2004

Swiss Institute of Bioinformatics, Ludwig Institute for Cancer Research, Chemin des Boveresses 155, 1066 Epalinges s/Lausanne, Switzerland.

We previously introduced two new protein databases (trEST and trGEN) of hypothetical protein sequences predicted from EST and HTG sequences, respectively. Here, we present the updates made to these two databases plus a new database (trome), which uses alignments of EST data to HTG or full genomes to generate virtual transcripts and coding sequences. This new database is of higher quality and, because it contains the information in a much denser format, is much smaller.
