Creating hierarchical models of protein families based on Expressed Sequence Tags: the "Sprockets" analysis pipeline.

Anal Chim Acta

University of Calgary, Faculty of Medicine, Sun Center of Excellence for Visual Genomics, 3330 Hospital Drive NW, Calgary, AB, Canada T2N 4N1.

Published: March 2006

We have created an analysis pipeline called Sprockets, which can be used to classify proteins into hierarchical "families" and to build searchable models of these families. The families are constructed from Expressed Sequence Tags (ESTs) and Coding DNA Sequences (CDSs), making Sprockets clusters especially suitable for studying gene families in organisms for which a completely sequenced genome does not (yet) exist. The pipeline consists of two main parts: pair-wise analysis and grouping of sequences with Z-score statistics, followed by hierarchical splitting of clusters into alignable protein families. The computational and statistical techniques applied in Sprockets allow it to act as a massive, selective multiple sequence alignment engine for combining individual sequence collections with related public sequences. The end result is a database of gene Hidden Markov Models, each related to the others by three levels of similarity: secondary structure, function and evolutionary origin. For a sample set of 20,000 ESTs from Lactuca spp., Sprockets provided a 9% improvement in mapping of function to unknown sequences over traditional pair-wise search methods and InterPro mapping.
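
The abstract does not say how the Z-scores are computed or how sequences are grouped. As an illustration only, the first stage can be sketched with one common approach: compare the observed local alignment score of a pair against scores obtained after shuffling one sequence, then link every pair whose Z-score passes a cutoff. The function names, the scoring values, the cutoff of 8.0 and the single-linkage grouping are all assumptions made for this sketch, not details taken from Sprockets.

```python
import random
import statistics
from itertools import combinations


def local_score(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment score with a linear gap penalty.

    The scoring values are illustrative; a real pipeline would use a
    substitution matrix such as BLOSUM62.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
        best = max(best, max(cur))
        prev = cur
    return best


def z_score(a, b, shuffles=100, rng=None):
    """Z-score of the observed score against a shuffled-sequence null."""
    rng = rng or random.Random(0)
    observed = local_score(a, b)
    chars = list(b)
    null = []
    for _ in range(shuffles):
        rng.shuffle(chars)  # keeps composition, destroys residue order
        null.append(local_score(a, "".join(chars)))
    sd = statistics.pstdev(null)
    if sd == 0:
        return 0.0  # degenerate null distribution: treat as not significant
    return (observed - statistics.mean(null)) / sd


def group_by_z(seqs, threshold=8.0, shuffles=100):
    """Single-linkage grouping: link pairs whose Z-score meets the cutoff.

    `seqs` maps a sequence name to its residue string. The threshold is
    an assumed value for this sketch.
    """
    parent = {name: name for name in seqs}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    rng = random.Random(0)
    for x, y in combinations(seqs, 2):
        if z_score(seqs[x], seqs[y], shuffles, rng) >= threshold:
            parent[find(x)] = find(y)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return sorted(groups.values(), key=lambda g: sorted(g))
```

Calling `group_by_z` on a dictionary of translated EST or CDS sequences returns a list of name sets, one per candidate family. Shuffling preserves amino-acid composition, so a high Z-score reflects shared residue order and not merely similar composition. All-against-all comparison scales quadratically with the number of sequences, so a set of 20,000 ESTs would need a faster prefilter than this sketch provides.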

Source
http://dx.doi.org/10.1016/j.aca.2006.01.072
