Bioinformatics analysis identify novel OB fold protein coding genes in C. elegans.

PLoS One

Molecular Biology and Biochemistry Department, Simon Fraser University, Burnaby, British Columbia, Canada.

Published: November 2013

Background: The C. elegans genome has been extensively annotated by the WormBase consortium, which uses state-of-the-art bioinformatics pipelines, functional genomics and manual curation approaches. As a result, identifying novel genes in silico in this model organism is becoming more challenging and requires new approaches. The oligonucleotide-oligosaccharide binding (OB) fold is a highly divergent protein family in which protein sequences, despite having the same fold, share very little sequence identity (5-25%). Therefore, evidence from sequence-based annotation may not be sufficient to identify all members of this family. In C. elegans, the number of OB-fold proteins reported is remarkably low (n=46) compared to other evolutionarily related eukaryotes, such as the yeast S. cerevisiae (n=344) or the fruit fly D. melanogaster (n=84). Gene loss during evolution, or differences in the level of annotation for this protein family, may explain these discrepancies.
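To make the 5-25% figure concrete, the following is a minimal sketch of one common way to compute percent identity from a pairwise alignment. The function name `percent_identity`, the choice to skip gapped columns, and the toy sequences are assumptions for illustration; the abstract does not state which identity definition the authors used.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over aligned, ungapped columns.

    Assumption: columns where either sequence has a gap are ignored.
    Other definitions divide by the full alignment length or by the
    length of the shorter sequence and give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy alignment (not real OB-fold sequences): 3 matches in 5 ungapped columns.
print(percent_identity("MKV-LT", "MRVALS"))
```

At identities as low as those quoted above, alignments of unrelated sequences can score similarly, which is one reason sequence-only annotation may miss members of the family.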

Methodology/Principal Findings: This study examines the possibility that novel OB-fold coding genes exist in the worm. We developed a bioinformatics approach that uses the most sensitive sequence-sequence, sequence-profile and profile-profile similarity search methods, followed by 3D-structure prediction as a filtering step to eliminate false-positive candidate sequences. We predicted 18 coding genes containing the OB-fold that, remarkably, have been only partially characterized in C. elegans.
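The workflow described above (three similarity search methods, then a structure-based filter) can be sketched as a simple cascade. This is an illustrative outline only, not the authors' implementation: the `Candidate` class, the `fold_confidence` score, the 0.8 threshold and the gene names are all hypothetical, and the search results are supplied as precomputed flags rather than produced by any real tool.

```python
from dataclasses import dataclass, field

# Search stages in the order named in the abstract.
SEARCH_STAGES = ("sequence-sequence", "sequence-profile", "profile-profile")


@dataclass
class Candidate:
    gene_id: str
    # Stage name -> True if that method reported a hit to a known OB-fold protein
    # (hypothetical precomputed results).
    hits: dict = field(default_factory=dict)
    # Hypothetical score in [0, 1] from a 3D-structure prediction step.
    fold_confidence: float = 0.0


def first_detecting_stage(candidate):
    """Return the first search stage that reported a hit, or None."""
    for stage in SEARCH_STAGES:
        if candidate.hits.get(stage, False):
            return stage
    return None


def predict_ob_fold_genes(candidates, min_fold_confidence=0.8):
    """Keep candidates found by any search stage that also pass the structure filter."""
    accepted = []
    for c in candidates:
        stage = first_detecting_stage(c)
        if stage is None:
            continue  # no similarity evidence at all
        if c.fold_confidence < min_fold_confidence:
            continue  # treated as a false positive by the structure filter
        accepted.append((c.gene_id, stage))
    return accepted


candidates = [
    Candidate("geneA", {"sequence-sequence": True}, 0.95),
    Candidate("geneB", {"profile-profile": True}, 0.90),
    Candidate("geneC", {"sequence-profile": True}, 0.30),  # fails structure filter
    Candidate("geneD", {}, 0.99),                          # no search hit
]
print(predict_ob_fold_genes(candidates))
```

In this sketch the structure filter is applied only to sequences that already have similarity support, mirroring the abstract's description of 3D-structure prediction as a step that removes false positives rather than one that proposes new candidates.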

Conclusions/Significance: This study raises the possibility that the annotation of highly divergent protein fold families can be improved in C. elegans. Similar strategies could be implemented for large-scale analysis by the WormBase consortium when new versions of the genome sequence of C. elegans, or of other evolutionarily related species, are released. This approach is of general interest to the scientific community since it can be used to annotate any genome.

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3636199 (PMC)
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0062204 (PLOS)
