The X circular code is a set of 20 trinucleotides (codons) that has been identified in the protein-coding genes of most organisms (bacteria, archaea, eukaryotes, plasmids, viruses). It has been shown previously that the X circular code has the important mathematical property of being an error-correcting code. Thus, motifs of the X circular code, i.
In the multiomics era, comparative genomics studies based on gene repertoire comparison are increasingly used to investigate evolutionary histories of species, to study genotype-phenotype relations, species adaptation to various environments, or to predict gene function using phylogenetic profiling. However, comparisons of orthologs have highlighted the prevalence of sequence plasticity among species, showing the benefits of combining protein and subprotein levels of analysis to allow for a more comprehensive study of genotype/phenotype correlations. In this article, we introduce a new approach called BLUR (BLAST Unexpected Ranking), capable of detecting genotype divergence or specialization between two related clades at different levels: gain/loss of proteins but also of subprotein regions.
The origin of the genetic code remains enigmatic five decades after it was elucidated, although there is growing evidence that the code coevolved progressively with the ribosome. A number of primordial codes were proposed as ancestors of the modern genetic code, including comma-free codes such as the RRY, RNY, or GNC codes (R = G or A, Y = C or T, N = any nucleotide), and the X circular code, an error-correcting code that also allows identification and maintenance of the reading frame. It was demonstrated previously that motifs of the X circular code are significantly enriched in the protein-coding genes of most organisms, from bacteria to eukaryotes.
OrthoInspector is one of the leading software suites for orthology relations inference. In this paper, we describe a major redesign of the OrthoInspector online resource along with a significant increase in the number of species: 4753 organisms are now covered across the three domains of life, making OrthoInspector the most exhaustive orthology resource to date in terms of covered species (excluding viruses). The new website integrates original data exploration and visualization tools in an ergonomic interface.
A set X of 20 trinucleotides has been found to have the highest average occurrence in the reading frame, compared to the two shifted frames, of genes of bacteria, archaea, eukaryotes, plasmids and viruses (Michel, 2015, 2017; Arquès and Michel, 1996). This set X has an interesting mathematical property, since X is a maximal C³ self-complementary trinucleotide circular code (Arquès and Michel, 1996). Furthermore, any motif obtained from this circular code X has the capacity to retrieve, maintain and synchronize the reading frame in genes.
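Two of the properties mentioned above, self-complementarity and preferential occurrence in the reading frame, can be illustrated with a minimal Python sketch. The list of 20 trinucleotides is taken from the literature on X (Arquès and Michel, 1996) rather than from this abstract, and the helper names `reverse_complement`, `is_self_complementary` and `frame_counts` are hypothetical. The sketch does not test circularity or the C³ property.

```python
# The 20 trinucleotides of X as reported in the literature
# (Arquès and Michel, 1996); they are not listed in the abstract itself.
X = frozenset(
    "AAC AAT ACC ATC ATT CAG CTC CTG GAA GAC "
    "GAG GAT GCC GGC GGT GTA GTC GTT TAC TTC".split()
)

# Translation table mapping each nucleotide to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def is_self_complementary(code):
    """True if the reverse complement of every trinucleotide is also in the code."""
    return all(reverse_complement(t) in code for t in code)


def frame_counts(seq, code=X):
    """Count trinucleotides of `code` read in frames 0, 1 and 2 of `seq`."""
    counts = []
    for frame in range(3):
        # Step through complete, non-overlapping trinucleotides of this frame.
        starts = range(frame, len(seq) - 2, 3)
        counts.append(sum(seq[i:i + 3] in code for i in starts))
    return counts


if __name__ == "__main__":
    print(len(X), is_self_complementary(X))
    print(frame_counts("GAAGACGAG"))
```

On a real gene, comparing the three values returned by `frame_counts` gives a rough idea of which frame carries the most X trinucleotides; the published analyses rely on averages over large gene sets, which this toy function does not reproduce.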
Background: The constant and massive increase of biological data offers unprecedented opportunities to decipher the function and evolution of genes and their roles in human diseases. However, the multiplicity of sources and flow of data mean that efficient access to useful information and knowledge production has become a major challenge. This challenge can be addressed by taking inspiration from Web 2.
Numerous mutations in each of the mitochondrial aminoacyl-tRNA synthetases (aaRSs) have been implicated in human diseases. The mutations are autosomal and recessive and lead mainly to neurological disorders, although with pleiotropic effects. The processes and interactions that drive the etiology of the disorders associated with mitochondrial aaRSs (mt-aaRSs) are far from understood.
Cilia (flagella) are important eukaryotic organelles, present in the Last Eukaryotic Common Ancestor, and are involved in cell motility and integration of extracellular signals. Ciliary dysfunction causes a class of genetic diseases known as ciliopathies; however, current knowledge of the underlying mechanisms is still limited and a better characterization of genes is needed. As cilia have been lost independently several times during evolution and are subject to important functional variation between species, ciliary genes can be investigated through comparative genomics.
To investigate the complexity of alternative splicing in the retina, we sequenced and analyzed a total of 115,706 clones from normalized cDNA libraries from mouse neural retina (66,217) and rat retinal pigmented epithelium (49,489). Based upon clustering the cDNAs and mapping them to their respective genomes, the estimated numbers of genes were 9,134 for the mouse neural retina and 12,050 for the rat retinal pigmented epithelium libraries. This unique collection of retinal messenger RNAs is maintained and accessible through a web-based server to the whole community of retinal biologists for further functional characterization.
An in situ hybridization (ISH) study was performed on 2000 murine genes representing around 10% of the protein-coding genes present in the mouse genome using data generated by the EURExpress consortium. This study was carried out in 25 tissues of late gestation embryos (E14.5), with a special emphasis on the developing ear and on five distinct developing sensory organs, including the cochlea, the vestibular receptors, the sensory retina, the olfactory organ, and the vibrissae follicles.
Summary: We previously developed OrthoInspector, a package incorporating an original algorithm for the detection of orthology and inparalogy relations between different species. We have added new functionalities to the package. While its original algorithm was not modified, performing similar orthology predictions, we facilitated the prediction of very large databases (thousands of proteomes), refurbished its graphical interface, added new visualization tools for comparative genomics/protein family analysis and facilitated its deployment in a network environment.
Summary: We present PARSEC (PAtteRn Search and Contextualization), a new open source platform for guided discovery, allowing localization and biological characterization of short genomic sites in entire eukaryotic genomes. PARSEC can search for a sequence or a degenerated pattern. The retrieved set of genomic sites can be characterized in terms of (i) conservation in model organisms, (ii) genomic context (proximity to genes) and (iii) function of neighboring genes.
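As an illustration of what searching for a degenerated pattern can mean, here is a minimal Python sketch that expands IUPAC nucleotide codes into a regular expression and reports every matching position, including overlapping ones. It is not PARSEC's implementation, which the abstract does not describe; the function name `find_sites` and the choice of a regex-based scan are assumptions made for the example.

```python
import re

# Standard IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_sites(pattern, genome):
    """Return 0-based start positions where the degenerate `pattern` matches `genome`.

    Hypothetical helper for illustration only; raises KeyError on a
    character that is not an IUPAC nucleotide code.
    """
    regex = "".join(IUPAC[c] for c in pattern.upper())
    # A zero-width lookahead lets overlapping sites be reported as well.
    return [m.start() for m in re.finditer(f"(?=({regex}))", genome.upper())]


if __name__ == "__main__":
    print(find_sites("GRY", "AGACGGTT"))
```

A plain regex scan like this is adequate for short patterns on modest sequences; scanning entire eukaryotic genomes, as the platform does, would normally call for indexed or streaming approaches.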
Age-related macular degeneration (AMD) is a common cause of blindness in older individuals. To accelerate the understanding of AMD biology and help design new therapies, we executed a collaborative genome-wide association study, including >17,100 advanced AMD cases and >60,000 controls of European and Asian ancestry. We identified 19 loci associated at P < 5 × 10⁻⁸.
TFIIH is a eukaryotic complex composed of two subcomplexes, the CAK (Cdk activating kinase) and the core-TFIIH. The core-TFIIH, composed of seven subunits (XPB, XPD, P62, P52, P44, P34, and P8), plays a crucial role in transcription and repair. Here, we performed an extended sequence analysis to establish the accurate phylogenetic distribution of the core-TFIIH in 63 eukaryotic organisms.
Background: Membrane trafficking involves the complex regulation of the intracellular localization of proteins and lipids and is required for metabolic uptake, cell growth and development. Different trafficking pathways passing through the endosomes are coordinated by the ENTH/ANTH/VHS adaptor protein superfamily. The endosomes are crucial for eukaryotes since the acquisition of the endomembrane system was a central process in eukaryogenesis.
A major challenge in the post-genomic era is a better understanding of how human genetic alterations involved in disease affect the gene products. The KD4v (Comprehensible Knowledge Discovery System for Missense Variant) server allows users to characterize and predict the phenotypic effects (deleterious/neutral) of missense variants. The server provides a set of rules learned by Inductive Logic Programming (ILP) on a set of missense variants described by conservation, physico-chemical, functional and 3D structure predicates.
The elucidation of the complex relationships linking genotypic and phenotypic variations to protein structure is a major challenge in the post-genomic era. We present MSV3d (Database of human MisSense Variants mapped to 3D protein structure), a new database that contains detailed annotation of missense variants of all human proteins (20 199 proteins). The multi-level characterization includes details of the physico-chemical changes induced by amino acid modification, as well as information related to the conservation of the mutated residue and its position relative to functional features in the available or predicted 3D model.
Background: Retinal detachment often leads to a severe and permanent loss of vision and its therapeutic management remains to this day exclusively surgical. We have used surgical specimens to perform a differential analysis of the transcriptome of human retinal tissues following detachment in order to identify new potential pharmacological targets that could be used in combination with surgery to further improve final outcome.
Methodology/Principal Findings: Statistical analysis reveals major involvement of the immune response in the disease.
Background: The deep-sea hydrothermal vent mussel Bathymodiolus azoricus harbors thiotrophic and methanotrophic symbiotic bacteria in its gills. While the symbiotic relationship between this hydrothermal mussel and these chemoautotrophic bacteria has been described, the molecular processes involved in the cross-talk between symbionts and host, in the maintenance of the symbiosis, in the influence of environmental parameters on gene expression, and in transcriptome variation across individuals remain poorly understood. In an attempt to understand how, and to what extent, this double symbiosis affects host gene expression, we used a transcriptomic approach to identify genes potentially regulated by symbiont characteristics, environmental conditions or both.
Rod-derived Cone Viability Factor (RdCVF) is a trophic factor with therapeutic potential for the treatment of retinitis pigmentosa, a retinal disease that commonly results in blindness. RdCVF is encoded by Nucleoredoxin-like 1 (Nxnl1), a gene homologous with the family of thioredoxins that participate in the defense against oxidative stress. RdCVF expression is lost after rod degeneration in the first phase of retinitis pigmentosa, and this loss has been implicated in the more clinically significant secondary cone degeneration that often occurs.
Background: The retina is a multi-layered sensory tissue that lines the back of the eye and acts at the interface of input light and visual perception. Its main function is to capture photons and convert them into electrical impulses that travel along the optic nerve to the brain where they are turned into images. It consists of neurons, nourishing blood vessels and different cell types, of which neural cells predominate.
Background: The post-genomic era is characterised by a torrent of biological information flooding the public databases. As a direct consequence, similarity searches starting with a single query sequence frequently lead to the identification of hundreds, or even thousands of potential homologues. The huge volume of data renders the subsequent structural, functional and evolutionary analyses very difficult.
Peroxisomes are essential organelles of eukaryotic origin, ubiquitously distributed in cells and organisms, playing key roles in lipid and antioxidant metabolism. Loss or malfunction of peroxisomes causes more than 20 fatal inherited conditions. We have created a peroxisomal database (http://www.
The peroxisome is an essential eukaryotic organelle, crucial for lipid metabolism and free radical detoxification, development, differentiation, and morphogenesis from yeasts to humans. Loss of peroxisomes invariably leads to fatal peroxisome biogenesis disorders in man. The evolutionary origin of peroxisomes remains unsolved; proposals for either a symbiogenetic or cellular membrane invagination event are inconclusive.