Background: Major advances in sequencing technologies and the sharing of data and metadata in science have resulted in a wealth of publicly available datasets. However, working with and especially curating public omics datasets remains challenging despite these efforts. While a growing number of initiatives aim to re-use previous results, these present limitations that often lead to the need for further in-house curation and processing.
Gut microbiomes of fish species consist of thousands of bacterial taxa that interact among each other, their environment, and the host. These complex networks of interactions are regulated by a diverse range of factors, yet little is known about the hierarchy of these interactions. Here, we introduce SAMBA (Structure-Learning of Aquaculture Microbiomes using a Bayesian Approach), a computational tool that uses a unified Bayesian network approach to model the network structure of fish gut microbiomes and their interactions with biotic and abiotic variables associated with typical aquaculture systems.
Background: Nudibranchs comprise a group of > 6000 marine soft-bodied mollusk species known to use secondary metabolites (natural products) for chemical defense. The full diversity of these metabolites and whether symbiotic microbes are responsible for their synthesis remains unexplored. Another issue in searching for undiscovered natural products is that computational analysis of genomes of uncultured microbes can result in detection of novel biosynthetic gene clusters; however, their in vivo functionality is not guaranteed which limits further exploration of their pharmaceutical or industrial potential.
The environmental impact of uncultured phages is shaped by their preferred life cycle (lytic or lysogenic). However, our ability to predict it is very limited. We aimed to discriminate between lytic and lysogenic phages by comparing the similarity of their genomic signatures to those of their hosts, reflecting their co-evolution.
A major challenge in microbial ecology is to understand the principles and processes by which microbes associate and interact in community assemblages. Microbial communities in mountain glaciers are unique as first colonizers and nutrient enrichment drivers for downstream ecosystems. However, mountain glaciers have been distinctively sensitive to climate perturbations and have suffered a severe retreat over the past 40 years, compelling us to understand glacier ecosystems before their disappearance.
The GPRO suite is an in-progress bioinformatic project for -omics data analysis. As part of the continued growth of this project, we introduce a client- and server-side solution for comparative transcriptomics and analysis of variants. The client-side consists of two Java applications called "" and "" to manage pipelines and workflows based on the most common command line interface tools for RNA-seq and Variant-seq analysis, respectively.
Organisms are unique physical entities in which information is stored and continuously processed. The digital nature of DNA sequences enables the construction of a dynamic information reservoir. However, the distinction between the hardware and software components in the information flow is crucial to identify the mechanisms generating specific genomic signatures.
Ribosomal DNA (rDNA) comprises the genetic loci that encode rRNA in eukaryotes. It is typically arranged as tandem repeats that vary in copy number within the same species. We have recently shown that rDNA repeat copy number in the yeast Saccharomyces cerevisiae is controlled by cell volume via a feedback circuit that senses cell volume by means of the concentration of the free upstream activator factor (UAF).
The generation of different types of defective viral genomes (DVGs) is an unavoidable consequence of the error-prone replication of RNA viruses. In recent years, a particular class of DVGs, those containing long deletions or genome rearrangements, has gained interest due to their potential therapeutic and biotechnological applications. Identifying such DVGs in high-throughput sequencing (HTS) data has become an interesting computational problem.
Comput Struct Biotechnol J
December 2021
The human gut holds a special place in the study of different microbial environments due to growing evidence that the gut microbiota is related to host health. However, despite extensive research, there is still a lack of knowledge about the core taxa forming the gut microbiota and, moreover, available information is biased towards western microbiomes in both genome databases and most core taxa studies. To tackle these limitations, we tested a database enrichment strategy and analyzed public datasets of whole-genome shotgun data, generated from 545 fecal samples, comprising three gradients of westernization.
Human blood metagenomics has revealed the presence of different types of viruses in apparently healthy subjects. Anelloviruses are by far the viral family most frequently found in human blood, although amplification biases and contaminations pose a major challenge in this field. To investigate this further, we subjected pooled plasma samples from 120 healthy donors in Spain to high-speed centrifugation, RNA and DNA extraction, random amplification, and massive parallel sequencing.
Progressive evolution, or the tendency towards increasing complexity, is a controversial issue in biology, whose resolution requires a proper measurement of complexity. Genomes are the best entities to address this challenge, as they encode the historical information of a species' biotic and environmental interactions. As a case study, we have measured genome sequence complexity in the ancient phylum Cyanobacteria.
Objective: To determine the molecular functions of genes exhibiting altered expression in the endometrium of women with uterine disorders affecting fertility.
Design: Retrospective analysis integrating case and control data from multiple cohorts with endometrium gene expression in women with uterine disorders.
Setting: Infertility research department affiliated with a university hospital.
Background: The olive tree is of particular economic interest in the Mediterranean basin. Researchers have conducted several studies on one of the most devastating disorders affecting this tree, the Verticillium wilt, which causes substantial economic losses in numerous areas. We analyzed metatranscriptomic samples taken from a previous study conducted on leaves and roots of Olea europaea that were infected with Verticillium dahliae.
Background: The possibility of integrating viral vectors to become a persistent part of the host genome makes them a crucial element of clinical gene therapy. However, viral integration has associated risks, such as the unintentional activation of oncogenes that can result in cancer. Therefore, the analysis of integration sites of retroviral vectors is a crucial step in developing safer vectors for therapeutic use.
Background: Reactome aims to provide bioinformatics tools for visualisation, interpretation and analysis of pathway knowledge to support basic research, genome analysis, modelling, systems biology and education. Pathway analysis methods have a broad range of applications in physiological and biomedical research; one of the main problems, from the analysis methods performance point of view, is the constantly increasing size of the data samples.
Results: Here, we present a new high-performance in-memory implementation of the well-established over-representation analysis method.
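As an illustration of what over-representation analysis computes, the sketch below uses one common formulation, a one-sided hypergeometric test. This is an assumption made for the example: the abstract does not say which statistic the Reactome implementation uses, and the function name ora_p_value and its parameters are hypothetical.

```python
from math import comb


def ora_p_value(population: int, pathway_size: int,
                sample_size: int, overlap: int) -> float:
    """Probability of seeing at least `overlap` pathway members in a sample.

    population   -- total number of annotated identifiers (hypothetical input)
    pathway_size -- identifiers annotated to the pathway of interest
    sample_size  -- identifiers in the submitted list
    overlap      -- submitted identifiers that fall in the pathway
    """
    total = comb(population, sample_size)
    upper = min(pathway_size, sample_size)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, sample_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers chosen only for illustration.
    print(ora_p_value(population=10, pathway_size=5, sample_size=5, overlap=5))
```

A small p-value suggests that the pathway is hit more often than expected by chance. In practice the test is repeated for every pathway, so a multiple-testing correction would normally follow.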
BMC Bioinformatics
February 2016
Background: The use of nanopore technologies is expected to spread in the future because they are portable and can sequence long fragments of DNA molecules without prior amplification. The first nanopore sequencer available, the MinION™ from Oxford Nanopore Technologies, is a USB-connected, portable device that allows real-time DNA analysis. In addition, other new instruments are expected to be released soon, which promise to outperform the current short-read technologies in terms of throughput.
Genomic technologies allow laboratories to produce large-scale data sets, either through the use of next-generation sequencing or microarray platforms. To explore these data sets and obtain maximum value from the data, researchers view their results alongside all the known features of a given reference genome. To study transcriptional changes that occur under a given condition, researchers search for regions of the genome that are differentially expressed between different experimental conditions.
HPG Aligner applies suffix arrays for DNA read mapping. This implementation produces a highly sensitive and extremely fast mapping of DNA reads that scales up almost linearly with read length. The approach presented here is faster (over 20× for long reads) and more sensitive (over 98% in a wide range of read lengths) than the current state-of-the-art mappers.
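To make the suffix-array idea concrete, here is a minimal sketch of exact-match lookup in a reference sequence. It is not HPG Aligner's implementation: the naive construction, the restriction to exact matches, and the names build_suffix_array and find_occurrences are all assumptions made for illustration.

```python
def build_suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes in lexicographic order.

    Naive construction (sorting whole suffixes); fine for a toy example,
    far too slow for a real genome.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: list[int], pattern: str) -> list[int]:
    """Return every start position where `pattern` occurs exactly in `text`."""
    m = len(pattern)

    # Binary search for the first suffix whose m-character prefix >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Binary search for the first suffix whose m-character prefix > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # All suffixes in sa[start:lo] begin with the pattern.
    return sorted(sa[start:lo])


if __name__ == "__main__":
    reference = "ACGTACGA"
    sa = build_suffix_array(reference)
    print(find_occurrences(reference, sa, "ACG"))
```

Because the suffixes are sorted, all matches of a read occupy one contiguous block of the array, so each lookup needs only two binary searches. A real mapper would additionally need to tolerate mismatches and gaps.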
Eukaryotic gene expression is regulated both at the transcription and the mRNA degradation levels. The implementation of functional genomics methods that allow the simultaneous measurement of transcription (TR) and degradation (DR) rates for thousands of mRNAs is a huge improvement in this field. One of the best established methods for mRNA stability determination is genomic run-on (GRO).
Background: In Drosophila melanogaster, dosage compensation is mediated by the action of the dosage compensation complex (DCC). How the DCC recognizes the fly X chromosome is still poorly understood. Characteristic sequence signatures at all DCC binding sites have not hitherto been found.
Background: The comparison of DNA sequences is a traditional problem in genomics and bioinformatics. Many new opportunities emerge due to the improvement of personal computers, allowing the implementation of novel strategies of analysis.
Findings: We describe a new program, called UVWORD, which determines the number of times that each DNA word present in a sequence (target) is found in a second sequence (source), a procedure that we have called oligonucleotide profiling.
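The oligonucleotide profiling procedure described above can be sketched in a few lines. This is a minimal illustration based only on the one-sentence description, not the UVWORD program itself: the fixed word length k, the overlapping-window counting, and the function names are assumptions.

```python
from collections import Counter


def count_words(seq: str, k: int) -> Counter:
    """Count every overlapping word of length k in a sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def oligo_profile(target: str, source: str, k: int) -> list[tuple[int, str, int]]:
    """For each word of length k in `target`, report how often it occurs in `source`.

    Returns (position in target, word, count in source) tuples.
    """
    source_counts = count_words(source, k)
    target = target.upper()
    return [
        (i, target[i:i + k], source_counts[target[i:i + k]])
        for i in range(len(target) - k + 1)
    ]


if __name__ == "__main__":
    # Toy sequences chosen only for illustration.
    for position, word, count in oligo_profile("ACGT", "ACGACG", k=3):
        print(position, word, count)
```

Counting the source words once and then looking up each target word keeps the work roughly linear in the combined sequence length.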
Background: Sequencing of the genomes of several Drosophila allows for the first precise analyses of how global sequence patterns change among multiple, closely related animal species. A basic question is whether there are characteristic features that differentiate chromosomes within a species or between different species.
Results: We explored the euchromatin of the chromosomes of seven Drosophila species to establish their global patterns of DNA sequence diversity.
Background: The imprint of natural selection on gene sequences is often difficult to detect. A plethora of methods have been devised to detect genetic changes due to selective processes. However, many of those methods depend heavily on underlying assumptions regarding the mode of change of DNA sequences and often require sophisticated mathematical treatments that make them computationally slow.
The human gene parkin, known to cause familial Parkinson disease, as well as several other genes, likely involved in other neurodegenerative diseases or in cancer, encode proteins of the RBR family of ubiquitin ligases. Here, we describe the structural diversity of the RBR family in order to infer their functional roles. Of particular interest is a relationship detected between RBR-mediated ubiquitination and RNA metabolism: a few RBR proteins contain RNA binding domains and DEAH-box RNA helicase domains.