Uncultivated Bacteria and Archaea account for the vast majority of species on Earth, but obtaining their genomes directly from the environment, using shotgun sequencing, has only become possible recently. To realize the hope of capturing Earth's microbial genetic complement and to facilitate the investigation of the functional roles of specific lineages in a given ecosystem, technologies that accelerate the recovery of high-quality genomes are necessary. We present a series of analysis steps and data products for the extraction of high-quality metagenome-assembled genomes (MAGs) from microbiomes using the U.
Polypedilum vanderplanki is a striking and unique example of an insect that can survive almost complete desiccation. Its genome and a set of dehydration-rehydration transcriptomes, together with the genome of Polypedilum nubifer (a congeneric desiccation-sensitive midge), were recently released. Here, using published and newly generated datasets reflecting detailed transcriptome changes during anhydrobiosis, as well as a developmental series, we show that the TCTAGAA DNA motif, which closely resembles the binding motif of the heat shock transcription activator (Hsf), is significantly enriched in the promoter regions of desiccation-induced genes in P. vanderplanki, such as genes encoding late embryogenesis abundant (LEA) proteins, thioredoxins, or trehalose metabolism-related genes, but not in P. nubifer. Unlike P. nubifer, P. vanderplanki has double TCTAGAA sites upstream of the Hsf gene itself, which is probably responsible for the stronger activation of Hsf in P. vanderplanki during desiccation compared with P. nubifer. To confirm the role of Hsf in desiccation-induced gene activation, we used the Pv11 cell line, derived from P. vanderplanki embryo.
Transcripts often harbor RNA elements that regulate cell processes co- or post-transcriptionally. The functions of many regulatory RNA elements depend on their structure, so it is important both to determine that structure and to scan genomes for structured elements. State-of-the-art ab initio approaches to predicting structured RNAs rely on DNA sequence analysis.
Background: Pseudogymnoascus spp. are a broad group of fungal lineages in the family Pseudeurotiaceae, including P. destructans, an aggressive pathogen of bats.
Splice sites (SSs) are short sequences that are crucial for proper mRNA splicing in eukaryotic cells, and therefore can be expected to be shaped by strong selection. Nevertheless, in mammals and in other intron-rich organisms, many SSs involve nonconsensus (Nc), rather than consensus (Cn), nucleotides, and beyond the two critical nucleotides, the SSs are not perfectly conserved between species. Here, we compare the SS sequences among primates, and among Drosophila fruit flies, to reveal the pattern of selection acting at SSs.
Background: Genome-scale prediction of gene regulation and reconstruction of transcriptional regulatory networks in prokaryotes is one of the critical tasks of modern genomics. Bacteria from different taxonomic groups, whose lifestyles and natural environments are substantially different, possess highly diverged transcriptional regulatory networks. Comparative genomics approaches are useful for in silico reconstruction of bacterial regulons and networks operated by both transcription factors (TFs) and RNA regulatory elements (riboswitches).
Background: Genlisea aurea (Lentibulariaceae) is a carnivorous plant with an unusually small genome (63.6 Mb), one of the smallest known among higher plants. Data on the genome sizes and the phylogeny of Genlisea suggest that this is a derived state within the genus.
Proper splicing is often crucial for gene function, and its disruption may be strongly deleterious. Nevertheless, even the canonical dinucleotides of the splice sites, which are essential for splicing, are often polymorphic. Here, we use data from The 1000 Genomes Project to study single-nucleotide polymorphisms (SNPs) in the canonical dinucleotides.
Identification of transcriptional regulatory regions and tracing their internal organization are important for understanding the eukaryotic cell machinery. Cis-regulatory modules (CRMs) of higher eukaryotes are believed to possess a regulatory 'grammar', or preferred arrangement of binding sites, that is crucial for proper regulation and thus tends to be evolutionarily conserved. Here, we present a method, CORECLUST (COnservative REgulatory CLUster STructure), that predicts CRMs based on a set of positional weight matrices.
Membrane proteins perform a number of crucial functions as transporters, receptors, and components of enzyme complexes. Identification of membrane proteins and prediction of their topology is thus an important part of genome annotation. We present here an overview of transmembrane segments in protein sequences, summarize data from large-scale genome studies, and report results of benchmarking of several popular internet servers.
Transmembrane transport is an essential component of cell life. Many genes encoding known or putative transport proteins are found in bacterial genomes. In most cases, their substrate specificity has not been determined experimentally and is only approximately predicted by comparative genomic analysis.
Aligned amino acid sequences of three functionally independent samples of transmembrane (TM) transport proteins have been analyzed. The concept of TM-kernel is proposed as the most probable transmembrane region of a sequence. The average amino acid composition of TM-kernels differs from the published amino acid composition of transmembrane segments.