RNA sequencing (RNA-seq) analysis of virus-infected host cells enables researchers to study a wide range of phenomena involving host-virus interactions. This includes genomic analysis of the viral population itself, as well as analysis of the transcriptional dynamics of the virus and host during infection. In this chapter, we provide a guide for researchers interested in performing RNA-seq data analysis of virus-infected host cells or cell lines.
The analysis and comparison of gene neighborhoods is a powerful approach for exploring microbial genome structure, function, and evolution. Although numerous tools exist for genome visualization and comparison, genome exploration across large genomic databases or user-generated datasets remains a challenge. Here, we introduce AnnoView, a web server designed for interactive exploration of gene neighborhoods across the bacterial and archaeal tree of life.
The analysis of microbial genomes from human archaeological samples offers a historic snapshot of ancient pathogens and provides insights into the origins of modern infectious diseases. Here, we analyze metagenomic datasets from 38 human archaeological samples and identify bacterial genomic sequences related to modern-day Clostridium tetani, which produces the tetanus neurotoxin (TeNT) and causes the disease tetanus. These genomic assemblies had varying levels of completeness, and a subset of them displayed hallmarks of ancient DNA damage.
The terminal electron acceptor of most aerobic respiratory chains, cytochrome c oxidase (COX), has been highly conserved throughout evolution, from aerobic prokaryotes to complex eukaryotes. Oxygen metabolism in parasitic helminths differs significantly from that of most aerobic eukaryotes, as these organisms can switch between aerobic and anaerobic metabolisms throughout their life cycles. Early studies suggested a lack of COX activity in certain parasitic helminths, and the role of COX in helminth mitochondria remains unclear.
Phycodnaviridae are large double-stranded DNA viruses, which facilitate studies of host-virus interactions and co-evolution due to their prominence in algal infection and their role in the life cycle of algal blooms. However, the genomic interpretation of these viruses is hampered by a lack of functional information, stemming from the surprising number of hypothetical genes of unknown function. It is also unclear how many of these genes are widely shared within the clade.
Background: A substantial fraction of genes identified within bacterial genomes encode proteins of unknown function. Identifying which of these proteins represent potential virulence factors, and mapping their key virulence determinants, is a challenging but important goal.
Results: To facilitate virulence factor discovery, we performed a comprehensive analysis of 17,929 protein domain families within the Pfam database, and scored them based on their overrepresentation in pathogenic versus non-pathogenic species, taxonomic distribution, relative abundance in metagenomic datasets, and other factors.
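One of the scoring criteria named above, overrepresentation in pathogenic versus non-pathogenic species, can be illustrated with a simple log-ratio. This is only a minimal sketch under stated assumptions: the function name `log2_enrichment`, the pseudocount, and the example counts are hypothetical, and the abstract does not specify the scoring formula that was actually used.

```python
import math


def log2_enrichment(path_with: int, path_total: int,
                    nonpath_with: int, nonpath_total: int,
                    pseudocount: float = 0.5) -> float:
    """Log2 ratio of the fraction of pathogenic genomes carrying a domain
    family to the fraction of non-pathogenic genomes carrying it.

    The pseudocount is an assumption added here to avoid division by zero
    when a family is absent from one group; it is not taken from the source.
    """
    frac_path = (path_with + pseudocount) / (path_total + 2 * pseudocount)
    frac_nonpath = (nonpath_with + pseudocount) / (nonpath_total + 2 * pseudocount)
    return math.log2(frac_path / frac_nonpath)


# Hypothetical counts: a family found in 50 of 100 pathogens
# but only 5 of 100 non-pathogens scores as enriched (positive value).
score = log2_enrichment(50, 100, 5, 100)
```

A positive score indicates a family seen more often in pathogens, a negative score the opposite, and zero indicates equal prevalence in both groups.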
Motivation: Statistical detection of co-occurring genes across genomes, known as 'phylogenetic profiling', is a powerful bioinformatic technique for inferring gene-gene functional associations. However, this can be a challenging task given the size and complexity of phylogenomic databases, difficulty in accounting for phylogenetic structure, inconsistencies in genome annotation and substantial computational requirements.
Results: We introduce PhyloCorrelate, a computational framework for gene co-occurrence analysis across large phylogenomic datasets.
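The basic idea behind phylogenetic profiling can be sketched in a few lines: represent each gene by the set of genomes in which it occurs, then rank gene pairs by how similar those sets are. The example below uses the Jaccard index as the similarity measure. It is an illustrative sketch only, not PhyloCorrelate's implementation; the gene and genome names are hypothetical, and the sketch makes no correction for phylogenetic structure, which the abstract identifies as one of the main difficulties.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard index of two genome sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cooccurrence(profiles: dict[str, set[str]]) -> list[tuple[str, str, float]]:
    """Score every gene pair by profile similarity, highest first."""
    pairs = [
        (g1, g2, jaccard(profiles[g1], profiles[g2]))
        for g1, g2 in combinations(sorted(profiles), 2)
    ]
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Hypothetical presence/absence profiles: gene -> genomes that encode it.
profiles = {
    "geneA": {"genome1", "genome2", "genome3"},
    "geneB": {"genome1", "genome2", "genome3", "genome4"},
    "geneC": {"genome5"},
}

ranked = cooccurrence(profiles)
```

Here `geneA` and `geneB` share three of four genomes and rank first, suggesting a possible functional association, while `geneC` never co-occurs with either.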
S-layers are paracrystalline proteinaceous lattices that surround prokaryotic cells, forming a critical interface between the cells and their extracellular environment. Here, we report the discovery of a novel S-layer protein present in the Gram-negative marine organism, Pseudoalteromonas tunicata D2. An uncharacterized protein (EAR28894) was identified as the most abundant protein in planktonic cultures and biofilms.
Although gene-finding in bacterial genomes is relatively straightforward, the automated assignment of gene function is still challenging, resulting in a vast quantity of hypothetical sequences of unknown function. But how prevalent are hypothetical sequences across bacteria, what proportion of genes in different bacterial genomes remain unannotated, and what factors affect annotation completeness? To address these questions, we surveyed over 27 000 bacterial genomes from the Genome Taxonomy Database, and measured genome annotation completeness as a function of annotation method, taxonomy, genome size, 'research bias' and publication date. Our analysis revealed that 52 and 79 % of the average bacterial proteome could be functionally annotated based on protein and domain-based homology searches, respectively.
The molecular evolution of virulence factors is a central theme in our understanding of bacterial pathogenesis and host-microbe interactions. Using bioinformatics and genome data mining, recent studies have shed light on the evolution of important virulence factor families and the mechanisms by which they have adapted and diversified in function. This perspective highlights three complementary approaches useful for studying the molecular evolution of virulence factors: identification and analysis of virulence factor homologs, detection of adaptations or functional shifts, and computational prediction of novel virulence factor families.
Bacterial genomics has revolutionized our understanding of the microbial tree of life; however, mapping and visualizing the distribution of functional traits across bacteria remains a challenge. Here, we introduce AnnoTree, an interactive, functionally annotated bacterial tree of life that integrates taxonomic, phylogenetic and functional annotation data from over 27 000 bacterial and 1500 archaeal genomes. AnnoTree enables visualization of millions of precomputed genome annotations across the bacterial and archaeal phylogenies, thereby allowing users to explore gene distributions as well as patterns of gene gain and loss in prokaryotes.
ABC transporters are conserved in prokaryotes and eukaryotes, with humans expressing 48 transporters divided into 7 classes (ABCA, ABCB, ABCC, ABCD, ABCE, ABCF, and ABCG). Throughout the human body, ABC transporters regulate cAMP levels, chloride secretion, lipid transport, and anti-oxidant responses. We used a bioinformatic approach complemented with in vitro experimental methods for validation of the 48 known human ABC transporters in airway epithelial cells. The bioinformatic analysis used bronchial epithelial cell gene expression datasets available in NCBI GEO from well-characterized patient populations of healthy subjects and individuals who smoke cigarettes or have been diagnosed with COPD or asthma, with validation performed in Calu-3 airway epithelial cells.
Microbiol Resour Announc
October 2018
We sequenced wastewater effluent from the Agricultural Research Council-Animal Production in South Africa, which conducts studies on livestock health and farm ecology. Thauera, Oscillibacter, and Pseudomonas were the most abundant genera within the community. Thirty-one different antibiotic resistance genes were identified, 10 of which are associated with tetracycline resistance.
Curr Opin Struct Biol
June 2016
Large-scale sequence and structural data is a goldmine of novel proteins, but how can this data be effectively mined for new functions? Here, we review protein function prediction methods and recent studies that apply these methods to discover new functionality. Core approaches include sequence-based homology detection, phylogenetic analysis, structural bioinformatics, and inference of functional associations using genomic context and related methods. With such a wide range of approaches, sequences may reveal new functionality regardless of their similarity to a characterized reference.
Background: Metagenomes provide access to the taxonomic composition and functional capabilities of microbial communities. Although metagenomic analysis methods exist for estimating overall community composition or metabolic potential, identifying specific taxa that encode specific functions or pathways of interest can be more challenging. Here we present MetAnnotate, which addresses the common question: "which organisms perform my function of interest within my metagenome(s) of interest?" MetAnnotate uses profile hidden Markov models to analyze shotgun metagenomes for genes and pathways of interest, classifies retrieved sequences either through a phylogenetic placement or best hit approach, and enables comparison of these profiles between metagenomes.
Predicted open reading frames (ORFs) that lack detectable homology to known proteins are termed ORFans. Despite their prevalence in metagenomes, the extent to which ORFans encode real proteins, the degree to which they can be annotated, and their functional contributions, remain unclear. To gain insights into these questions, we applied sensitive remote-homology detection methods to functionally analyze ORFans from soil, marine, and human gut metagenome collections.