Publications by authors named "Jan Buchmann"

Tea, one of the most widely consumed beverages globally, exhibits remarkable genomic diversity in its underlying flavour and health-related compounds. In this study, we present the construction and analysis of a tea pangenome comprising a total of 11 genomes, with a focus on three newly sequenced genomes: the purple-leaved assamica cultivar "Zijuan", the temperature-sensitive sinensis cultivar "Anjibaicha" and the wild accession "L618", whose assemblies exhibited excellent quality scores as they profited from the latest sequencing technologies. Our analysis incorporates a detailed investigation of transposon complement across the tea pangenome, revealing shared patterns of transposon distribution among the studied genomes and improved transposon resolution with long-read technologies, as shown by long terminal repeat (LTR) Assembly Index analysis.


Despite a rapid expansion in the number of documented viruses following the advent of metagenomic sequencing, the identification and annotation of highly divergent RNA viruses remain challenging, particularly from poorly characterized hosts and environmental samples. Protein structures are more conserved than primary sequence data, such that structure-based comparisons provide an opportunity to reveal the viral 'dusk matter': viral sequences with low, but detectable, levels of sequence identity to known viruses with available protein structures. Here, we present a new open computational resource, RdRp-scan, that contains a standardized bioinformatic toolkit to identify and annotate divergent RNA viruses in metagenomic sequence data based on the detection of RNA-dependent RNA polymerase (RdRp) sequences.

Article Synopsis
  • Evaluating metagenomic software is crucial for enhancing the interpretation of metagenomes, and the CAMI II challenge focused on this by using complex datasets from numerous genomes and plasmids.
  • The analysis of 5,002 results from 76 software versions showed significant advancements in assembly, especially with long-read data, although challenges remained with related strains and genome recovery.
  • Findings indicated that while taxon profilers improved, they struggled with viruses and Archaea, highlighting the need for better reproducibility in clinical pathogen detection and guiding researchers in method selection based on efficiency and performance metrics.

Summary: We present NCBI-taxonomist, a command-line tool written in Python that collects and manages taxonomic data from the National Center for Biotechnology Information (NCBI). NCBI-taxonomist does not depend on a pre-downloaded taxonomic database but can store data locally. NCBI-taxonomist has six commands to map, collect, extract, resolve, import and group taxonomic data that can be linked together to create powerful analytical pipelines.


Viruses represent important test cases for data federation due to their genome size and the rapid increase in sequence data in publicly available databases. However, some consequences of previously decentralized (unfederated) data are lack of consensus or comparisons between feature annotations. Unifying or displaying alternative annotations should be a priority both for communities with robust entry representation and for nascent communities with burgeoning data sources.

Article Synopsis
  • There are RNA viruses that can make humans and animals sick, like Zika and dengue.
  • Scientists found new types of these viruses in Australian wildlife, including marsupials, birds, and a gecko.
  • The study showed that these viruses have been evolving along with their animal hosts, but they can sometimes jump from one species to another.

Proline-rich antimicrobial peptides (PrAMPs) are promising lead compounds for developing new antimicrobials; however, their narrow spectrum of action is limiting. PrAMPs kill bacteria by binding to their ribosomes and inhibiting protein synthesis. In this study, 133 derivatives of the PrAMP Bac7(1-16) were synthesized to identify the crucial residues for ribosome inactivation and antimicrobial activity.


There is an increasing demand for accurate and fast metagenome classifiers that can not only identify bacteria, but all members of a microbial community. We used a recently developed concept in read mapping to develop a highly accurate metagenomic classification pipeline named CCMetagen. The pipeline substantially outperforms other commonly used software in identifying bacteria and fungi and can efficiently use the entire NCBI nucleotide collection as a reference to detect species with incomplete genome data from all biological kingdoms.


Here, we report the detection of a novel alphavirus in Australian mosquitoes, provisionally named Yada Yada virus (YYV). Phylogenetic analysis indicated that YYV belongs to the mosquito-specific alphavirus complex. The assembled genome is 11,612 nucleotides in length and encodes two open reading frames.


A wealth of viral data sits untapped in publicly available metagenomic data sets when it might be extracted to create a usable index for the virological research community. We hypothesized that work of this complexity and scale could be done in a hackathon setting. Ten teams comprising over 40 participants from six countries assembled to create a crowd-sourced set of analysis and processing pipelines for a complex biological data set in a three-day event on the San Diego State University campus starting 9 January 2019.


Summary: Entrezpy is a Python library that automates the querying and downloading of data from the Entrez databases at National Center for Biotechnology Information by interacting with E-Utilities. Entrezpy implements complex queries by automatically creating E-Utility parameters from the results obtained that can then be used directly in subsequent queries. Entrezpy also allows the user to cache and retrieve results locally, implements interactions with all Entrez databases as part of an analysis pipeline and adjusts parameters within an ongoing query or using prior results.
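The idea of deriving follow-up query parameters from an earlier result can be illustrated without Entrezpy itself. The sketch below is not Entrezpy's API; it uses only the Python standard library and the documented E-Utilities history mechanism (`usehistory`, `WebEnv`, `query_key`). The function names, the batch size and the sample XML response are hypothetical and chosen for illustration only; no network request is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Public base URL of the NCBI E-Utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=0):
    """Build an ESearch URL that stores its result on the Entrez history server."""
    params = {"db": db, "term": term, "usehistory": "y", "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def efetch_params_from_esearch(db, esearch_xml, batch_size=500, rettype="fasta"):
    """Derive one EFetch parameter set per batch from an ESearch XML result.

    The WebEnv and QueryKey values returned by ESearch are reused so the
    follow-up queries do not need to resend the list of record IDs.
    """
    root = ET.fromstring(esearch_xml)
    count = int(root.findtext("Count"))
    webenv = root.findtext("WebEnv")
    query_key = root.findtext("QueryKey")
    return [
        {
            "db": db,
            "WebEnv": webenv,
            "query_key": query_key,
            "rettype": rettype,
            "retmode": "text",
            "retstart": start,
            "retmax": min(batch_size, count - start),
        }
        for start in range(0, count, batch_size)
    ]


# Hypothetical ESearch response, trimmed to the fields used above.
SAMPLE_ESEARCH_XML = (
    "<eSearchResult><Count>1200</Count><RetMax>0</RetMax><RetStart>0</RetStart>"
    "<QueryKey>1</QueryKey><WebEnv>MCID_example</WebEnv><IdList/></eSearchResult>"
)

if __name__ == "__main__":
    print(esearch_url("nucleotide", "Alphavirus[Organism]"))
    for params in efetch_params_from_esearch("nucleotide", SAMPLE_ESEARCH_XML):
        print(f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}")
```

Splitting the download into batches keyed on `retstart` and `retmax` keeps each request small; a library such as Entrezpy would additionally handle the requests, caching and error recovery that this sketch leaves out.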


Overlapping genes in viruses maximize the coding capacity of their genomes and allow the generation of new genes without major increases in genome size. Despite their importance, the evolution and function of overlapping genes are often not well understood, in part due to difficulties in their detection. In addition, most bioinformatic approaches for the detection of overlapping genes require the comparison of multiple genome sequences that may not be available in metagenomic surveys of virus biodiversity.


The large size and high complexity of biological data can represent a major methodological challenge for the analysis and exchange of data sets between computers and applications. There has also been a substantial increase in the amount of metadata associated with biological data sets, which is being increasingly incorporated into existing data formats. Despite the existence of structured formats based on XML, biological data sets are mainly formatted using unstructured file formats, and the incorporation of metadata results in increasingly complex parsing routines such that they become more error prone.


There is growing interest in characterizing the viromes of diverse mammalian species, particularly in the context of disease emergence. However, little is known about virome diversity in aquatic mammals, in part due to difficulties in sampling. We characterized the virome of the exhaled breath (or blow) of the Eastern Australian humpback whale (Megaptera novaeangliae).


Smallpox holds a unique position in the history of medicine. It was the first disease for which a vaccine was developed and remains the only human disease eradicated by vaccination. Although there have been claims of smallpox in Egypt, India, and China dating back millennia [1-4], the timescale of emergence of the causative agent, variola virus (VARV), and how it evolved in the context of increasingly widespread immunization, have proven controversial [4-9].


Current knowledge of RNA virus biodiversity is both biased and fragmentary, reflecting a focus on culturable or disease-causing agents. Here we profile the transcriptomes of over 220 invertebrate species sampled across nine animal phyla and report the discovery of 1,445 RNA viruses, including some that are sufficiently divergent to comprise new families. The identified viruses fill major gaps in the RNA virus phylogeny and reveal an evolutionary history that is characterized by both host switching and co-divergence.


The molecular basis of resistance and susceptibility of host plants to fire blight, a major disease threat to pome fruit production globally, is largely unknown. RNA-sequencing data from challenged and mock-inoculated flowers were analyzed to assess the susceptible response of apple to the fire blight pathogen Erwinia amylovora. In the presence of the pathogen, 1,080 transcripts were differentially expressed at 48 h post inoculation.


Why some viruses are enveloped while others lack an outer lipid bilayer is a major question in viral evolution but one that has received relatively little attention. The viral envelope serves several functions, including protecting the RNA or DNA molecule(s), evading recognition by the immune system, and facilitating virus entry. Despite these commonalities, viral envelopes come in a wide variety of shapes and configurations.


An epidemic of Ebola virus disease of unprecedented scale has been ongoing for more than a year in West Africa. As of 29 April 2015, there have been 26,277 reported total cases (of which 14,895 have been laboratory confirmed) resulting in 10,899 deaths. The source of the outbreak was traced to the prefecture of Guéckédou in the forested region of southeastern Guinea.


Background: CACTA elements are DNA transposons and are found in numerous organisms. Despite their low activity, several thousand copies can be identified in many genomes. CACTA elements transpose using a 'cut-and-paste' mechanism, which is facilitated by a DDE transposase.


Combining several different approaches, we have examined the structure, variability, and distribution of Tvv1 retrotransposons. Tvv1 is an unusual example of a low-copy retrotransposon metapopulation dispersed unevenly among very distant species and is promising for the development of molecular markers. Retrotransposons are ubiquitous throughout the genomes of the vascular plants, but individual retrotransposon families tend to be confined to the level of plant genus or at most family.


Wheat powdery mildew, Blumeria graminis forma specialis tritici, is a devastating fungal pathogen with a poorly understood evolutionary history. Here we report the draft genome sequence of wheat powdery mildew, the resequencing of three additional isolates from different geographic regions and comparative analyses with the barley powdery mildew genome. Our comparative genomic analyses identified 602 candidate effector genes, with many showing evidence of positive selection.


Agronomically important traits are frequently controlled by rare, genotype-specific alleles. Such genes can only be mapped in a population derived from the donor genotype. This requires the development of a specific genetic map, which is difficult in wheat because of the low level of polymorphism among elite cultivars.


Intergenic sequences evolve rapidly in plant genomes through a process known as genomic turnover. To investigate the influence of DNA transposons on genomic turnover, we compared 1 Mbp of orthologous genomic sequences from Brachypodium distachyon and Brachypodium sylvaticum. We found that B.
