Publications by authors named "Satria Kautsar"

Secondary metabolites are small molecules produced by all corners of life, often with specialized bioactive functions with clinical and environmental relevance. Secondary metabolite biosynthetic gene clusters (BGCs) can often be identified within DNA sequences by various sequence similarity tools, but determining the exact functions of genes in the pathway and predicting their chemical products can often only be done by careful, manual comparative analysis. To facilitate this, we report the first release of the secondary metabolism collaboratory (SMC), which aims to provide a comprehensive, tool-agnostic repository of BGC sequence data drawn from all publicly available and user-submitted bacterial and archaeal genome and contig sources.

View Article and Find Full Text PDF

Secondary metabolites are compounds not essential for an organism's development, but provide significant ecological and physiological benefits. These compounds have applications in medicine, biotechnology and agriculture. Their production is encoded in biosynthetic gene clusters (BGCs), groups of genes collectively directing their biosynthesis.

View Article and Find Full Text PDF

Actinobacteria, the bacterial phylum most renowned for natural product discovery, has been established as a valuable source for drug discovery and biotechnology but is underrepresented within accessible genome and strain collections. Herein, we introduce the Natural Products Discovery Center (NPDC), featuring 122,449 strains assembled over eight decades, the genomes of the first 8490 NPDC strains (7142 Actinobacteria), and the online NPDC Portal making both strains and genomes publicly available. A comparative survey of RefSeq and NPDC Actinobacteria highlights the taxonomic and biosynthetic diversity within the NPDC collection, including three new genera, hundreds of new species, and ~7000 new gene cluster families.

View Article and Find Full Text PDF

Developments in computational omics technologies have provided new means to access the hidden diversity of natural products, unearthing new potential for drug discovery. In parallel, artificial intelligence approaches such as machine learning have led to exciting developments in the computational drug design field, facilitating biological activity prediction and de novo drug design for molecular targets of interest. Here, we describe current and future synergies between these developments to effectively identify drug candidates from the plethora of molecules produced by nature.

View Article and Find Full Text PDF

Microbial specialised metabolism is full of valuable natural products that are applied clinically, agriculturally, and industrially. The genes that encode their biosynthesis are often physically clustered on the genome in biosynthetic gene clusters (BGCs). Many BGCs consist of multiple groups of co-evolving genes called sub-clusters that are responsible for the biosynthesis of a specific chemical moiety in a natural product.

View Article and Find Full Text PDF

With an ever-increasing amount of (meta)genomic data being deposited in sequence databases, (meta)genome mining for natural product biosynthetic pathways occupies a critical role in the discovery of novel pharmaceutical drugs, crop protection agents and biomaterials. The genes that encode these pathways are often organised into biosynthetic gene clusters (BGCs). In 2015, we defined the Minimum Information about a Biosynthetic Gene cluster (MIBiG): a standardised data format that describes the minimally required information to uniquely characterise a BGC.

View Article and Find Full Text PDF

Natural microbial communities are phylogenetically and metabolically diverse. In addition to underexplored organismal groups, this diversity encompasses a rich discovery potential for ecologically and biotechnologically relevant enzymes and biochemical compounds. However, studying this diversity to identify genomic pathways for the synthesis of such compounds and assigning them to their respective hosts remains challenging.

View Article and Find Full Text PDF

Bacterial specialized metabolites are a proven source of antibiotics and cancer therapies, but whether we have sampled all the secondary metabolite chemical diversity of cultivated bacteria is not known. We analysed ~170,000 bacterial genomes and ~47,000 metagenome assembled genomes (MAGs) using a modified BiG-SLiCE and the new clust-o-matic algorithm. We estimate that only 3% of the natural products potentially encoded in bacterial genomes have been experimentally characterized.

View Article and Find Full Text PDF

Many bioactive plant cyclic peptides form side-chain-derived macrocycles. Lyciumins, cyclic plant peptides with tryptophan macrocyclizations, are ribosomal peptides (RiPPs) originating from repetitive core peptide motifs in precursor peptides with plant-specific BURP (BNM2, USP, RD22 and PG1beta) domains, but the biosynthetic mechanism for their formation has remained unknown. Here, we characterize precursor-peptide BURP domains as copper-dependent autocatalytic peptide cyclases and use a combination of tandem mass spectrometry-based metabolomics and plant genomics to systematically discover five BURP-domain-derived plant RiPP classes, with mono- and bicyclic structures formed via tryptophans and tyrosines, from botanical collections.

View Article and Find Full Text PDF

Background: Genome mining for biosynthetic gene clusters (BGCs) has become an integral part of natural product discovery. The >200,000 microbial genomes now publicly available hold information on abundant novel chemistry. One way to navigate this vast genomic diversity is through comparative analysis of homologous BGCs, which allows identification of cross-species patterns that can be matched to the presence of metabolites or biological activities.

View Article and Find Full Text PDF

Microorganisms produce natural products that are frequently used in the development of antibacterial, antiviral, and anticancer drugs, pesticides, herbicides, or fungicides. In recent years, genome mining has evolved into a prominent method to access this potential. antiSMASH is one of the most popular tools for this task.

View Article and Find Full Text PDF

Computational analysis of biosynthetic gene clusters (BGCs) has revolutionized natural product discovery by enabling the rapid investigation of secondary metabolic potential within microbial genome sequences. Grouping homologous BGCs into Gene Cluster Families (GCFs) facilitates mapping their architectural and taxonomic diversity and provides insights into the novelty of putative BGCs, through dereplication with BGCs of known function. While multiple databases exist for exploring BGCs from publicly available data, no public resources exist that focus on GCF relationships.

View Article and Find Full Text PDF

Covering: 2010-2020The digital revolution is driving significant changes in how people store, distribute, and use information. With the advent of new technologies around linked data, machine learning and large-scale network inference, the natural products research field is beginning to embrace real-time sharing and large-scale analysis of digitized experimental data. Databases play a key role in this, as they allow systematic annotation and storage of data for both basic and advanced applications.

View Article and Find Full Text PDF

Genome mining has become a key technology to exploit natural product diversity. Although initially performed on a single-genome basis, the process is now being scaled up to mine entire genera, strain collections and microbiomes. However, no bioinformatic framework is currently available for effectively analyzing datasets of this size and complexity.

View Article and Find Full Text PDF

Fueled by the explosion of (meta)genomic data, genome mining of specialized metabolites has become a major technology for drug discovery and studying microbiome ecology. In these efforts, computational tools like antiSMASH have played a central role through the analysis of Biosynthetic Gene Clusters (BGCs). Thousands of candidate BGCs from microbial genomes have been identified and stored in public databases.

View Article and Find Full Text PDF

Plants produce a vast diversity of specialized metabolites, which play important roles in the interactions with their microbiome, as well as with animals and other plants. Many such molecules have valuable biological activities that render them (potentially) useful as medicines, flavors and fragrances, nutritional ingredients, or cosmetics. Recently, plant scientists have discovered that the genes for many biosynthetic pathways for the production of such specialized metabolites are physically clustered on the chromosome within biosynthetic gene clusters (BGCs).

View Article and Find Full Text PDF

Sesterterpenoids are a rare terpene class harboring untapped chemodiversity and bioactivities. Their structural diversity originates primarily from the scaffold-generating sesterterpene synthases (STSs). In fungi, all six known STSs are bifunctional, containing C-terminal -prenyltransferase (PT) and N-terminal terpene synthase (TPS) domains.

View Article and Find Full Text PDF

Many antibiotics, chemotherapeutics, crop protection agents and food preservatives originate from molecules produced by bacteria, fungi or plants. In recent years, genome mining methodologies have been widely adopted to identify and characterize the biosynthetic gene clusters encoding the production of such compounds. Since 2011, the 'antibiotics and secondary metabolite analysis shell-antiSMASH' has assisted researchers in efficiently performing this, both as a web server and a standalone tool.

View Article and Find Full Text PDF

Plant specialized metabolites are chemically highly diverse, play key roles in host-microbe interactions, have important nutritional value in crops and are frequently applied as medicines. It has recently become clear that plant biosynthetic pathway-encoding genes are sometimes densely clustered in specific genomic loci: biosynthetic gene clusters (BGCs). Here, we introduce plantiSMASH, a versatile online analysis platform that automates the identification of candidate plant BGCs.

View Article and Find Full Text PDF