Specialized or secondary metabolites are small molecules of biological origin, often showing potent biological activities with applications in agriculture, engineering and medicine. Usually, the biosynthesis of these natural products is governed by sets of co-regulated and physically clustered genes known as biosynthetic gene clusters (BGCs). To share information about BGCs in a standardized and machine-readable way, the Minimum Information about a Biosynthetic Gene cluster (MIBiG) data standard and repository was initiated in 2015.
Covering: 2014 to 2023 for metabolomics, 2002 to 2023 for information visualization. LC-MS/MS-based untargeted metabolomics is a rapidly developing research field spawning increasing numbers of computational metabolomics tools assisting researchers with their complex data processing, analysis, and interpretation tasks. In this article, we review the entire untargeted metabolomics workflow from the perspective of information visualization, visual analytics and visual data integration. Data visualization is a crucial step at every stage of the metabolomics workflow, where it provides core components of data inspection, evaluation, and sharing capabilities.
The discovery and identification of molecules in biological and environmental samples is crucial for advancing biomedical and chemical sciences. Tandem mass spectrometry (MS/MS) is the leading technique for high-throughput elucidation of molecular structures. However, decoding a molecular structure from its mass spectrum is exceptionally challenging, even when performed by human experts.
Plants have evolved complex bouquets of specialized natural products that are utilized in medicine, agriculture, and industry. Untargeted natural product discovery has benefitted from growing plant omics data resources. Yet, plant genome complexity limits the identification and curation of biosynthetic pathways via single omics.
Natural products are a sustainable resource for drug discovery, but their identification in complex mixtures remains a daunting task. We present an automated pipeline that compares, harmonizes and ranks the annotations of LC-HRMS data by different tools. When applied to 7,400 extracts derived from 6,566 strains belonging to 86 actinomycete genera, it yielded 150,000 molecules after processing over 50 million MS features.
Summary: Computational metabolomics workflows have revolutionized the untargeted metabolomics field. However, the organization and prioritization of metabolite features remains a laborious process. Organizing metabolomics data is often done through mass fragmentation-based spectral similarity grouping, resulting in feature sets that also represent an intuitive and scientifically meaningful first stage of analysis in untargeted metabolomics.
Feature-based molecular networking (FBMN) is a popular analysis approach for liquid chromatography-tandem mass spectrometry-based non-targeted metabolomics data. While processing liquid chromatography-tandem mass spectrometry data through FBMN is fairly streamlined, downstream data handling and statistical interrogation are often a key bottleneck. Users new to statistical analysis, in particular, struggle to effectively handle and analyze complex data matrices.
Artificial intelligence (AI) is accelerating how we conduct science, from folding proteins with AlphaFold and summarizing literature findings with large language models, to annotating genomes and prioritizing newly generated molecules for screening using specialized software. However, the application of AI to emulate human cognition in natural product research and its subsequent impact has so far been limited. One reason for this limited impact is that available natural product data is multimodal, unbalanced, unstandardized, and scattered across many data repositories.
The cell painting (CP) assay has emerged as a potent imaging-based high-throughput phenotypic profiling (HTPP) tool that provides comprehensive input data for prediction of compound activities and potential hazards in drug discovery and toxicology. CP enables the rapid, multiplexed investigation of various molecular mechanisms for thousands of compounds at the single-cell level. The resulting large volumes of image data provide great opportunities but also pose challenges to image and data analysis routines as well as property prediction models.
Mass spectral libraries have proven to be essential for mass spectrum annotation, both for library matching and training new machine learning algorithms. A key step in training machine learning models is the availability of high-quality training data. Public libraries of mass spectrometry data that are open to user submission often suffer from limited metadata curation and harmonization.
Introduction: The chemical classification of Cannabis is typically confined to the cannabinoid content, whilst Cannabis encompasses diverse chemical classes that vary in abundance among all its varieties. Hence, neglecting other chemical classes within Cannabis strains results in a restricted and biased comprehension of elements that may contribute to chemical intricacy and the resultant medicinal qualities of the plant.
Objectives: Thus, herein, we report a computational metabolomics study to elucidate the Cannabis metabolic map beyond the cannabinoids.
Effective visualization of small molecules is paramount in conveying concepts and results in cheminformatics. Scalable vector graphics (SVG) are preferred for creating such visualizations, as SVGs can be easily altered in post-production and exported to other formats. A wide spectrum of software applications already exist that can visualize molecules, and customize these visualizations, in many ways.
Untargeted mass spectrometry (MS) experiments produce complex, multidimensional data that are practically impossible to investigate manually. For this reason, computational pipelines are needed to extract relevant information from raw spectral data and convert it into a more comprehensible format. Depending on the sample type and/or goal of the study, a variety of MS platforms can be used for such analysis.
Plant specialized metabolites have diversified vastly over the course of plant evolution, and they are considered key players in complex interactions between plants and their environment. The chemical diversity of these metabolites has been widely explored and utilized in agriculture and crop enhancement, the food industry, and drug development, among other areas. However, the immensity of the plant metabolome can make its exploration challenging.
Untargeted metabolomics promises comprehensive characterization of small molecules in biological samples. However, the field is hampered by low annotation rates and abstract spectral data. Despite recent advances in computational metabolomics, manual annotations and manual confirmation of in-silico annotations remain important in the field.
Despite the increasing availability of tandem mass spectrometry (MS/MS) community spectral libraries for untargeted metabolomics over the past decade, the majority of acquired MS/MS spectra remain uninterpreted. To further aid in interpreting unannotated spectra, we created a nearest neighbor suspect spectral library, consisting of 87,916 annotated MS/MS spectra derived from hundreds of millions of MS/MS spectra originating from published untargeted metabolomics experiments. Entries in this library, or "suspects," were derived from unannotated spectra that could be linked in a molecular network to an annotated spectrum.
Developments in computational omics technologies have provided new means to access the hidden diversity of natural products, unearthing new potential for drug discovery. In parallel, artificial intelligence approaches such as machine learning have led to exciting developments in the computational drug design field, facilitating biological activity prediction and de novo drug design for molecular targets of interest. Here, we describe current and future synergies between these developments to effectively identify drug candidates from the plethora of molecules produced by nature.
UV-B radiation regulates numerous morphogenic, biochemical and physiological responses in plants, and can stimulate some responses typically associated with other abiotic and biotic stimuli, including invertebrate herbivory. Removal of UV-B from the growing environment of various plant species has been found to increase their susceptibility to consumption by invertebrate pests; however, to date, little research has been conducted to investigate the effects of UV-B on crop susceptibility to field pests. Here, we report findings from a multi-omic and genetic-based study investigating the mechanisms of UV-B-stimulated resistance of the crop, Brassica napus (oilseed rape), to herbivory from an economically important lepidopteran specialist of the Brassicaceae, Plutella xylostella (diamondback moth).
Ribosomally synthesized and post-translationally modified peptides (RiPPs) are a chemically diverse class of metabolites. Many RiPPs show potent biological activities that make them attractive starting points for drug development. A promising approach for the discovery of new classes of RiPPs is genome mining.
Fungus-growing ants depend on a fungal mutualist that can fall prey to fungal pathogens. This mutualist is cultivated by these ants in structures called fungus gardens. Ants exhibit weeding behaviors that keep their fungus gardens healthy by physically removing compromised pieces.
Metabolomics-driven discoveries of biological samples remain hampered by the grand challenge of metabolite annotation and identification. Only a few metabolites have an annotated spectrum in spectral libraries; hence, searching only for exact library matches generally returns few hits. An attractive alternative is searching for so-called analogues as a starting point for structural annotations; analogues are library molecules which are not exact matches but display a high chemical similarity.