Publications by authors named "Henning Hermjakob"

The human body contains trillions of cells, classified into specific cell types, with diverse morphologies and functions. In addition, cells of the same type can assume different states within an individual's body during their lifetime. Understanding the complexities of the proteome in the context of a human organism and its many potential states is a prerequisite for understanding human biology, but these complexities can neither be predicted from the genome, nor have they been systematically measurable with available technologies.

Article Synopsis
  • Biomedical research is increasingly integrating artificial intelligence (AI) and machine learning (ML) to tackle complex challenges, necessitating a focus on ethical and explainable AI (XAI) due to the complexities of deep learning methods.
  • The NIH's Bridge2AI program is working on creating new flagship datasets aimed at enhancing AI/ML applications in biomedicine while establishing best practices, tools, standards, and criteria for assessing the data's AI readiness, including legal and ethical considerations.
  • The article outlines foundational criteria developed by the NIH Bridge2AI Standards Working Group to ensure the scientific rigor and ethical use of AI in biomedical research, emphasizing the need for ongoing adaptation as the field evolves.

Macromolecular protein complexes carry out most functions in the cell, including essential functions required for cell survival. Unfortunately, we lack the subunit composition for all human protein complexes. To address this gap, we integrated >25,000 mass spectrometry experiments using a machine learning approach to identify >15,000 human protein complexes.

Foundation Models (FMs) are gaining increasing attention in the biomedical artificial intelligence (AI) ecosystem due to their ability to represent and contextualize multimodal biomedical data. These capabilities make FMs a valuable tool for a variety of tasks, including biomedical reasoning, hypothesis generation, and interpreting complex imaging data. In this review paper, we address the unique challenges associated with establishing an ethical and trustworthy biomedical AI ecosystem, with a particular focus on the development of FMs and their downstream applications.

The International Molecular Exchange Consortium (IMEx) has evolved into a vital partnership of open resources dedicated to curating molecular interaction data from the scientific literature. This consortium, which includes IntAct, MINT, MatrixDB, and DIP, is a collaborative effort with a central mission of aggregating detailed molecular interaction experimental evidence in a machine-readable format, supported by controlled vocabularies and standard ontologies. The IntAct molecular interaction database (www.

Motivation: Chemical reaction networks (CRNs) play a pivotal role in diverse fields such as systems biology, biochemistry, chemical engineering, and epidemiology. High-level definitions of CRNs enable the use of various simulation approaches, including deterministic and stochastic methods, from the same model. However, existing Python tools for the simulation of CRNs typically wrap external C/C++ libraries for model definition, translation into equations, and/or numerically solving them, limiting their extensibility and integration with the broader Python ecosystem.
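
As a rough illustration of the idea in this abstract (one high-level reaction definition driving both a deterministic and a stochastic simulation), a minimal pure-Python sketch might look like the following. It is not the tool described in the article. The toy reaction A -> B, the rate constant 0.5, all function names, the fixed-step Euler integrator, and the Gillespie-style stochastic loop are assumptions chosen purely for illustration.

```python
import random

# Hypothetical toy network (not from the article): A -> B with rate constant 0.5.
# Each reaction is (rate_constant, reactant_species, product_species); every listed
# species is consumed or produced exactly once per firing.
REACTIONS = [(0.5, ["A"], ["B"])]


def mass_action_rate(reaction, state):
    """Rate constant times the amounts of all reactants (stoichiometry 1 only)."""
    rate, reactants, _ = reaction
    for species in reactants:
        rate *= state[species]
    return rate


def fire(reaction, state, amount):
    """Move `amount` of material from the reactants to the products, in place."""
    _, reactants, products = reaction
    for species in reactants:
        state[species] -= amount
    for species in products:
        state[species] += amount


def simulate_deterministic(reactions, initial, t_end, dt=0.001):
    """Integrate the mass-action rate equations with fixed-step explicit Euler."""
    state = dict(initial)
    for _ in range(round(t_end / dt)):
        # Evaluate all rates first so every reaction sees the same state.
        rates = [mass_action_rate(r, state) for r in reactions]
        for reaction, rate in zip(reactions, rates):
            fire(reaction, state, rate * dt)
    return state


def simulate_stochastic(reactions, initial, t_end, seed=0):
    """Gillespie-style simulation: one discrete reaction event at a time."""
    rng = random.Random(seed)
    state, t = dict(initial), 0.0
    while True:
        propensities = [mass_action_rate(r, state) for r in reactions]
        total = sum(propensities)
        if total <= 0:  # nothing left that can react
            break
        t += rng.expovariate(total)  # exponentially distributed waiting time
        if t > t_end:
            break
        # Pick the next reaction with probability proportional to its propensity.
        reaction = rng.choices(reactions, weights=propensities)[0]
        fire(reaction, state, 1)
    return state


initial = {"A": 100, "B": 0}
ode_state = simulate_deterministic(REACTIONS, initial, t_end=2.0)
ssa_state = simulate_stochastic(REACTIONS, initial, t_end=2.0)
print(ode_state, ssa_state)
```

Both simulators read the same `REACTIONS` list, which is the point being illustrated: the model is defined once and the choice of simulation method is made afterwards. The stochastic result depends on the seed, while the deterministic one should approach 100·exp(-1) molecules of A at t = 2 for this toy model.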

Motivation: ReactomeGSA is part of the Reactome knowledgebase and one of the leading multi-omics pathway analysis platforms. ReactomeGSA provides access to quantitative pathway analysis methods supporting different 'omics data types. Additionally, ReactomeGSA can process different datasets simultaneously, leading to a comparative pathway analysis that can also be performed across different species.

Motivation: Data reuse is a common and vital practice in molecular biology and enables the knowledge gathered over recent decades to drive discovery and innovation in the life sciences. Much of this knowledge has been collated into molecular biology databases, such as UniProtKB, and these resources derive enormous value from sharing data among themselves. However, quantifying and documenting this kind of data reuse remains a challenge.

Germline and somatic mutations can give rise to proteins with altered activity, including both gain and loss-of-function. The effects of these variants can be captured in disease-specific reactions and pathways that highlight the resulting changes to normal biology. A disease reaction is defined as an aberrant reaction in which a variant protein participates.

Given the rapid advancement and ubiquity of generative AI, particularly ChatGPT, a chatbot using large language models like GPT, we explored the potential application of ChatGPT in the data collection and annotation stages of the Reactome curation process. This exploration aimed to create an automated or semi-automated framework to mitigate the extensive manual effort traditionally required for gathering and annotating information pertaining to biological pathways, adopting a Reactome "reaction-centric" approach. In this pilot study, we used ChatGPT/GPT4 to address gaps in the pathway annotation and enrichment in parallel with the conventional manual curation process.

The Reactome Knowledgebase (https://reactome.org), an ELIXIR and GCBR core biological data resource, provides manually curated molecular details of a broad range of normal and disease-related biological processes. Processes are annotated as an ordered network of molecular transformations in a single consistent data model.

Disease variant annotation in the context of biological reactions and pathways can provide a standardized overview of molecular phenotypes of pathogenic mutations that is amenable to computational mining and mathematical modeling. Reactome, an open source, manually curated, peer-reviewed database of human biological pathways, provides annotations for over 4000 disease variants of close to 400 genes in the context of ∼800 disease reactions constituting ∼400 disease pathways. Functional annotation of disease variants proceeds from normal gene functions, through disease variants whose divergence from normal molecular behaviors has been experimentally verified, to extrapolation from molecular phenotypes of characterized variants to variants of unknown significance using criteria of the American College of Medical Genetics and Genomics (ACMG).

In recent years, a huge amount of data on ncRNA interactions has been described in scientific papers and databases. Although considerable effort has been made to annotate the available knowledge in public repositories, there are still significant discrepancies in how different resources capture and interpret data on ncRNA functional and physical associations. Here, we present a collection of microRNA-mRNA interactions annotated from the scientific literature following recognized standard criteria and focused, as a case study, on microRNAs that regulate genes associated with rare diseases.

Rare genetic diseases affect millions, and identifying causal DNA variants is essential for patient care. Therefore, it is imperative to estimate the effect of each independent variant and improve their pathogenicity classification. Our study of 140 214 unrelated UK Biobank (UKB) participants found that each of them carries a median of 7 variants previously reported as pathogenic or likely pathogenic.

Understudied or dark proteins have the potential to shed light on as-yet undiscovered molecular mechanisms that underlie phenotypes and suggest innovative therapeutic approaches for many diseases. The Reactome-IDG (Illuminating the Druggable Genome) project aims to place dark proteins in the context of manually curated, highly reliable pathways in Reactome, the most comprehensive, open-source biological pathway knowledgebase, facilitating the understanding of the functions and the prediction of the therapeutic potential of dark proteins. The Reactome-IDG web portal, deployed at https://idg.

Limited knowledge about a substantial portion of protein coding genes, known as "dark" proteins, hinders our understanding of their functions and potential therapeutic applications. To address this, we leveraged Reactome, the most comprehensive, open source, open-access pathway knowledgebase, to contextualize dark proteins within biological pathways. By integrating multiple resources and employing a random forest classifier trained on 106 protein/gene pairwise features, we predicted functional interactions between dark proteins and Reactome-annotated proteins.

Article Synopsis
  • Understanding autoimmunity to the body's own proteins is key for diagnosing and treating autoimmune diseases, prompting the development of the AAgAtlas portal which lists over 8,000 autoantigens linked to various human diseases.
  • The portal allows users to explore the properties and characteristics of these autoantigens, highlighting their evolutionary conservation and common features like hydrophilic amino acids that are often found on protein surfaces.
  • Findings indicate that the production of antibodies targeting these autoantigens is related to genetic variations and abnormal protein expression in diseases, aiding in the identification of potential biomarkers for autoimmune conditions.

Pathway databases provide descriptions of the roles of proteins, nucleic acids, lipids, carbohydrates, and other molecular entities within their biological cellular contexts. Pathway-centric views of these roles may allow for the discovery of unexpected functional relationships in data such as gene expression profiles and somatic mutation catalogues from tumor cells. For this reason, there is a high demand for high-quality pathway databases and their associated tools.

Article Synopsis
  • Interacting proteins often serve similar functions, affecting the same traits in organisms, and analysis of interaction networks helps identify genes associated with various traits.
  • Through network-based analysis of 1,002 human traits, researchers were able to recover known disease genes and drug targets, revealing that certain traits may share genetic processes.
  • The study identified 73 pleiotropic gene modules related to multiple traits, especially those involved in protein ubiquitination and RNA processing, and demonstrated their relevance in understanding diseases and potential drug repurposing.

In this white paper, we describe the founding of a new ELIXIR Community - the Systems Biology Community - and its proposed future contributions to both ELIXIR and the broader community of systems biologists in Europe and worldwide. The Community believes that the infrastructure aspects of systems biology - databases, (modelling) tools and standards development, as well as training and access to cloud infrastructure - are not only appropriate components of the ELIXIR infrastructure, but will prove key components of ELIXIR's future support of advanced biological applications and personalised medicine. By way of a series of meetings, the Community identified seven key areas for its future activities, reflecting both future needs and previous and current activities within ELIXIR Platforms and Communities.

The Human Proteome Organization (HUPO) Proteomics Standards Initiative (PSI) has been successfully developing guidelines, data formats, and controlled vocabularies (CVs) for the proteomics community and other fields supported by mass spectrometry since its inception 20 years ago. Here we describe the general operation of the PSI, including its leadership, working groups, yearly workshops, and the document process by which proposals are thoroughly and publicly reviewed in order to be ratified as PSI standards. We briefly describe the current state of the many existing PSI standards, some of which remain the same as when originally developed, some of which have undergone subsequent revisions, and some of which have become obsolete.

As an alternative to one drug-one target approaches, systems biology methods can provide deeper insight into the holistic effects of drugs. Network-based approaches are systems biology tools that can be valuable for visualizing and analysing drug-protein and protein-protein interactions. In this study, a KNIME workflow is presented which connects drugs to causal target proteins and target proteins to their causal protein interactors.

Article Synopsis
  • Computational models in systems biology are used to understand the dynamic behaviors of complex biological systems.
  • As the number of these models increases, enhancing their reusability and ability to reproduce experiments becomes essential, requiring proper model annotation.
  • Recent initiatives aim to establish a standardized framework for making computational models in biology more accessible, reproducible, and interoperable, while also addressing existing challenges in the field.

Synopsis of recent research by authors named "Henning Hermjakob"

  • Henning Hermjakob's recent research focuses on the application of artificial intelligence (AI) and machine learning (ML) in biomedical contexts, exploring best practices for ethical AI integration and how these technologies can enhance understanding in complex biomedical challenges.
  • He has contributed significantly to advancing knowledge of human protein complexes through innovative data integration methods, successfully employing machine learning to analyze over 25,000 proteomic experiments to characterize more than 15,000 human protein complexes.
  • Additionally, Hermjakob's work emphasizes the importance of data reuse in molecular biology, developing tools like ReactomeGSA and creating frameworks for the ethical application of generative AI in data curation processes, thus facilitating greater accessibility and understanding of biomedical data.