Publications by authors named "Domingo Fernandez"

Background: Inflammatory Bowel Disease (IBD), comprising Crohn's Disease (CD) and Ulcerative Colitis (UC), is a complex condition with diverse manifestations; recent advances in multi-omics technologies are helping researchers unravel its molecular characteristics to develop targeted treatments.

Objective: In this work, we explored one of the largest multi-omics cohorts in Inflammatory Bowel Disease, the Study of a Prospective Adult Research Cohort (SPARC IBD), with the goal of identifying predictive biomarkers for CD and UC and elucidating patient subtypes.

Design: We analyzed genomics, transcriptomics (gut biopsy samples), and proteomics (blood plasma) data from hundreds of patients in SPARC IBD.

Ion Mobility coupled with Mass Spectrometry (IM-MS) is a promising analytical technique that enhances molecular characterization by measuring collision cross-section (CCS) values, which are indicative of the molecular size and shape. However, the effective application of CCS values in structural analysis is still constrained by the limited availability of experimental data, necessitating the development of accurate machine learning (ML) models for in silico predictions. In this study, we evaluated state-of-the-art Graph Neural Networks (GNNs), trained to predict CCS values using the largest publicly available dataset to date.

Natural products (NPs) or their derivatives represent a large proportion of drugs that successfully progress through clinical trials to approval. This study explores the presence of NPs in both early- and late-stage drug discovery to determine their success rate, and the factors or features of natural products that contribute to such success. As a proxy for early drug development stages, we analyzed patent applications over several decades, finding a consistent proportion of NP, NP-derived, and synthetic-compound-based patent documents, with the latter group outnumbering NP and NP-derived ones (approximately 77% vs 23%).

Biomedical entity linking (BioEL) is the process of connecting entities referenced in documents to entries in biomedical databases such as the Unified Medical Language System (UMLS) or Medical Subject Headings (MeSH). The study objective was to comprehensively evaluate nine recent state-of-the-art biomedical entity linking models under a unified framework. We compare these models along axes of (1) accuracy, (2) speed, (3) ease of use, (4) generalization, and (5) adaptability to new ontologies and datasets.

Calculating spectral similarity is a fundamental step in MS/MS data analysis in untargeted metabolomics experiments, as it facilitates the identification of related spectra and the annotation of compounds. To improve matching accuracy when querying an experimental mass spectrum against a spectral library, previous approaches have proposed increasing peak intensities for high m/z ranges. Peaks at these high m/z values tend to be lower in intensity, yet they offer more crucial information for identifying the chemical structure.
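
As a rough illustration of the idea described in this abstract, the sketch below scales each peak by a power of its m/z before computing a cosine score, so that matching high-mass peaks count for more. It is not the authors' method: the function names, the exponents (`mz_power`, `intensity_power`), the tolerance, and the greedy peak matching are all assumptions chosen for the example.

```python
import math


def reweight(peaks, mz_power=1.0, intensity_power=0.5):
    """Return (m/z, weight) pairs with weight = (m/z)^mz_power * intensity^intensity_power.

    The exponents are illustrative assumptions, not values from the abstract.
    """
    return [(mz, (mz ** mz_power) * (inten ** intensity_power)) for mz, inten in peaks]


def cosine_similarity(spec_a, spec_b, tolerance=0.01):
    """Cosine score between two peak lists of (m/z, weight) pairs.

    Each peak in spec_a is greedily matched to the closest unused peak in
    spec_b whose m/z lies within `tolerance`; unmatched peaks contribute
    only to the norms.
    """
    used = set()
    dot = 0.0
    for mz_a, w_a in spec_a:
        best, best_diff = None, tolerance
        for j, (mz_b, _) in enumerate(spec_b):
            diff = abs(mz_a - mz_b)
            if j not in used and diff <= best_diff:
                best, best_diff = j, diff
        if best is not None:
            used.add(best)
            dot += w_a * spec_b[best][1]
    norm_a = math.sqrt(sum(w * w for _, w in spec_a))
    norm_b = math.sqrt(sum(w * w for _, w in spec_b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical spectra: they share a weak high-m/z peak but differ at low m/z.
query = [(50.0, 100.0), (300.0, 10.0)]
library = [(60.0, 100.0), (300.0, 10.0)]

plain_score = cosine_similarity(query, library)
weighted_score = cosine_similarity(reweight(query), reweight(library))
print(f"unweighted: {plain_score:.3f}, reweighted: {weighted_score:.3f}")
```

In this toy case the shared high-m/z peak is weak, so the unweighted score is dominated by the mismatched low-m/z peaks, while the reweighted score gives the shared peak more influence.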

Background: While the amyloid/tau/neurodegeneration (ATN) framework has found wide application in Alzheimer's disease research, it is unclear if thresholds obtained using distinct thresholding methods are concordant within the same dataset and interchangeable across cohorts.

Objectives: To investigate the robustness of data-driven thresholding methods and ATN profiling across cohort datasets.

Design and Setting: We evaluated the impact of thresholding methods on ATN profiles by applying five commonly used methodologies across cohort datasets.

Objectives: Knowledge graphs and ontologies in the biomedical domain provide rich contextual knowledge for a variety of challenges. Employing this knowledge in knowledge-driven NLP tasks such as gene-disease association prediction is a promising way to increase the predictive power of a model.

Methods: We investigated the power of infusing the embeddings of two aligned ontologies into NLP models as prior knowledge.

Plants are one of the primary sources of natural products for drug development. However, despite centuries of research, only a limited region of the phytochemical space has been studied. To understand the scope of what is explored versus unexplored in the phytochemical space, we begin by reconstructing the known chemical space of the plant kingdom, mapping the distribution of secondary metabolites, chemical classes, and plants traditionally used for medicinal purposes (i.

For millennia, numerous cultures and civilizations have relied on traditional remedies derived from plants to treat a wide range of conditions and ailments. Here, we systematically analyzed ethnobotanical patterns across taxonomically related plants, demonstrating that congeneric medicinal plants are more likely to be used for treating similar indications. Next, we reconstructed the phytochemical space covered by medicinal plants to reveal that (i) taxonomically related medicinal plants cover a similar phytochemical space, and (ii) chemical similarity correlates with similar therapeutic usage.

Adverse drug events constitute a major challenge for the success of clinical trials. Several computational strategies have been suggested to estimate the risk of adverse drug events in preclinical drug development. While these approaches have demonstrated high utility in practice, they are at the same time limited to specific information sources.

Better understanding the transcriptomic response produced by a compound perturbing its targets can shed light on the underlying biological processes regulated by the compound. However, establishing the relationship between the induced transcriptomic response and the target of a compound is non-trivial, partly because targets are rarely differentially expressed. Therefore, connecting both modalities requires orthogonal information (e.

Schizophrenia and bipolar disorder are characterized by highly similar neuropsychological signatures, implying shared neurobiological mechanisms between these two disorders. These disorders also have comorbidities, such as type 2 diabetes mellitus (T2DM). To date, an understanding of the mechanisms that mediate the link between these two disorders remains incomplete.

The standardized identification of biomedical entities is a cornerstone of interoperability, reuse, and data integration in the life sciences. Several registries have been developed to catalog resources maintaining identifiers for biomedical entities such as small molecules, proteins, cell lines, and clinical trials. However, existing registries have struggled to provide sufficient coverage and metadata standards that meet the evolving needs of modern life sciences researchers.

Recent advances in Knowledge Graphs (KGs) and Knowledge Graph Embedding Models (KGEMs) have led to their adoption in a broad range of fields and applications. The current publishing system in machine learning requires newly introduced KGEMs to achieve state-of-the-art performance, surpassing at least one benchmark in order to be published. Despite this, dozens of novel architectures are published every year, making it challenging for users, even within the field, to deduce the most suitable configuration for a given application.

Excess labile heme, occurring under hemolytic conditions, acts as a versatile modulator in the blood coagulation system. As such, heme provokes prothrombotic states, either by binding to plasma proteins or through interaction with participating cell types. However, despite several independent reports on these effects, apparently contradictory observations and significant knowledge gaps characterize this relationship, which hampers a complete understanding of heme-driven coagulopathies and the development of suitable and specific treatment options.

Background: A polypill that includes key medications associated with improved outcomes (aspirin, angiotensin-converting-enzyme [ACE] inhibitor, and statin) has been proposed as a simple approach to the secondary prevention of cardiovascular death and complications after myocardial infarction.

Methods: In this phase 3, randomized, controlled clinical trial, we assigned patients with myocardial infarction within the previous 6 months to a polypill-based strategy or usual care. The polypill treatment consisted of aspirin (100 mg), ramipril (2.

Distinct gene expression patterns within cells are foundational for the diversity of functions and unique characteristics observed in specific contexts, such as human tissues and cell types. Though some biological processes commonly occur across contexts, by harnessing the vast amounts of available gene expression data, we can decipher the processes that are unique to a specific context. Therefore, with the goal of developing a portrait of context-specific patterns to better elucidate how they govern distinct biological processes, this work presents a large-scale exploration of transcriptomic signatures across three different contexts (i.

Motivation: The importance of clinical data in understanding the pathophysiology of complex disorders has prompted the launch of multiple initiatives designed to generate patient-level data from various modalities. While these studies can reveal important findings relevant to the disease, each study captures different yet complementary aspects and modalities which, when combined, generate a more comprehensive picture of disease etiology. However, achieving this requires a global integration of data across studies, which proves to be challenging given the lack of interoperability of cohort datasets.

Background: Currently, Alzheimer's disease (AD) cohort datasets are difficult to find and lack across-cohort interoperability, and the actual content of publicly available datasets often only becomes clear to third-party researchers once data access has been granted. These aspects severely hinder the advancement of AD research through emerging data-driven approaches such as machine learning and artificial intelligence, and they bias current data-driven findings towards the few commonly used, well-explored AD cohorts. To achieve robust and generalizable results, validation across multiple datasets is crucial.

Pathway enrichment analysis has become a widely used knowledge-based approach for the interpretation of biomedical data. Its popularity has led to an explosion of both enrichment methods and pathway databases. While the elegance of pathway enrichment lies in its simplicity, multiple factors can impact the results of such an analysis, which may not be accounted for.
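
To make concrete how one such factor can shift results, the sketch below implements over-representation analysis with a one-sided hypergeometric test and varies only the size of the background gene set. The abstract does not say which enrichment methods or factors the study examines, so the choice of this test, the function name `ora_p_value`, and all gene counts are assumptions made for illustration.

```python
from math import comb


def ora_p_value(hits, query_size, pathway_size, background_size):
    """One-sided hypergeometric p-value for over-representation.

    Probability of observing at least `hits` pathway genes when drawing
    `query_size` genes without replacement from a background of
    `background_size` genes, of which `pathway_size` belong to the pathway.
    """
    upper = min(query_size, pathway_size)
    total = comb(background_size, query_size)
    tail = sum(
        comb(pathway_size, i) * comb(background_size - pathway_size, query_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical numbers: the same 5 overlapping genes, two background choices.
p_small_background = ora_p_value(hits=5, query_size=50, pathway_size=40, background_size=1000)
p_large_background = ora_p_value(hits=5, query_size=50, pathway_size=40, background_size=2000)
print(f"background 1000: p = {p_small_background:.4g}")
print(f"background 2000: p = {p_large_background:.4g}")
```

With the overlap held fixed, the larger background lowers the expected number of overlapping genes, so the same observation looks more significant.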

Network-based approaches are becoming increasingly popular for drug discovery as they provide a systems-level overview of the mechanisms underlying disease pathophysiology. They have demonstrated significant early promise over other methods of biological data representation, such as in target discovery, side effect prediction and drug repurposing. In parallel, an explosion of -omics data for the deep characterization of biological systems routinely uncovers molecular signatures of disease for similar applications.

Despite available vaccinations, COVID-19 case numbers around the world are still growing, and effective medications against severe cases are lacking. In this work, we developed a machine learning model that predicts mortality for COVID-19 patients using data from the multi-center 'Lean European Open Survey on SARS-CoV-2-infected patients' (LEOSS) observational study (>100 active sites in Europe, primarily in Germany), resulting in an AUC of almost 80%. We showed that molecular mechanisms related to dementia, one of the relevant predictors in our model, intersect with those associated with COVID-19.

Motivation: The majority of biomedical knowledge is stored in structured databases or as unstructured text in scientific publications. This vast amount of information has led to numerous machine learning-based biological applications using either text through natural language processing (NLP) or structured data through knowledge graph embedding models. However, representations based on a single modality are inherently limited.

The utility of pathway signatures lies in their capability to determine whether a specific pathway or biological process is dysregulated in a given patient. These signatures have been widely used in machine learning (ML) methods for a variety of applications including precision medicine, drug repurposing, and drug discovery. In this work, we leverage highly predictive ML models for drug response simulation in individual patients by calibrating the pathway activity scores of disease samples.
