Publications by Charles T Hoyt | LitMetric

Publications by authors named "Charles T Hoyt"

Page 1 of 2

Eliater: a Python package for estimating outcomes of perturbations in biomolecular networks.

Sara Mohammad-Taheri Pruthvi Prakash Navada Charles Tapley Hoyt Jeremy Zucker Karen Sachs

Bioinformatics

September 2024

Summary: We introduce Eliater, a Python package for estimating the effect of perturbation of an upstream molecule on a downstream molecule in a biomolecular network. The estimation takes as input a biomolecular network, observational biomolecular data, and a perturbation of interest, and outputs an estimated quantitative effect of the perturbation. We showcase the functionalities of Eliater in a case study of Escherichia coli transcriptional regulatory network.

View Article and Find Full Text PDF

The O3 guidelines: open data, open code, and open infrastructure for sustainable curated scientific resources.

Charles Tapley Hoyt Benjamin M Gyori

Sci Data

May 2024

Curated resources that support scientific research often go out of date or become inaccessible. This can happen for several reasons including lack of continuing funding, the departure of key personnel, or changes in institutional priorities. We introduce the Open Data, Open Code, Open Infrastructure (O3) Guidelines as an actionable road map to creating and maintaining resources that are less susceptible to such external factors and can continue to be used and maintained by the community that they serve.

View Article and Find Full Text PDF

An open source knowledge graph ecosystem for the life sciences.

Tiffany J Callahan Ignacio J Tripodi Adrianne L Stefanski Luca Cappelletti Sanya B Taneja Charles Tapley Hoyt

Sci Data

April 2024

Article Synopsis

Translational research needs data from different levels of biological systems, but combining that data is tough for scientists.
New technologies help gather more data, but researchers struggle to organize all the information effectively.
PheKnowLator is a tool that helps scientists create customizable knowledge graphs easily, making it better for managing complex health information without slowing down their work.

View Article and Find Full Text PDF

The Human Phenotype Ontology in 2024: phenotypes around the world.

Michael A Gargano Nicolas Matentzoglu Ben Coleman Eunice B Addo-Lartey Anna V Anagnostopoulos Charles Tapley Hoyt

Nucleic Acids Res

January 2024

The Human Phenotype Ontology (HPO) is a widely used resource that comprehensively organizes and defines the phenotypic features of human disease, enabling computational inference and supporting genomic and phenotypic analyses through semantic similarity and machine learning algorithms. The HPO has widespread applications in clinical diagnostics and translational research, including genomic diagnostics, gene-disease discovery, and cohort analytics. In recent years, groups around the world have developed translations of the HPO from English to other languages, and the HPO browser has been internationalized, allowing users to view HPO term labels and in many cases synonyms and definitions in ten languages in addition to English.

View Article and Find Full Text PDF

Improving reproducibility and reusability in the Journal of Cheminformatics.

Charles Tapley Hoyt Barbara Zdrazil Rajarshi Guha Nina Jeliazkova Karina Martinez-Mayorga

J Cheminform

June 2023

View Article and Find Full Text PDF

Democratizing knowledge representation with BioCypher.

Sebastian Lobentanzer Patrick Aloy Jan Baumbach Balazs Bohar Vincent J Carey Charles Tapley Hoyt

Nat Biotechnol

August 2023

View Article and Find Full Text PDF

Prediction and curation of missing biomedical identifier mappings with Biomappings.

Charles Tapley Hoyt Amelia L Hoyt Benjamin M Gyori

Bioinformatics

April 2023

Motivation: Biomedical identifier resources (such as ontologies, taxonomies, and controlled vocabularies) commonly overlap in scope and contain equivalent entries under different identifiers. Maintaining mappings between these entries is crucial for interoperability and the integration of data and knowledge. However, there are substantial gaps in available mappings motivating their semi-automated curation.

View Article and Find Full Text PDF

Gilda: biomedical entity text normalization with machine-learned disambiguation as a service.

Benjamin M Gyori Charles Tapley Hoyt Albert Steppi

Bioinform Adv

May 2022

Summary: Gilda is a software tool and web service that implements a scored string matching algorithm for names and synonyms across entries in biomedical ontologies covering genes, proteins (and their families and complexes), small molecules, biological processes and diseases. Gilda integrates machine-learned disambiguation models to choose between ambiguous strings given relevant surrounding text as context, and supports species-prioritization in case of ambiguity.

Availability And Implementation: The Gilda web service is available at http://grounding.

View Article and Find Full Text PDF

Unifying the identification of biomedical entities with the Bioregistry.

Charles Tapley Hoyt Meghan Balk Tiffany J Callahan Daniel Domingo-Fernández Melissa A Haendel

Sci Data

November 2022

The standardized identification of biomedical entities is a cornerstone of interoperability, reuse, and data integration in the life sciences. Several registries have been developed to catalog resources maintaining identifiers for biomedical entities such as small molecules, proteins, cell lines, and clinical trials. However, existing registries have struggled to provide sufficient coverage and metadata standards that meet the evolving needs of modern life sciences researchers.

View Article and Find Full Text PDF

Ontology Development Kit: a toolkit for building, maintaining and standardizing biomedical ontologies.

Nicolas Matentzoglu Damien Goutte-Gattat Shawn Zheng Kai Tan James P Balhoff Seth Carbon Charles Tapley Hoyt

Database (Oxford)

October 2022

Similar to managing software packages, managing the ontology life cycle involves multiple complex workflows such as preparing releases, continuous quality control checking and dependency management. To manage these processes, a diverse set of tools is required, from command-line utilities to powerful ontology-engineering environmentsr. Particularly in the biomedical domain, which has developed a set of highly diverse yet inter-dependent ontologies, standardizing release practices and metadata and establishing shared quality standards are crucial to enable interoperability.

View Article and Find Full Text PDF

A review of biomedical datasets relating to drug discovery: a knowledge graph perspective.

Stephen Bonner Ian P Barrett Cheng Ye Rowan Swiers Ola Engkvist Charles Tapley Hoyt

Brief Bioinform

November 2022

Drug discovery and development is a complex and costly process. Machine learning approaches are being investigated to help improve the effectiveness and speed of multiple stages of the drug discovery pipeline. Of these, those that use Knowledge Graphs (KG) have promise in many tasks, including drug repurposing, drug toxicity prediction and target gene-disease prioritization.

View Article and Find Full Text PDF

PyBioPAX: biological pathway exchange in Python.

Benjamin M Gyori Charles Tapley Hoyt

J Open Source Softw

March 2022

View Article and Find Full Text PDF

Do-calculus enables estimation of causal effects in partially observed biomolecular pathways.

Sara Mohammad-Taheri Jeremy Zucker Charles Tapley Hoyt Karen Sachs Vartika Tewari

Bioinformatics

June 2022

Motivation: Estimating causal queries, such as changes in protein abundance in response to a perturbation, is a fundamental task in the analysis of biomolecular pathways. The estimation requires experimental measurements on the pathway components. However, in practice many pathway components are left unobserved (latent) because they are either unknown, or difficult to measure.

View Article and Find Full Text PDF

Integrating multi-omics data reveals function and therapeutic potential of deubiquitinating enzymes.

Laura M Doherty Caitlin E Mills Sarah A Boswell Xiaoxi Liu Charles Tapley Hoyt

Elife

June 2022

Deubiquitinating enzymes (DUBs), ~100 of which are found in human cells, are proteases that remove ubiquitin conjugates from proteins, thereby regulating protein turnover. They are involved in a wide range of cellular activities and are emerging therapeutic targets for cancer and other diseases. Drugs targeting USP1 and USP30 are in clinical development for cancer and kidney disease respectively.

View Article and Find Full Text PDF

A Simple Standard for Sharing Ontological Mappings (SSSOM).

Nicolas Matentzoglu James P Balhoff Susan M Bello Chris Bizon Matthew Brush Amelia L Hoyt Charles T Hoyt

Database (Oxford)

May 2022

Despite progress in the development of standards for describing and exchanging scientific information, the lack of easy-to-use standards for mapping between different representations of the same or similar objects in different databases poses a major impediment to data integration and interoperability. Mappings often lack the metadata needed to be correctly interpreted and applied. For example, are two terms equivalent or merely related? Are they narrow or broad matches? Or are they associated in some other way? Such relationships between the mapped terms are often not documented, which leads to incorrect assumptions and makes them hard to use in scenarios that require a high degree of precision (such as diagnostics or risk prediction).

View Article and Find Full Text PDF

STonKGs: a sophisticated transformer trained on biomedical text and knowledge graphs.

Helena Balabin Charles Tapley Hoyt Colin Birkenbihl Benjamin M Gyori John Bachman

Bioinformatics

March 2022

Motivation: The majority of biomedical knowledge is stored in structured databases or as unstructured text in scientific publications. This vast amount of information has led to numerous machine learning-based biological applications using either text through natural language processing (NLP) or structured data through knowledge graph embedding models. However, representations based on a single modality are inherently limited.

View Article and Find Full Text PDF

Bringing Light Into the Dark: A Large-Scale Evaluation of Knowledge Graph Embedding Models Under a Unified Framework.

Mehdi Ali Max Berrendorf Charles Tapley Hoyt Laurent Vermue Mikhail Galkin

IEEE Trans Pattern Anal Mach Intell

December 2022

The heterogeneity in recently published knowledge graph embedding models' implementations, training, and evaluation has made fair and thorough comparisons difficult. To assess the reproducibility of previously published results, we re-implemented and evaluated 21 models in the PyKEEN software package. In this paper, we outline which results could be reproduced with their reported hyper-parameters, which could only be reproduced with alternate hyper-parameters, and which could not be reproduced at all, as well as provide insight as to why this might be the case.

View Article and Find Full Text PDF

The role of metadata in reproducible computational research.

Jeremy Leipzig Daniel Nüst Charles Tapley Hoyt Karthik Ram Jane Greenberg

Patterns (N Y)

September 2021

Reproducible computational research (RCR) is the keystone of the scientific method for analyses, packaging the transformation of raw data to published results. In addition to its role in research integrity, improving the reproducibility of scientific studies can accelerate evaluation and reuse. This potential and wide support for the FAIR principles have motivated interest in metadata standards supporting reproducibility.

View Article and Find Full Text PDF

CLEP: a hybrid data- and knowledge-driven framework for generating patient representations.

Vinay Srinivas Bharadhwaj Mehdi Ali Colin Birkenbihl Sarah Mubeen Jens Lehmann Charles Tapley Hoyt

Bioinformatics

October 2021

Summary: As machine learning and artificial intelligence increasingly attain a larger number of applications in the biomedical domain, at their core, their utility depends on the data used to train them. Due to the complexity and high dimensionality of biomedical data, there is a need for approaches that combine prior knowledge around known biological interactions with patient data. Here, we present CLinical Embedding of Patients (CLEP), a novel approach that generates new patient representations by leveraging both prior knowledge and patient-level data.

View Article and Find Full Text PDF

A Systems Biology Approach for Hypothesizing the Effect of Genetic Variants on Neuroimaging Features in Alzheimer's Disease.

Sepehr Golriz Khatami Daniel Domingo-Fernández Sarah Mubeen Charles Tapley Hoyt Christine Robinson

J Alzheimers Dis

September 2021

Background: Neuroimaging markers provide quantitative insight into brain structure and function in neurodegenerative diseases, such as Alzheimer's disease, where we lack mechanistic insights to explain pathophysiology. These mechanisms are often mediated by genes and genetic variations and are often studied through the lens of genome-wide association studies. Linking these two disparate layers (i.

View Article and Find Full Text PDF

Leveraging Structured Biological Knowledge for Counterfactual Inference: A Case Study of Viral Pathogenesis.

Jeremy Zucker Kaushal Paneri Sara Mohammad-Taheri Somya Bhargava Pallavi Kolambkar Charles Tapley Hoyt

IEEE Trans Big Data

March 2021

Counterfactual inference is a useful tool for comparing outcomes of interventions on complex systems. It requires us to represent the system in form of a structural causal model, complete with a causal diagram, probabilistic assumptions on exogenous variables, and functional assignments. Specifying such models can be extremely difficult in practice.

View Article and Find Full Text PDF

GuiltyTargets: Prioritization of Novel Therapeutic Targets With Network Representation Learning.

Ozlem Muslu Charles Tapley Hoyt Mauricio Lacerda Martin Hofmann-Apitius Holger Frohlich

IEEE/ACM Trans Comput Biol Bioinform

February 2022

The majority of clinical trials fail due to low efficacy of investigated drugs, often resulting from a poor choice of target protein. Existing computational approaches aim to support target selection either via genetic evidence or by putting potential targets into the context of a disease specific network reconstruction. The purpose of this work was to investigate whether network representation learning techniques could be used to allow for a machine learning based prioritization of putative targets.

View Article and Find Full Text PDF

The Minimum Information about a Molecular Interaction CAusal STatement (MI2CAST).

Vasundra Touré Steven Vercruysse Marcio Luis Acencio Ruth C Lovering Sandra Orchard Charles Tapley Hoyt

Bioinformatics

April 2021

Motivation: A large variety of molecular interactions occurs between biomolecular components in cells. When a molecular interaction results in a regulatory effect, exerted by one component onto a downstream component, a so-called 'causal interaction' takes place. Causal interactions constitute the building blocks in our understanding of larger regulatory networks in cells.

View Article and Find Full Text PDF

PS4DR: a multimodal workflow for identification and prioritization of drugs based on pathway signatures.

Mohammad Asif Emon Daniel Domingo-Fernández Charles Tapley Hoyt Martin Hofmann-Apitius

BMC Bioinformatics

June 2020

Background: During the last decade, there has been a surge towards computational drug repositioning owing to constantly increasing -omics data in the biomedical research field. While numerous existing methods focus on the integration of heterogeneous data to propose candidate drugs, it is still challenging to substantiate their results with mechanistic insights of these candidate drugs. Therefore, there is a need for more innovative and efficient methods which can enable better integration of data and knowledge for drug repositioning.

View Article and Find Full Text PDF

Corrigendum: The Impact of Pathway Database Choice on Statistical Enrichment Analysis and Predictive Modeling.

Sarah Mubeen Charles Tapley Hoyt André Gemünd Martin Hofmann-Apitius Holger Fröhlich

Front Genet

April 2020

[This corrects the article DOI: 10.3389/fgene.2019.

View Article and Find Full Text PDF