Onto2Vec: joint vector-based representation of biological entities and their ontology-based annotations.

Bioinformatics

Computer, Electrical and Mathematical Sciences & Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia.

Published: July 2018

Motivation: Biological knowledge is widely represented in the form of ontology-based annotations: ontologies describe the phenomena assumed to exist within a domain, and the annotations associate a (kind of) biological entity with a set of phenomena within the domain. The structure and information contained in ontologies and their annotations make them valuable for developing machine learning, data analysis and knowledge extraction algorithms; notably, semantic similarity is widely used to identify relations between biological entities, and ontology-based annotations are frequently used as features in machine learning applications.

Results: We propose the Onto2Vec method, an approach to learn feature vectors for biological entities based on their annotations to biomedical ontologies. Our method can be applied to a wide range of bioinformatics research problems such as similarity-based prediction of interactions between proteins, classification of interaction types using supervised learning, or clustering. To evaluate Onto2Vec, we use the Gene Ontology (GO) and jointly produce dense vector representations of proteins, the GO classes to which they are annotated, and the axioms in GO that constrain these classes. First, we demonstrate that Onto2Vec-generated feature vectors can significantly improve prediction of protein-protein interactions in human and yeast. We then illustrate how Onto2Vec representations provide the means for constructing data-driven, trainable semantic similarity measures that can be used to identify particular relations between proteins. Finally, we use an unsupervised clustering approach to identify protein families based on their Enzyme Commission numbers. Our results demonstrate that Onto2Vec can generate high-quality feature vectors from biological entities and ontologies. Onto2Vec has the potential to significantly outperform the state of the art in several predictive applications in which ontologies are involved.
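As a rough illustration of what similarity-based prediction over learned feature vectors can look like, the sketch below ranks candidate interaction partners by cosine similarity. It is not the Onto2Vec implementation: the function names, the protein labels, and the three-dimensional toy vectors are all hypothetical, and cosine similarity is assumed here only as one common choice of vector similarity.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidate_partners(query, vectors):
    """Return (name, similarity) pairs for every entity except `query`, most similar first."""
    scores = [
        (name, cosine_similarity(vectors[query], vec))
        for name, vec in vectors.items()
        if name != query
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy vectors; real feature vectors would be learned and much higher-dimensional.
toy_vectors = {
    "protein_A": [1.0, 0.0, 0.0],
    "protein_B": [0.9, 0.1, 0.0],
    "protein_C": [0.0, 1.0, 0.0],
}

ranking = rank_candidate_partners("protein_A", toy_vectors)
for name, score in ranking:
    print(f"{name}\t{score:.3f}")
```

In this toy setting, a higher score would be read as stronger evidence for a candidate interaction; a supervised or trainable similarity measure, as mentioned in the abstract, would replace the fixed cosine function.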

Availability and Implementation: https://github.com/bio-ontology-research-group/onto2vec.

Supplementary Information: Supplementary data are available at Bioinformatics online.


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6022543
DOI: http://dx.doi.org/10.1093/bioinformatics/bty259

Publication Analysis

Top Keywords

biological entities: 16
ontology-based annotations: 12
feature vectors: 12
entities ontology-based: 8
machine learning: 8
semantic similarity: 8
identify relations: 8
vectors biological: 8
onto2vec: 6
biological: 6

Similar Publications

Biologic Downstaging Observed With a Pulmonary Collision Tumor.

Ann Thorac Surg Short Rep

September 2023

Department of Surgery, Keck School of Medicine, University of Southern California, Los Angeles, California.

Collision tumors are a rare entity in which 2 distinct neoplastic cellular populations invade each other and coalesce to form a single focal lesion. This case report describes a pulmonary collision tumor emerging from the rapid progression of 2 large-cell carcinoma lesions of the lung, including 1 nodule with clear cell features and another with basaloid features. The collision of these 2 histologically rare nodules resulted in the biologic downstaging of disease.


Human tumors are diverse in their natural history and response to treatment, which in part results from genetic and transcriptomic heterogeneity. In clinical practice, single-site needle biopsies are used to sample this diversity, but cancer biomarkers may be confounded by spatiogenomic heterogeneity within individual tumors. Here we investigate clonally expressed genes as a solution to the sampling bias problem by analyzing multiregion whole-exome and RNA sequencing data for 450 tumor regions from 184 patients with lung adenocarcinoma in the TRACERx study.


Some patients with neuromyelitis optica spectrum disorder (NMOSD)-like symptoms test negative for anti-aquaporin-4 (anti-AQP4) antibodies. Among them, a subset has antibodies targeting myelin oligodendrocyte glycoprotein (MOG), a condition now termed MOG antibody-associated disease (MOGAD). MOGAD shares features with NMOSD, like optic neuritis and myelitis, but differs in pathophysiology, clinical presentation, imaging findings, and biomarkers.


Vaccination against measles-mumps-rubella and rates of non-targeted infectious disease hospitalisations: Nationwide register-based cohort studies in Denmark, Finland, Norway, and Sweden.

J Infect

January 2025

Bandim Health Project, Research Unit OPEN, Department of Clinical Research, University of Southern Denmark, Odense C, Denmark; Department of Clinical Epidemiology, Aarhus University Hospital, Aarhus, Denmark.

Objectives: To investigate whether receipt of measles-mumps-rubella (MMR) vaccine following the third dose of diphtheria-tetanus-acellular pertussis (DTaP3) is associated with reduced rates of non-targeted infectious disease hospitalisations.

Methods: Register-based cohort study following 1,397,027 children born in Denmark, Finland, Norway, and Sweden until 2 years of age. Rates of infectious disease hospitalisations with a minimum of one overnight stay according to time-varying vaccination status were compared using Cox proportional hazards regression analysis with age as the underlying timescale and including multiple covariates.


In 2024, the U.S. Food and Drug Administration (FDA) approved a range of new drugs, including 32 new chemical entities (NCEs) and 18 new biological entities (NBEs).

