A Systematic Analysis of Term Reuse and Term Overlap across Biomedical Ontologies.

Semant Web

Stanford Center for Biomedical Informatics Research, Department of Medicine, Stanford University.

Published: January 2017

Reusing ontologies and their terms is a principle and best practice that most ontology development methodologies strongly encourage. Reuse promises to support semantic interoperability and to reduce engineering costs. In this paper, we present a descriptive study of the current extent of term reuse and overlap among biomedical ontologies. We use the corpus of biomedical ontologies stored in the BioPortal repository and analyze different types of reuse and overlap constructs. While we find an approximate term overlap of 25-31%, term reuse is below 9%, with most ontologies reusing fewer than 5% of their terms from a small set of popular ontologies. Clustering analysis shows that the terms reused by a common set of ontologies have >90% semantic similarity, hinting that ontology developers tend to reuse terms that are sibling or parent-child nodes. We validate this finding by analyzing the logs generated from a Protégé plugin that enables developers to reuse terms from BioPortal. We find that most reuse constructs were 2-level subtrees at the higher levels of the class hierarchy. We developed a Web application that visualizes reuse dependencies and overlap among ontologies and that proposes similar terms from BioPortal for a term of interest. We also identified a set of error patterns indicating that ontology developers did intend to reuse terms from other ontologies but were using different and sometimes incorrect representations. Our results point to the need for semi-automated tools that augment term reuse in the ontology engineering process through personalized recommendations.
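
The distinction between term reuse and term overlap can be illustrated with a minimal sketch. It rests on working definitions that are assumptions here, not taken from the abstract: a term counts as reused when the same identifier (IRI) appears in both ontologies, and as overlapping when a matching label appears in the other ontology under a different identifier. The `Term` class, the `normalize` helper, and the `example.org` IRIs are hypothetical, and the study's actual matching procedure may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    iri: str    # term identifier
    label: str  # human-readable label


def normalize(label: str) -> str:
    """Lowercase and collapse whitespace so trivially different labels match."""
    return " ".join(label.lower().split())


def reuse_and_overlap(a: list[Term], b: list[Term]) -> tuple[float, float]:
    """Return (reuse, overlap) as fractions of the terms in ontology `a`.

    Assumed definitions (illustrative only):
      reuse   = terms of `a` whose IRI also appears in `b`
      overlap = terms of `a` whose normalized label appears in `b`
                while their IRI does not
    """
    if not a:
        return 0.0, 0.0
    iris_b = {t.iri for t in b}
    labels_b = {normalize(t.label) for t in b}

    reused = sum(1 for t in a if t.iri in iris_b)
    overlapping = sum(
        1 for t in a
        if t.iri not in iris_b and normalize(t.label) in labels_b
    )
    return reused / len(a), overlapping / len(a)


# Hypothetical toy ontologies.
onto_a = [
    Term("http://example.org/A#1", "Heart"),
    Term("http://example.org/shared#2", "Lung"),
    Term("http://example.org/A#3", "Kidney"),
    Term("http://example.org/A#4", "Liver"),
]
onto_b = [
    Term("http://example.org/shared#2", "Lung"),
    Term("http://example.org/B#9", "heart "),
    Term("http://example.org/B#10", "Spleen"),
]

reuse, overlap = reuse_and_overlap(onto_a, onto_b)
print(f"reuse={reuse:.2f} overlap={overlap:.2f}")
```

In this toy case, "Lung" is reused because both ontologies share its IRI, while "Heart" only overlaps because the second ontology defines the same label under its own IRI.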

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5555235
DOI: http://dx.doi.org/10.3233/sw-160238

Publication Analysis

Top Keywords

term reuse: 16
biomedical ontologies: 12
reuse terms: 12
reuse: 11
term overlap: 8
overlap biomedical: 8
reuse overlap: 8
ontology developers: 8
terms bioportal: 8
term: 7

Similar Publications

Survival of Salmonella on Biodegradable Mulch, Landscape Fabric, and Plastic Mulch.

J Food Prot

December 2024

Department of Food Science and Technology, Virginia Tech, Blacksburg, VA, USA.

Ground covers are used in produce production to enhance plant growth and control diseases and pests. While various factors are considered when selecting commercial ground covers, food safety, particularly the survival of foodborne pathogens, is often overlooked. This study aimed to assess the survival of Salmonella on different ground covers, including biodegradable mulch, landscape fabric, and plastic mulch.

Everyday clinical care generates vast amounts of digital data. A broad range of actors are interested in reusing these data for various purposes. Such reuse of health data could support medical research, healthcare planning, technological innovation, and lead to increased financial revenue.

Depth weakens effects of long-term fertilization on dissolved organic matter chemodiversity in paddy soils.

Sci Total Environ

December 2024

State Key Laboratory of Soil and Sustainable Agriculture, Institute of Soil Science, Chinese Academy of Sciences, Nanjing 211135, China; Changshu National Agro-Ecosystem Observation and Research Station, Institute of Soil Science, Chinese Academy of Sciences, Nanjing 211135, China.

Dissolved organic matter (DOM) is pivotal for soil biogeochemical processes, soil fertility, and ecosystem stability. While numerous studies have investigated the impact of fertilization practices on DOM content along soil profiles, variations in DOM chemodiversity and the underlying factors across soil profiles under long-term fertilization regimes remain unclear. Using Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS) and high-throughput sequencing, this study investigated DOM composition characteristics and microbial community compositions across different soil layers (0-20, 20-40, 40-60, and 60-100 cm) in paddy soil under different long-term fertilization treatments, including Control (no fertilizer), NPK (mineral NPK fertilizer), NPKHS (NPK fertilizer with half straw return), and NPKS (NPK fertilizer with full straw return).

Reusing treated wastewater (TWW) for crop irrigation has been shown to provide environmental and economic benefits as well as drawbacks. This study was conducted using soils collected from a wastewater reuse facility in Tallahassee, FL, mainly to elucidate the long-term impact(s) of TWW irrigation on the soil microbiome and nutrient status. Approximately 890 ha of land have been spray-irrigated with TWW since the 1980s to grow fodder crops.

Sudden shock load events, which feature significant increases in the inflow quantities or concentrations at wastewater treatment plants (WWTPs), are a major threat to treated effluents meeting discharge quality standards. To aid real-time decision-making for stable WWTP operations, this study developed a probabilistic deep learning model comprising encoder-decoder long short-term memory (LSTM) networks with the added capacity to produce probability predictions, in order to enhance the robustness of real-time WWTP effluent quality prediction under such events. The developed probabilistic encoder-decoder LSTM (P-ED-LSTM) model was tested in an actual WWTP, where bihourly effluent quality prediction of total nitrogen was performed and compared with classical deep learning models, including LSTM, gated recurrent unit (GRU), and Transformer.
