Background: The National Cancer Institute (NCI) Thesaurus provides reference terminology for NCI and other systems. Previously, we proposed a hybrid prototype that uses lexical features and role definitions of concepts in non-lattice subgraphs to identify missing IS-A relations in the NCI Thesaurus. However, that work included no evaluation by domain experts. In this paper, we enhance the hybrid approach with a novel lexical feature: the roots of noun chunks within concept names. We also perform a formal evaluation of the enhanced approach.

Method: We first compute all non-lattice subgraphs in the NCI Thesaurus. We model each concept using its role definitions together with the words and roots of noun chunks within its name and its ancestors' names. We then perform subsumption testing on candidate concept pairs in the non-lattice subgraphs to automatically detect potentially missing IS-A relations. Domain experts evaluated the validity of these relations.
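The subsumption test can be illustrated with a minimal sketch. It assumes that each concept has already been reduced to a set of features (words, noun-chunk roots, and role definitions) and that a pair is flagged when one concept's features are a proper subset of the other's while no IS-A path links them. This is one plausible reading of the method, not the authors' implementation. The concepts, the feature strings, and the function names `ancestors` and `missing_isa_candidates` are all invented for illustration, and the computation of non-lattice subgraphs is not shown: the toy concept list simply stands in for the concepts of one such subgraph.

```python
from itertools import permutations

# Toy hierarchy: concept -> set of direct parents (existing IS-A relations).
# All names are invented; none are taken from the NCI Thesaurus.
PARENTS = {
    "device": set(),
    "red device": {"device"},
    "large device": {"device"},
    # The IS-A link to "red device" is deliberately left out.
    "large red device": {"large device"},
}

# Hypothetical precomputed feature sets: name words plus role definitions.
FEATURES = {
    "device": {"device"},
    "red device": {"red", "device", "role:color=red"},
    "large device": {"large", "device"},
    "large red device": {"large", "red", "device", "role:color=red"},
}


def ancestors(concept, parents):
    """Return every concept reachable from `concept` via existing IS-A links."""
    seen = set()
    stack = list(parents[concept])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen


def missing_isa_candidates(concepts, parents, features):
    """List (child, parent) pairs whose features suggest an unrecorded IS-A.

    A pair is reported when the parent's feature set is a proper subset of
    the child's and the parent is not already an ancestor of the child.
    """
    found = []
    for child, parent in permutations(concepts, 2):
        more_general = features[parent] < features[child]  # proper subset
        if more_general and parent not in ancestors(child, parents):
            found.append((child, parent))
    return sorted(found)


if __name__ == "__main__":
    for child, parent in missing_isa_candidates(list(PARENTS), PARENTS, FEATURES):
        print(f"candidate: {child} IS-A {parent}")
```

In this toy data the only pair reported is "large red device" IS-A "red device". Pairs already connected through the hierarchy are skipped, and any pair flagged this way is only a suggestion that still needs expert review.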

Results: We applied our approach to the 19.08d version of the NCI Thesaurus. It identified 55 potentially missing IS-A relations, all of which were reviewed by domain experts. Of these, 29 were confirmed as valid and have been incorporated into newer versions of the NCI Thesaurus, and 7 further revealed incorrect existing IS-A relations in the NCI Thesaurus.

Conclusions: The results show that our hybrid approach, which leverages lexical features and role definitions, is effective in identifying potentially missing IS-A relations in the NCI Thesaurus.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7737275
DOI: http://dx.doi.org/10.1186/s12911-020-01289-6

Similar Publications

Gene set enrichment is a mainstay of functional genomics, but it relies on gene function databases that are incomplete. Here we evaluate five large language models (LLMs) for their ability to discover the common functions represented by a gene set, supported by molecular rationale and a self-confidence assessment. For curated gene sets from Gene Ontology, GPT-4 suggests functions similar to the curated name in 73% of cases, with higher self-confidence predicting higher similarity.

Background: The accuracy of administrative codes to capture patients with both primary biliary cholangitis (PBC) and cirrhosis could be challenging because of the potential for incorrect coding due to the old nomenclature "Primary Biliary Cirrhosis." Therefore, the aim of this study was to examine the positive predictive value (PPV) of International Classification of Diseases (ICD) codes for PBC and cirrhosis.

Methods: This was a retrospective cohort study using data from the VA Corporate Data Warehouse.

Article Synopsis
  • The Geriatric Depression Scale (GDS-15), commonly used to gauge depression in older adults, yields a higher depression prevalence (34.2%) at a cutoff of ≥5 than the Structured Clinical Interview (SCID) does (14.8%).
  • An analysis of data from 14 studies involving over 3,600 participants found that GDS-15 with a cutoff of ≥8 aligns much more closely with SCID results, with only a minor difference (-0.3%).
  • While GDS-15 ≥5 greatly overestimates depression prevalence, the suggested cutoff of ≥8 might be more accurate but has too much variation to be reliably implemented; hence, validated diagnostic

This manuscript describes the development of a resource module that is part of a learning platform named 'NIGMS Sandbox for Cloud-based Learning' (https://github.com/NIGMS/NIGMS-Sandbox). The module delivers learning materials on Cloud-based Consensus Pathway Analysis in an interactive format that uses appropriate cloud resources for data access and analyses.
