Information retrieval on oncology knowledge base using recursive paraphrase lattice.

J Biomed Inform

Xcoo, Inc., Mitsuyama Bldg., 4F, 4-2-5, Hongo, Bunkyo-ku, Tokyo, Japan.

Published: April 2021

For annotation in cancer genomic medicine, oncologists must consult various knowledge bases worldwide and retrieve all information (e.g., drugs, clinical trials, and academic papers) related to a gene variant. However, searching these knowledge bases comprehensively is difficult because their terminologies for diseases, drugs, and genes contain multiple paraphrases, including abbreviations and foreign-language terms. In this paper, we propose a novel search method that considers deep paraphrases and helps oncologists retrieve essential annotation resources swiftly and effortlessly. Our method recursively finds paraphrases based on paraphrase corpora, expands a source document, and finally generates a paraphrase lattice. The proposed method also feeds back information about the paraphrases applied in a search, which is useful for selecting search results and formulating the query for the next search. The results of an experiment demonstrated that our method could retrieve important annotation information that could not be retrieved using a conventional search system or simple paraphrasing. Additionally, annotation experts evaluated our method and found it to be practical.
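The abstract does not give implementation details, so the following is only a minimal sketch of the general idea (recursive paraphrase expansion into a lattice, then search with feedback on the paraphrases applied). It assumes a toy paraphrase dictionary and a token-level lattice; the names `PARAPHRASES`, `match`, `build_lattice`, `search`, and the `max_depth` parameter are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical toy paraphrase corpus: source phrase -> equivalent phrases.
PARAPHRASES = {
    ("nsclc",): [("non-small", "cell", "lung", "cancer")],
    ("lung", "cancer"): [("lung", "carcinoma")],
    ("egfr",): [("epidermal", "growth", "factor", "receptor")],
}


def match(edges, start, phrase):
    """Return (end_node, rules_used) for every lattice path from `start` spelling `phrase`."""
    states = [(start, ())]
    for word in phrase:
        states = [
            (nxt, used + tuple(r for r in rules if r not in used))
            for node, used in states
            for w, nxt, rules in edges.get(node, [])
            if w == word
        ]
    return states


def build_lattice(tokens, paraphrases, max_depth=3):
    """Expand a tokenized document into a paraphrase lattice.

    edges[node] is a list of (word, next_node, rules), where `rules` records
    the chain of paraphrase rules that produced the edge (empty for original text).
    """
    edges = defaultdict(list)
    for i, tok in enumerate(tokens):
        edges[i].append((tok, i + 1, ()))
    fresh = len(tokens) + 1  # next unused node id
    for _ in range(max_depth):  # each pass can paraphrase the previous pass's output
        added = False
        for start in list(edges):
            for source, targets in paraphrases.items():
                for end, used in match(edges, start, source):
                    for target in targets:
                        # Skip if this paraphrase path is already in the lattice.
                        if any(e == end for e, _u in match(edges, start, target)):
                            continue
                        applied = used + ((" ".join(source), " ".join(target)),)
                        node = start
                        for k, word in enumerate(target):
                            last = k == len(target) - 1
                            nxt = end if last else fresh
                            edges[node].append((word, nxt, applied))
                            if not last:
                                fresh += 1
                            node = nxt
                        added = True
        if not added:
            break
    return edges


def search(edges, query):
    """Return, for each hit, the chain of paraphrase rules that made the match possible."""
    return [used for start in list(edges) for _end, used in match(edges, start, query)]


document = "nsclc patients with egfr mutation".split()
lattice = build_lattice(document, PARAPHRASES)
for applied in search(lattice, ("lung", "carcinoma")):
    print("matched via:", " -> ".join(f"{s} => {t}" for s, t in applied))
```

In this sketch the query "lung carcinoma" matches only because the second pass paraphrases the output of the first ("nsclc" to "non-small cell lung cancer", then "lung cancer" to "lung carcinoma"); with `max_depth=1` it finds nothing. The returned rule chain plays the role of the feedback on applied paraphrases that the abstract mentions.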

Source
http://dx.doi.org/10.1016/j.jbi.2021.103705

Publication Analysis

Top Keywords

paraphrase lattice: 8
knowledge bases: 8
search: 6
method: 5
retrieval oncology: 4
oncology knowledge: 4
knowledge base: 4
base recursive: 4
recursive paraphrase: 4
annotation: 4

Similar Publications

Word vector representations are a crucial part of natural language processing (NLP) and human-computer interaction. In this paper, we propose a novel word vector representation, Confusion2Vec, which is motivated by human speech production and perception and encodes representational ambiguity. Humans employ both acoustic similarity cues and contextual cues to decode information, and we focus on a model that incorporates both sources of information.
