Link Prediction on a Network of Co-occurring MeSH Terms: Towards Literature-based Discovery.

Andrej Kastrin, Thomas C Rindflesch, Dimitar Hristovski

Methods Inf Med

Andrej Kastrin, PhD, Faculty of Information Studies, Ljubljanska cesta 31A, SI-8000 Novo Mesto, Slovenia

Published: August 2016

Objectives: Literature-based discovery (LBD) is a text mining methodology for automatically generating research hypotheses from existing knowledge. We frame LBD as a classification problem on a graph of MeSH terms, and we employ unsupervised and supervised link prediction methods to predict previously unknown connections between biomedical concepts.
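The classification framing can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes two hypothetical snapshots of a co-occurrence graph (`edges_past` and `edges_future`, both invented here), treats every node pair that is unlinked in the earlier snapshot as a candidate, and labels it 1 if the link appears in the later snapshot and 0 otherwise.

```python
from itertools import combinations


def label_candidate_pairs(edges_past, edges_future):
    """Label each node pair unlinked in the past snapshot: 1 if linked later, else 0."""
    past = {frozenset(e) for e in edges_past}
    future = {frozenset(e) for e in edges_future}
    nodes = sorted({n for e in past for n in e})
    labels = {}
    for u, v in combinations(nodes, 2):
        pair = frozenset((u, v))
        if pair not in past:  # only pairs with no existing link are candidates
            labels[(u, v)] = int(pair in future)
    return labels


# Toy snapshots (hypothetical): the link A-C forms between the two periods.
edges_past = [("A", "B"), ("B", "C"), ("C", "D")]
edges_future = edges_past + [("A", "C")]
labels = label_candidate_pairs(edges_past, edges_future)
```

Enumerating all pairs is quadratic in the number of nodes, so a graph of realistic size would presumably need candidate sampling; the abstract does not say how candidates were selected.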

Methods: We evaluated the effectiveness of link prediction through a series of experiments on a MeSH network that records the history of link formation between biomedical concepts. We performed link prediction using four proximity measures: common neighbors (CN), the Jaccard coefficient (JC), the Adamic/Adar index (AA), and preferential attachment (PA). Our approach relies on the assumption that similar nodes are more likely to establish a link in the future.
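For reference, the four proximity measures have standard textbook definitions, sketched below for an undirected graph stored as a dictionary of neighbor sets. The function name `proximity_scores` and the toy graph are assumptions made for illustration; the abstract does not describe the authors' implementation.

```python
import math


def proximity_scores(adj, u, v):
    """Compute CN, JC, AA and PA for the node pair (u, v).

    adj maps each node to the set of its neighbors (undirected graph).
    """
    nu, nv = adj[u], adj[v]
    common = nu & nv
    union = nu | nv
    return {
        # CN: number of shared neighbors
        "CN": len(common),
        # JC: shared neighbors divided by all distinct neighbors
        "JC": len(common) / len(union) if union else 0.0,
        # AA: shared neighbors weighted by 1 / log(degree), so rare ones count more
        "AA": sum(1.0 / math.log(len(adj[z])) for z in common if len(adj[z]) > 1),
        # PA: product of the two node degrees
        "PA": len(nu) * len(nv),
    }


# Toy graph (hypothetical): A and D are unlinked but share neighbors B and C.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"B", "C"},
}
scores = proximity_scores(adj, "A", "D")
```

In the unsupervised setting, each measure is used on its own to rank candidate pairs, with higher scores read as a higher likelihood of a future link.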

Results: In the unsupervised approach, the AA measure achieved the best performance in terms of area under the ROC curve (AUC = 0.76), followed by CN, JC, and PA. In the supervised approach, we evaluated whether the proximity measures can be combined into a model of link formation that uses all four predictors. We applied various classifiers, including decision trees, k-nearest neighbors, logistic regression, multilayer perceptron, naïve Bayes, and random forests. The random forest classifier achieved the best performance (AUC = 0.87).
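The AUC values reported here can be read as the probability that a randomly chosen pair that later formed a link is scored higher than a randomly chosen pair that did not. A minimal sketch of that pairwise definition follows; the function name `auc` and the toy scores are illustrative assumptions, and ties are counted as half a win.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive-negative pairs ranked correctly; a tie counts as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical): proximity scores for five candidate pairs
# and whether each pair later formed a link.
scores = [3, 2, 2, 1, 0]
labels = [1, 1, 0, 0, 0]
value = auc(scores, labels)
```

The same function applies unchanged to the supervised setting if `scores` holds a classifier's predicted probabilities instead of a single proximity measure.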

Conclusions: The link prediction approach proved to be effective for LBD. Supervised statistical learning approaches clearly outperformed the unsupervised approach to link prediction.

DOI: http://dx.doi.org/10.3414/ME15-01-0108

Top Keywords: link prediction, link, MeSH terms, literature-based discovery, link formation, proximity measures, unsupervised approach, best performance, prediction, approach
