TextRank Keyword Extraction Algorithm Using Word Vector Clustering Based on Rough Data-Deduction.

Ning Zhou, Wenqian Shi, Renyu Liang, Na Zhong

Comput Intell Neurosci

School of Electronics and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China.

Published: February 2022

When the graph-based TextRank algorithm constructs the edges of its word graph, the co-occurrence window rule considers only the relationships between local terms, and relying solely on the information in the document itself is limiting. To address these problems, an improved TextRank keyword extraction algorithm, RDD-WRank, is proposed that combines rough data reasoning with word vector clustering. First, the algorithm uses rough data reasoning to mine the associations between candidate keywords, which expands the search scope and makes the results more comprehensive. Then, drawing on the Wikipedia online open knowledge base, word embedding technology (Word2Vec) is integrated into the improved algorithm, and the word vectors of the TextRank lexical graph nodes are clustered to adjust the voting importance of the nodes in each cluster. Experimental comparisons with the traditional TextRank algorithm and with TextRank combined with Word2Vec show that the improved algorithm significantly improves extraction accuracy, which supports the idea that rough data reasoning can effectively improve keyword extraction performance.
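
For orientation, the baseline that the abstract refers to (TextRank over a co-occurrence window graph) can be sketched as follows. This is a minimal illustration of plain, unweighted TextRank only, not the RDD-WRank method: the rough data reasoning and Word2Vec clustering steps are not reproduced here. The function names, the window size, the damping factor of 0.85, and the fixed iteration count are assumptions chosen for the example.

```python
from collections import defaultdict


def build_cooccurrence_graph(tokens, window=2):
    """Link every pair of distinct tokens at most `window` positions apart."""
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            other = tokens[j]
            if other != word:
                # Undirected edge: only local (in-window) relations are captured.
                graph[word].add(other)
                graph[other].add(word)
    return graph


def textrank(graph, damping=0.85, iterations=50):
    """Unweighted TextRank: each node splits its score evenly among neighbours."""
    scores = {node: 1.0 for node in graph}
    for _ in range(iterations):
        new_scores = {}
        for node in graph:
            incoming = sum(scores[nb] / len(graph[nb]) for nb in graph[node])
            new_scores[node] = (1 - damping) + damping * incoming
        scores = new_scores
    return scores


def top_keywords(tokens, k=3, window=2):
    """Return the k highest-scoring candidate words (hypothetical helper)."""
    graph = build_cooccurrence_graph(tokens, window)
    scores = textrank(graph)
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

Because edges are created only inside the window, two related terms that never appear close together receive no edge at all, which is the limitation the abstract describes.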

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8808205
DOI: http://dx.doi.org/10.1155/2022/5649994

