A novel corpus-based computing method for handling critical word-ranking issues: An example of COVID-19 research articles.

Int J Intell Syst

Department of Management Sciences, R.O.C. Military Academy, Kaohsiung, Taiwan.

Published: July 2021

A corpus is a large body of structured textual data that is stored and processed electronically. It is usually combined with statistics, machine learning algorithms, or artificial intelligence (AI) technologies to explore the semantic relationships between lexical units, and it is beneficial when applied to language learning, information processing, translation, and so forth. In the face of a novel disease such as COVID-19, establishing a medical-specific corpus will enhance frontline medical personnel's information acquisition efficiency, guiding them toward the right approaches to respond to and prevent the disease. To effectively retrieve critical messages from the corpus, handling word-ranking issues appropriately is crucial. However, traditional frequency-based approaches may introduce bias in handling word-ranking issues because they neither optimize the corpus nor jointly take words' frequency dispersion and concentration criteria into consideration. Thus, this paper develops a novel corpus-based approach that combines corpus software with the Hirsch index (H-index) algorithm to handle the aforementioned issues simultaneously, making word-ranking processes more accurate. This paper compiled 100 COVID-19-related research articles as an empirical example of the target corpus. To verify the proposed approach, this study compared its results with those of two traditional frequency-based approaches. The results indicate that the proposed approach can refine the corpus and simultaneously compute words' frequency dispersion and concentration criteria in handling word-ranking issues.
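The abstract does not spell out how the H-index is applied to words, so the following is only a minimal sketch under stated assumptions: a word is given index h when it occurs at least h times in each of at least h articles, which rewards words that are both frequent (concentration) and spread across articles (dispersion). The function names `h_index` and `rank_words`, the whitespace tokenizer, and the stopword filter standing in for corpus refinement are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def h_index(counts):
    """Return the largest h such that at least h values in counts are >= h."""
    h = 0
    for rank, count in enumerate(sorted(counts, reverse=True), start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def rank_words(documents, stopwords=frozenset()):
    """Rank words by H-index over per-article frequencies.

    Assumption: dropping stopwords is a stand-in for the corpus
    refinement step; the paper's actual procedure may differ.
    Returns (word, h_index, total_frequency) tuples, best first.
    """
    per_word = {}
    for doc in documents:
        tokens = [t for t in doc.lower().split() if t not in stopwords]
        # Record how often each word occurs in this one article.
        for word, n in Counter(tokens).items():
            per_word.setdefault(word, []).append(n)
    scored = [(w, h_index(c), sum(c)) for w, c in per_word.items()]
    # Sort by H-index, then total frequency, then alphabetically for ties.
    return sorted(scored, key=lambda item: (-item[1], -item[2], item[0]))
```

Under this reading, a word that appears many times in a single article still scores only 1, whereas a plain frequency count would rank it highly; that difference is the kind of bias the abstract attributes to frequency-based ranking.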

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8207067
http://dx.doi.org/10.1002/int.22413

Publication Analysis

Top Keywords

word-ranking issues: 16
handling word-ranking: 12
proposed approach: 12
novel corpus-based: 8
novel disease: 8
traditional frequency-based: 8
frequency-based approaches: 8
words' frequency: 8
frequency dispersion: 8
dispersion concentration: 8
