Automatic classification of text documents into a set of categories has many applications. Among them, the automatic classification of biomedical literature stands out as particularly important. Biomedical staff and researchers have to deal with a large volume of literature in their daily activities, so a system that gives simple and effective access to documents of interest would be useful; this requires that the documents be sorted according to some criteria, that is, classified. Documents to classify are usually represented following the bag-of-words (BoW) paradigm: features are the words in the text, which therefore suffer from synonymy and polysemy, and their weights are based solely on frequency of occurrence. This paper presents an empirical study of the efficiency of a classifier that leverages encyclopedic background knowledge, specifically Wikipedia, to create bag-of-concepts (BoC) representations of documents, where a concept is understood as a "unit of meaning", thereby tackling synonymy and polysemy. In addition, the weighting of concepts is based on their semantic relevance in the text. To evaluate the proposal, experiments were conducted with OHSUMED, one of the corpora commonly used for evaluating classification and retrieval of biomedical information, and with UVigoMED, a purpose-built corpus of MEDLINE biomedical abstracts. The results show that the Wikipedia-based bag-of-concepts representation outperforms the classical bag-of-words representation by up to 157% in the single-label classification problem and up to 100% in the multi-label problem for the OHSUMED corpus, and by up to 122% in the single-label problem and up to 155% in the multi-label problem for the UVigoMED corpus.
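The contrast between the two representations can be illustrated with a minimal sketch. This is not the paper's classifier or its Wikipedia-based annotator: the lexicon `TOY_CONCEPTS`, the function names, and the two example sentences are all hypothetical, and concepts are weighted here by raw counts rather than by semantic relevance. It only shows why two documents that share no words can still share the same concepts.

```python
import re
from collections import Counter

# Hypothetical hand-made lexicon standing in for a Wikipedia-based annotator.
# It maps surface forms (synonyms) to a single concept identifier.
TOY_CONCEPTS = {
    "heart attack": "Myocardial_infarction",
    "myocardial infarction": "Myocardial_infarction",
    "aspirin": "Aspirin",
    "acetylsalicylic acid": "Aspirin",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(text):
    """Classical BoW: each distinct word is a feature weighted by its frequency."""
    return Counter(tokenize(text))


def bag_of_concepts(text, lexicon=TOY_CONCEPTS, max_len=3):
    """Toy BoC: greedily match the longest known phrase and count its concept."""
    tokens = tokenize(text)
    bag = Counter()
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in lexicon:
                bag[lexicon[phrase]] += 1
                i += n  # skip past the matched phrase
                break
        else:
            i += 1  # no known phrase starts at this token
    return bag


doc_a = "Aspirin after a heart attack."
doc_b = "Acetylsalicylic acid following myocardial infarction."

shared_words = set(bag_of_words(doc_a)) & set(bag_of_words(doc_b))
shared_concepts = set(bag_of_concepts(doc_a)) & set(bag_of_concepts(doc_b))
print("shared words:", sorted(shared_words))
print("shared concepts:", sorted(shared_concepts))
```

In this toy case the two sentences have no words in common, yet both map to the same two concepts, which is the synonymy problem the abstract refers to. Polysemy is not handled by a lookup table like this one; resolving it requires context-aware disambiguation.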


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4592155
DOI: http://dx.doi.org/10.7717/peerj.1279

Publication Analysis

Top Keywords

biomedical literature: 8
wikipedia-based bag-of-concepts: 8
automatic classification: 8
single-label classification: 8
classification problem: 8
multi-label problem: 8
classification: 7
biomedical: 5
documents: 5
literature classification: 4

