With the advent and improvement of ontological dictionaries (WordNet, BabelNet), synset-based text representations are gaining popularity in classification tasks. More recently, ontological dictionaries have been used to reduce dimensionality in this kind of representation (e.g., Semantic Dimensionality Reduction System (SDRS) (Vélez de Mendizabal et al., 2020)). These approaches combine semantically related columns by taking advantage of semantic information extracted from ontological dictionaries. Their main advantage is that they not only eliminate features but can also combine them, minimizing (low-loss) or avoiding (lossless) the loss of information. The most recent (and accurate) techniques in this group use evolutionary algorithms to find how many features can be grouped so as to reduce the false positive (FP) and false negative (FN) errors obtained. The main limitation of these evolutionary schemes is the computational cost of the optimization algorithms. The contribution of this study is a new lossless feature reduction scheme exploiting information from ontological dictionaries, which achieves slightly better accuracy (especially in FP errors) than optimization-based approaches while using far fewer computational resources. Instead of using computationally expensive evolutionary algorithms, our proposal determines whether two columns (synsets) can be combined by observing whether the instances in a dataset (i.e., the training dataset) that contain these synsets are mostly of the same class. The study includes experiments using three datasets and a detailed comparison with two previous optimization-based approaches.
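The class-agreement test described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 0.9 threshold, the rule that both synsets must share the same majority class, and the synset identifiers in the example are all assumptions made for illustration. The check that the two synsets are semantically related in the ontological dictionary is not shown.

```python
from collections import Counter


def class_purity(rows, labels, synset):
    """Majority class, and its share, among instances containing `synset`."""
    counts = Counter(
        label for row, label in zip(rows, labels) if row.get(synset, 0) > 0
    )
    total = sum(counts.values())
    if total == 0:
        return None, 0.0
    majority, n = counts.most_common(1)[0]
    return majority, n / total


def can_merge(rows, labels, syn_a, syn_b, threshold=0.9):
    """Assumed rule: both synsets appear mostly in the same class.

    The threshold is a hypothetical parameter, not taken from the paper.
    """
    cls_a, share_a = class_purity(rows, labels, syn_a)
    cls_b, share_b = class_purity(rows, labels, syn_b)
    same_class = cls_a is not None and cls_a == cls_b
    return same_class and min(share_a, share_b) >= threshold


def merge_columns(rows, syn_a, syn_b, merged_name):
    """Replace two synset columns with one column holding their summed counts."""
    merged = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k not in (syn_a, syn_b)}
        total = row.get(syn_a, 0) + row.get(syn_b, 0)
        if total:
            new_row[merged_name] = total
        merged.append(new_row)
    return merged


# Toy training set: each row maps (hypothetical) synset ids to counts.
rows = [
    {"bn:viagra": 2, "bn:drug": 1},
    {"bn:drug": 1, "bn:offer": 1},
    {"bn:viagra": 1},
    {"bn:meeting": 1, "bn:offer": 1},
]
labels = ["spam", "spam", "spam", "ham"]

if can_merge(rows, labels, "bn:viagra", "bn:drug"):
    reduced = merge_columns(rows, "bn:viagra", "bn:drug", "bn:merged")
```

In this toy data both `bn:viagra` and `bn:drug` occur only in spam instances, so the two columns are merged, while `bn:offer` and `bn:meeting` are left apart because their instances do not agree on a class. A single pass over the training data is enough for this check, which is consistent with the abstract's claim of lower computational cost than an evolutionary search.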
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11323001 | PMC |
| http://dx.doi.org/10.7717/peerj-cs.2206 | DOI Listing |
PeerJ Comput Sci
July 2024
CINBIO - Biomedical Research Centre, CINBIO, Vigo, Pontevedra, Spain.
J Biomed Semantics
October 2021
Universidad de la Republica, Julio Herrera y Reissig 565, Montevideo, Uruguay.
Background: Medical experts in the domain of Diabetes Mellitus (DM) acquire specific knowledge from diabetic patients through monitoring and interaction. This allows them to know the disease and information about other conditions or comorbidities, treatments, and typical consequences of the Mexican population. This indicates that an expert in a domain knows technical information about the domain and contextual factors that interact with it in the real world, contributing to new knowledge generation.
BMC Med Inform Decis Mak
December 2020
The University of Texas Health Science Center at Houston, Houston, TX, USA.
Background: The Kentucky Cancer Registry (KCR) is a central cancer registry for the state of Kentucky that receives data about incident cancer cases from all healthcare facilities in the state within 6 months of diagnosis. Similar to all other U.S.
J Biomed Semantics
June 2014
Research Center for Service Science, School of Knowledge Science, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Nomi, Ishikawa, Japan.
Background: Recently, exchanging data and information has become a significant challenge in medicine. Such data include abnormal states. Establishing a unified representation framework of abnormal states can be a difficult task because of the diverse and heterogeneous nature of these states.
Front Physiol
October 2012
Genetic Resources Program, Centro Internacional de Mejoramiento de Maiz y Trigo Texcoco, Edo. de México, Mexico.