Real-World Data Difficulty Estimation with the Use of Entropy.

Przemysław Juszczuk, Jan Kozak, Grzegorz Dziczkowski, Szymon Głowania, Tomasz Jach, Barbara Probierz

Entropy (Basel)

Faculty of Informatics and Communication, Department of Machine Learning, University of Economics in Katowice, 1 Maja 50, 40-287 Katowice, Poland.

Published: December 2021

In the era of the Internet of Things and big data, we are faced with managing a flood of information. The complexity and amount of data presented to the decision-maker are enormous, and existing methods often fail to derive nonredundant information quickly. Thus, selecting the most satisfactory set of solutions is often a struggle. This article investigates the possibility of using an entropy measure as an indicator of data difficulty. To do so, we focus on real-world data covering various fields: markets (the real estate market and financial markets), sports data, fake news data, and more. The problem is twofold. First, since we deal with unprocessed, inconsistent data, additional preprocessing is necessary. Second, we use the entropy-based measure to capture the nonredundant, noncorrelated core information in the data. We apply well-known classification algorithms to assess the quality of solutions obtained from the initially preprocessed data and from the information indicated by the entropy measure. Finally, the best 25% of attributes (in the sense of the entropy measure) are selected, the whole classification procedure is performed once again, and the results are compared.
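
The abstract does not state which entropy-based measure ranks the attributes, so the following is only a minimal sketch under stated assumptions, not the authors' procedure. It assumes discrete attributes, scores each one by information gain (the reduction in Shannon entropy of the class label), and keeps the top 25%. The function names `shannon_entropy`, `information_gain`, and `select_top_fraction`, and the toy data, are hypothetical.

```python
import math
from collections import Counter, defaultdict


def shannon_entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(column, labels):
    """Reduction in label entropy after grouping rows by a discrete attribute.

    Assumption: information gain stands in for the paper's entropy measure.
    """
    groups = defaultdict(list)
    for value, label in zip(column, labels):
        groups[value].append(label)
    total = len(labels)
    conditional = sum(len(g) / total * shannon_entropy(g) for g in groups.values())
    return shannon_entropy(labels) - conditional


def select_top_fraction(columns, labels, fraction=0.25):
    """Rank attributes by information gain and keep the top `fraction` (at least one)."""
    scores = {name: information_gain(col, labels) for name, col in columns.items()}
    keep = max(1, math.ceil(fraction * len(scores)))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:keep], scores


# Toy data (hypothetical): four discrete attributes, binary class label.
labels = [0, 0, 1, 1]
columns = {
    "a": [0, 0, 1, 1],  # determines the label exactly
    "b": [0, 1, 0, 1],  # independent of the label
    "c": [0, 0, 0, 0],  # constant, carries no information
    "d": [0, 0, 0, 1],  # partially informative
}
selected, scores = select_top_fraction(columns, labels)
print(selected, scores)
```

In a workflow like the one described, a classifier would be trained once on all preprocessed attributes and once on the `selected` subset, and the two results compared. Continuous attributes would need to be discretized before this kind of scoring applies.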

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8700715
DOI: http://dx.doi.org/10.3390/e23121621
