A survey on text classification: Practical perspectives on the Italian language.

Andrea Gasparetto, Alessandro Zangari, Matteo Marcuzzo, Andrea Albarelli

PLoS One

Department of Environmental Sciences, Informatics and Statistics, Ca' Foscari University, Venice, Italy.

Published: July 2022

Text classification methods have improved at an unprecedented pace over the last decade, driven by the success of deep learning. Historically, state-of-the-art approaches have been developed for and benchmarked against English datasets, while other languages have had to catch up and contend with their own linguistic challenges. This paper offers a survey with a practical and linguistic focus, showcasing the complications of applying modern text classification algorithms to languages other than English. We approach the subject from the perspective of the Italian language, discussing in detail the scarcity of task-specific datasets and the computational cost of modern approaches. We substantiate this with an extensively researched list of available datasets in Italian, which we compare against a similarly compiled list for French. To simulate a real-world practical scenario, we apply a number of representative methods to custom-tailored multilabel classification datasets in Italian, French, and English. We conclude by discussing results, future challenges, and research directions from a linguistically inclusive perspective.

Full text:
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9258888
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0270904

