Dataset for multimodal fake news detection and verification tasks.

Data Brief

Department of Computer Science, University of Pisa, Largo Bruno Pontecorvo, 3, 56127, Pisa, Italy.

Published: June 2024

The proliferation of online disinformation and fake news, particularly in the context of breaking news events, demands the development of effective detection mechanisms. While textual content remains the predominant medium for disseminating misleading information, other modalities play a growing role in online outlets and on social media platforms. However, multimodal datasets, which combine modalities such as text and images, are still uncommon, especially in low-resource languages. This study addresses this gap by releasing a dataset tailored for multimodal fake news detection in the Italian language. The dataset was originally employed in a shared task on the Italian language. It is divided into two data subsets, each corresponding to a distinct sub-task. In sub-task 1, the goal is to assess the effectiveness of multimodal fake news detection systems. Sub-task 2 examines the interplay between text and images, specifically how these modalities mutually influence the interpretation of content when distinguishing between fake and real news. Both sub-tasks were framed as classification problems. The dataset consists of social media posts and news articles. After collection, the data were labeled via crowdsourcing. Annotators were provided with external knowledge about the topic of the news to be labeled, enhancing their ability to discriminate between fake and real news. The data subsets for sub-task 1 and sub-task 2 consist of 913 and 1350 items, respectively, encompassing newspaper articles and tweets.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11070666
DOI: http://dx.doi.org/10.1016/j.dib.2024.110440

Publication Analysis

Top Keywords

fake news: 16
multimodal fake: 12
news detection: 12
news: 9
social media: 8
italian language: 8
language dataset: 8
data subsets: 8
sub-task sub-task: 8
fake real: 8
