Detection and identification of misinformation and fake news is a complex problem that intersects several disciplines, ranging from sociology to computer science and mathematics. In this work, we focus on social media analyzing characteristics that are independent of the text language (language-independent) and social context (location-independent) and common to most social media, not only Twitter as mostly analyzed in the literature. Specifically, we analyze temporal and structural characteristics of information flow in the social networks and we evaluate the importance and effect of two different types of features in the detection process of fake rumors. Specifically, we extract epidemiological features exploiting epidemiological models for spreading false rumors; furthermore, we extract graph-based features from the graph structure of the information cascade of the social graph. Using these features, we evaluate them for fake rumor detection with 3 configurations: (i) using only epidemiological features, (ii) using only graph-based features, and (iii) using the combination of epidemiological and graph-based features. Evaluation is performed with a Gradient Boosting classifier on two benchmark fake rumor detection datasets. Our results demonstrate that epidemiological models fit rumor propagation well, while graph-based features lead to more effective classification of rumors; the combination of epidemiological and graph-based features leads to improved performance.

Download full-text PDF

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9093217PMC
http://dx.doi.org/10.3389/frai.2022.734347DOI Listing

Publication Analysis

Top Keywords

graph-based features
20
fake rumor
12
rumor detection
12
features
9
social media
8
rumors extract
8
epidemiological features
8
epidemiological models
8
combination epidemiological
8
epidemiological graph-based
8

Similar Publications

During the Covid-19 pandemic, the widespread use of social media platforms has facilitated the dissemination of information, fake news, and propaganda, serving as a vital source of self-reported symptoms related to Covid-19. Existing graph-based models, such as Graph Neural Networks (GNNs), have achieved notable success in Natural Language Processing (NLP). However, utilizing GNN-based models for propaganda detection remains challenging because of the challenges related to mining distinct word interactions and storing nonconsecutive and broad contextual data.

View Article and Find Full Text PDF

With the increasing availability of high-quality genome assemblies, pangenome graphs emerged as a new paradigm in the genomics field for identifying, encoding, and presenting genomic variation at both population and species levels. However, it remains challenging to truly dissect and interpret pangenome graphs via biologically informative visualization. To facilitate better exploration and understanding of pangenome graphs towards novel biological insights, here we present a web-based interactive Visualization and interpretation framework for linear-Reference-projected Pangenome Graphs (VRPG).

View Article and Find Full Text PDF

DTI-MHAPR: optimized drug-target interaction prediction via PCA-enhanced features and heterogeneous graph attention networks.

BMC Bioinformatics

January 2025

School of Information and Artificial Intelligence, Anhui Agricultural University, Changjiang West Road, Hefei, 230036, Anhui, China.

Drug-target interactions (DTIs) are pivotal in drug discovery and development, and their accurate identification can significantly expedite the process. Numerous DTI prediction methods have emerged, yet many fail to fully harness the feature information of drugs and targets or address the issue of feature redundancy. We aim to refine DTI prediction accuracy by eliminating redundant features and capitalizing on the node topological structure to enhance feature extraction.

View Article and Find Full Text PDF

Background: Deep learning (DL) has set new standards in cancer diagnosis, significantly enhancing the accuracy of automated classification of whole slide images (WSIs) derived from biopsied tissue samples. To enable DL models to process these large images, WSIs are typically divided into thousands of smaller tiles, each containing 10-50 cells. Multiple Instance Learning (MIL) is a commonly used approach, where WSIs are treated as bags comprising numerous tiles (instances) and only bag-level labels are provided during training.

View Article and Find Full Text PDF

The class imbalance problem is one of the difficult factors affecting the performance of traditional classifiers. The oversampling technique is the most common way to solve the class imbalance problem. They alleviate the performance impact of the class imbalance problem on traditional machine learning by augmenting minority instance feature representation.

View Article and Find Full Text PDF

Want AI Summaries of new PubMed Abstracts delivered to your In-box?

Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!