Multi-feature fusion framework for sarcasm identification on Twitter data: A machine learning based approach.

PLoS One

Faculty of Computer Science and Information Technology, Department of Information Systems, University of Malaya, Kuala Lumpur, Malaysia.

Published: November 2021

Sarcasm is the main reason behind the faulty classification of tweets. It poses a challenge for natural language processing (NLP) because it obscures people's actual sentiment. Various feature engineering techniques have been investigated for the automatic detection of sarcasm. However, most related techniques concentrate only on the content-based features of a sarcastic expression and ignore the contextual information, which leads to a loss of the semantics of words in the sarcastic expression. Another drawback is the sparsity of the training data: because microblog posts are limited in length, most entries in the bag-of-words (BoW) feature vector constructed for each sample are null. To address these problems, a Multi-feature Fusion Framework is proposed that uses two classification stages. In stage one, a classifier is built on the lexical feature only, extracted with the BoW technique and trained with five standard classifiers (SVM, DT, KNN, LR, and RF) to predict the sarcastic tendency. In stage two, the resulting lexical sarcastic tendency feature is fused with eight other proposed features that model the context, and the fused features are used to obtain the final prediction. The effectiveness of the framework is tested with several experimental analyses of classifier performance. The evaluation shows that the classification models built on the proposed feature fusion reached a precision of 0.947 with a Random Forest classifier. Finally, the results were compared with those of three baseline approaches, and the comparison shows the significance of the proposed framework.
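
The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a handful of invented toy tweets, uses a small hand-written logistic regression (one of the five classifier families named above) for both stages so that only the Python standard library is needed, and replaces the paper's eight additional features, which the abstract does not list, with two hypothetical stand-ins (exclamation-mark count and all-caps word count). For brevity, stage two is trained on in-sample stage-one outputs; a real experiment would use held-out predictions to avoid leakage.

```python
import math
import re
from collections import Counter

# Toy labelled tweets (1 = sarcastic, 0 = not sarcastic), invented for illustration.
TWEETS = [
    ("Oh GREAT, another Monday morning!!", 1),
    ("I just LOVE waiting in traffic!", 1),
    ("Wow, what a fantastic delay. Again!!", 1),
    ("The concert last night was great", 0),
    ("I love this new phone", 0),
    ("Traffic was light this morning", 0),
]
texts = [t for t, _ in TWEETS]
labels = [y for _, y in TWEETS]

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

# Bag-of-words: one count per vocabulary word, so short tweets give mostly zeros.
vocab = sorted({tok for t in texts for tok in tokenize(t)})

def bow_vector(text):
    counts = Counter(tokenize(text))
    return [counts[tok] for tok in vocab]

def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to keep math.exp in range
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, epochs=200, lr=0.5):
    """Plain stochastic gradient descent on the logistic loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict_proba(model, x):
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def context_features(text):
    """Hypothetical stand-ins for the paper's eight additional features."""
    return [text.count("!"), len(re.findall(r"\b[A-Z]{2,}\b", text))]

# Stage one: lexical (BoW) features only -> sarcastic tendency in [0, 1].
X_bow = [bow_vector(t) for t in texts]
sparsity = [row.count(0) / len(row) for row in X_bow]
stage_one = train_logreg(X_bow, labels)
tendency = [predict_proba(stage_one, x) for x in X_bow]

# Stage two: fuse the tendency with the other features and classify again.
X_fused = [[p] + context_features(t) for p, t in zip(tendency, texts)]
stage_two = train_logreg(X_fused, labels)
predictions = [int(predict_proba(stage_two, x) >= 0.5) for x in X_fused]

# Precision = true positives / predicted positives (the metric the abstract reports).
tp = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 1)
predicted_pos = sum(predictions)
precision = tp / predicted_pos if predicted_pos else 0.0

print("mean BoW sparsity:", round(sum(sparsity) / len(sparsity), 2))
print("stage-two predictions:", predictions, "precision:", precision)
```

The sparsity figure shows why the abstract treats BoW vectors of short posts as mostly null: each toy tweet touches only five or six of the vocabulary entries. Collapsing the BoW vector into a single tendency score before fusion keeps the stage-two input small and dense, which is the design choice the framework relies on.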


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8191968
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0252918


