ArabBert-LSTM: improving Arabic sentiment analysis based on transformer model and Long Short-Term Memory.

Wael Alosaimi, Hager Saleh, Ali A Hamzah, Nora El-Rashidy, Abdullah Alharb, Ahmed Elaraby, Sherif Mostafa

Front Artif Intell

Faculty of Computers and Artificial Intelligence, South Valley University, Hurghada, Egypt.

Published: July 2024

Sentiment analysis, also referred to as opinion mining, plays a significant role in automating the identification of negative, positive, or neutral sentiments expressed in textual data. The proliferation of social networks, review sites, and blogs has made these platforms valuable resources for mining opinions. Sentiment analysis finds applications in various domains and languages, including English and Arabic. However, Arabic presents unique challenges due to its complex morphology, characterized by inflectional and derivational patterns. To analyze sentiment in Arabic text effectively, sentiment analysis techniques must account for this intricacy. This paper proposes a model built on the transformer model and deep learning (DL) techniques. The word embedding is represented by the Transformer-based Model for Arabic Language Understanding (ArabBert) and then passed to the AraBERT model. The output of AraBERT is subsequently fed into a Long Short-Term Memory (LSTM) model, followed by feedforward neural networks and an output layer. AraBERT is used to capture rich contextual information, and the LSTM to enhance sequence modeling and retain long-term dependencies within the text data. We compared the proposed model with machine learning (ML) algorithms and DL algorithms, as well as different vectorization techniques: term frequency-inverse document frequency (TF-IDF), ArabBert, Continuous Bag-of-Words (CBOW), and skip-gram, using four Arabic benchmark datasets. Through extensive experimentation and evaluation on Arabic sentiment analysis datasets, we showcase the effectiveness of our approach. The results underscore significant improvements in sentiment analysis accuracy, highlighting the potential of leveraging transformer models for Arabic sentiment analysis. The outcomes of this research contribute to advancing Arabic sentiment analysis, enabling more accurate and reliable sentiment classification of Arabic text.
The findings reveal that the proposed framework exhibits exceptional performance in sentiment classification, achieving an impressive accuracy rate of over 97%.
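
The data flow described in the abstract (contextual encoder, then LSTM, then feedforward layer, then output layer) can be sketched in plain Python. This is only an illustration of the order of the stages, not the authors' implementation: `stand_in_encoder` is a hypothetical placeholder that returns a fixed pseudo-random vector per token (it is not contextual and is not AraBERT), the layer sizes are arbitrary assumptions, and the weights are random and untrained, so the resulting probabilities carry no meaning.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    # Multiply matrix W (list of rows) by vector v.
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def stand_in_encoder(tokens, dim):
    # ASSUMPTION: placeholder for AraBERT. Each token gets a fixed
    # pseudo-random vector; a real encoder would be contextual.
    vectors = []
    for tok in tokens:
        tok_rng = random.Random("enc:" + tok)
        vectors.append([tok_rng.uniform(-1.0, 1.0) for _ in range(dim)])
    return vectors

class LSTMCell:
    # Standard LSTM cell: input (i), forget (f), output (o) gates
    # and a candidate state (g), each acting on [x, h].
    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        self.W = {k: rand_matrix(hidden_size, input_size + hidden_size, rng)
                  for k in "ifog"}
        self.b = {k: [0.0] * hidden_size for k in "ifog"}

    def step(self, x, h, c):
        z = x + h  # list concatenation of input and previous hidden state
        pre = {k: [a + b for a, b in zip(matvec(self.W[k], z), self.b[k])]
               for k in "ifog"}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        cand = [math.tanh(v) for v in pre["g"]]
        c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, cand)]
        h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
        return h_new, c_new

def classify(tokens, cell, W_ff, W_out, dim):
    # Encoder -> LSTM over the token sequence -> feedforward -> softmax.
    h = [0.0] * cell.hidden_size
    c = [0.0] * cell.hidden_size
    for x in stand_in_encoder(tokens, dim):
        h, c = cell.step(x, h, c)
    hidden = [max(0.0, v) for v in matvec(W_ff, h)]  # ReLU feedforward layer
    return softmax(matvec(W_out, hidden))

# Hypothetical sizes; three classes for negative, positive, neutral.
DIM, HIDDEN, FF, CLASSES = 8, 6, 4, 3
rng = random.Random(42)
cell = LSTMCell(DIM, HIDDEN, rng)
W_ff = rand_matrix(FF, HIDDEN, rng)
W_out = rand_matrix(CLASSES, FF, rng)

probs = classify("الخدمة ممتازة جدا".split(), cell, W_ff, W_out, DIM)
print(probs)  # one probability per class; untrained, so not meaningful
```

In a trained system the placeholder encoder would be replaced by a pretrained AraBERT model and all weights would be learned from labelled data; the sketch only shows how the final LSTM hidden state feeds the feedforward and output layers.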

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11250580
DOI: http://dx.doi.org/10.3389/frai.2024.1408845

Publication Analysis

Top Keywords

sentiment analysis

arabic sentiment

sentiment

arabic

analysis

transformer model

long short-term

short-term memory

arabic text

model

Similar Publications

Finding polarized communities and tracking information diffusion on Twitter: a network approach on the Irish Abortion Referendum.

R Soc Open Sci

January 2025

Mathematics Application Consortium for Science and Industry (MACSI), University of Limerick, Limerick, Ireland.

Caroline B Pena, Pádraig MacCarron, David J P O'Sullivan

The analysis of social networks enables the understanding of social interactions, polarization of ideas and the spread of information, and therefore plays an important role in society. We use Twitter data, as it is a popular venue for the expression of opinion and dissemination of information, to identify opposing sides of a debate and, importantly, to observe how information spreads between these groups in our current polarized climate. To achieve this, we collected over 688 000 tweets from the Irish Abortion Referendum of 2018 to build a conversation network from users' mentions with sentiment-based homophily.

Positive sentiment and expertise predict the diffusion of archaeological content on social media.

Sci Rep

January 2025

Department of Sociology and Social Research, University of Trento, Trento, Italy.

Chiara Bonacchi, Marta Krzyzanska, Alberto Acerbi

This study investigates the dissemination of archaeological information on Twitter/X through the lens of cultural evolution. By analysing 132,230 tweets containing the hashtag #archaeology from 2021 to 2023, we examine how content and context-related factors influence retweeting behaviour. Our findings reveal that tweets with positive sentiment and non-threatening language are more likely to be shared, contrasting with the common negativity bias observed on social media.

Characterizing the patient experience of physical restraint in psychiatric settings via a linguistic, sentiment, and metaphor analysis.

Sci Rep

January 2025

Laboratory of Neurolinguistics and Experimental Pragmatics (NEP), University School for Advanced Studies IUSS, Piazza della Vittoria 15, Pavia, 27100, Italy.

Cristiano Nichini, Chiara Barattieri di San Pietro, Biagio Scalingi, Eleonora Alecci, Luca Toschi

Physical Restraint (PR) is a coercive procedure used in emergency psychiatric care to ensure safety in life-threatening situations. Because of its traumatic nature, studies emphasize the importance of considering the patient's subjective experience. We pursued this aim by overcoming classic qualitative approaches and innovatively applying a multilayered semiautomated language analysis to a corpus of narratives about PR collected from 99 individuals across seven mental health services in Italy.

BanglaBlend: A large-scale nobel dataset of bangla sentences categorized by saint and common form of bangla language.

Data Brief

February 2025

Department of Computer Science and Engineering, Daffodil International University, Bangladesh.

Umme Ayman, Chayti Saha, Azmain Mahtab Rahat, Sharun Akter Khushbu

In the vibrant linguistic landscape of Bengali, spoken by millions in Bangladesh and India, the gap between saintly and common terms is culturally and computationally significant. Recognising this, we introduce BanglaBlend, a pioneering dataset created to capture these stylistic distinctions. BanglaBlend comes with 7350 annotated sentences, 3675 in saintly form and 3675 in common form, covering a crucial need in natural language processing (NLP) resources for Bangla.

Conservation genomics within government led conservation planning: an Australian case study exploring cost and benefit for threatened flora.

Ann Bot

January 2025

Research Centre for Ecosystem Resilience, Botanic Gardens of Sydney, Sydney, NSW, 2000, Australia.

Chantelle A T Doyle, Manuela Cascini, Jia-Yee Samantha Yap, Hannah Matthews, Patricia M Hogbin

The importance of conserving plant genetic diversity has been recognised since the 1980s, but genetic research tools for improving conservation remain largely absent from standard planning. Using an Australian case study framework of the New South Wales Government's Saving our Species program, we outline the costs and benefits associated with conducting genomic analysis within a conservation strategy to inform, for example, taxonomic resolution, targeted monitoring, translocations and ex situ collections. Despite a reported sentiment that costs are prohibitive, our study identified that where genetics reports have been provided (32 to date), the cost of genetic sampling, analysis and advice is less than 10% of the total Government investment (SoS program) and will continue decreasing proportionally throughout the years as other management occurs.
